Compare commits

...

713 Commits

Author SHA1 Message Date
1558170293 version: bump up to 3.1.13 2018-03-29 10:28:55 -07:00
c3a14a2b28 semaphore: run release test with v3.1.12
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-29 09:23:15 -07:00
6f75c56c5e etcdserver: Manually backport etcdserver/raft.go tickMu fix to 3.1 2018-03-28 12:40:07 -07:00
908c0f4f98 rafthttp: add missing "peer_sent_failures_total" metrics call
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 12:39:59 -07:00
35c6ea7a67 Documentation/upgrades: backport all upgrade guides
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 12:39:59 -07:00
8eeab582d0 etcdserver: adjust election ticks on restart
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 10:17:30 -07:00
c536205249 etcdserver: make "advanceTicks" method
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 10:05:02 -07:00
2e57d99a2c rafthttp: add "ActivePeers" to "Transport"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 10:02:13 -07:00
2fdc4aa06c version: bump up to 3.1.12+git 2018-03-08 14:17:36 -08:00
918698add7 version: bump up to 3.1.12 2018-03-08 13:01:30 -08:00
00d14cfd03 test: fix etcd-tester flags
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 08:24:04 -08:00
6e11a79fd8 *: remove unused env vars
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 01:36:54 -08:00
028f99b103 test: fix "functional-tester" build script
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 01:29:57 -08:00
290fa0c1be *: sync "functional-tester" from 3.2 branch
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 00:57:30 -08:00
0874fcbed4 test: run "functional" tests in 3.1 branch
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 00:26:52 -08:00
7a148fee36 Merge pull request #9401 from jpbetz/automated-cherry-pick-of-#9297-origin-release-3.1
Automated cherry pick of #9297
2018-03-07 22:03:14 -08:00
087b9aa3dc mvcc: fix watchable store test for 3.2 cherrypick of #9281 2018-03-07 21:20:32 -08:00
b6373f1625 mvcc: restore unsynced watchers
In case syncWatchersLoop() starts before Restore() is called,
watchers already added by that moment are moved to s.synced by the loop.
However, there is a broken logic that moves watchers from s.synced
to s.uncyned without setting keyWatchers of the watcherGroup.
Eventually syncWatchers() fails to pickup those watchers from s.unsynced
and no events are sent to the watchers, because newWatcherBatch() called
in the function uses wg.watcherSetByKey() internally that requires
a proper keyWatchers value.
2018-03-07 21:20:21 -08:00
4178b75411 hack/scripts-dev: fix indentation in run.sh
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-07 14:31:25 -08:00
9c8e39e7f4 hack/scripts-dev: sync with master
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-07 14:25:10 -08:00
af3021aa1a semaphore: update Go version, release test version
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-07 14:24:57 -08:00
df0b652d6a travis: use docker, sync with master
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-07 14:24:04 -08:00
8e5d62cf1e Merge pull request #8933 from gyuho/release-doc
release-3.1: scripts/build-docker: build both gcr.io and quay.io images
2017-11-30 12:49:51 -08:00
212d801294 scripts/build-docker: build both gcr.io and quay.io images
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-28 15:03:38 -08:00
c2d8d9fd26 version: bump up to 3.1.11+git 2017-11-28 14:43:37 -08:00
960f4604bc version: bump up to v3.1.11 2017-11-28 10:48:24 -08:00
22b67da920 Merge pull request #8902 from jpbetz/automated-cherry-pick-of-#8813-release-3.1
Automated cherry pick of #8813 release 3.1
2017-11-22 11:21:13 -08:00
4b53ab0909 test: Clean agent directories on disk before functional test runs, not after
This is primarily so CI tooling can capture the agent logs after the functional tester runs.
2017-11-21 11:41:57 -08:00
b32ec69f9b vendor: Switch from boltdb v1.3.0 to coreos/bbolt v1.3.1-coreos.3 2017-11-21 11:34:45 -08:00
3ab9894b04 Merge pull request #8806 from jpbetz/automated-cherry-pick-of-#8427-origin-release-3.1-1509570173
Automated cherry pick of #8427 to release 3.1 branch
2017-11-09 18:07:40 -08:00
00d6d4aba7 e2e: test against latest v3.1.x release
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-08 15:11:34 -08:00
88b5e22b73 semaphore: manually pin v3.1.10 for release upgrade test
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-08 12:46:15 -08:00
2bb8278fbf hack/scripts-dev: add Makefile, Dockerfile-test
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-06 14:13:51 -08:00
935c76b8c3 semaphore.sh: fail tests with "(--- FAIL:|leak)"
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-03 10:59:23 -07:00
e83f50ec7c mvcc: sending events after restore
Fixes: #8411
2017-11-02 17:15:35 -07:00
232a81d804 client/integration: use only digits in unix port
Fix https://github.com/coreos/etcd/issues/7558.

Same as https://github.com/coreos/etcd/issues/6959.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-02 13:56:55 -07:00
6ffc32cd06 semaphore.sh: add to release-3.1 branch 2017-11-02 13:56:48 -07:00
f8aeb21c2d version: bump up to v3.1.10+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-02 13:56:16 -07:00
0520cb9304 version: bump up to 3.1.10
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-14 09:37:27 -07:00
424e4ae1cc fixtures: add gencerts.sh, generate CRL 2017-07-14 09:37:15 -07:00
a631a80a39 travis.yml: test with Go 1.8.3
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-14 08:51:24 -07:00
fc08fd75ee version: bump up to 3.1.9+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-14 08:51:13 -07:00
0f4a535c2f version: bump up to 3.1.9
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-09 10:44:38 -07:00
c765bef483 rafthttp: permit very large v2 snapshots
v2 snapshots were hitting the 512MB message decode limit, causing
sending snapshots to new members to fail for being too big.
2017-06-09 10:44:07 -07:00
5586a5806e version: bump up to 3.1.8+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-09 10:43:34 -07:00
d267ca9c18 version: bump up to 3.1.8
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-05 12:25:40 -07:00
4176fe768f *: fix other broken links in markdown
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 22:27:58 -07:00
950c846144 Documentation/v3: fix broken links
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 22:27:51 -07:00
0b78d66abe Documentation/v2: fix broken links
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 16:59:23 -07:00
2d58079626 integration: close accepted connection on stopc path
Connection pausing added another exit condition in the listener
path, causing the bridge to leak connections instead of closing
them when signalled to close. Also adds some additional Close
paranoia.

Fixes #7823
2017-05-03 14:51:47 -07:00
be171fa424 etcdserver: apply() sets consistIndex for any entry type
previously, apply() doesn't set consistIndex for EntryConfChange type.
this causes a misalignment between consistIndex and applied index
where EntryConfChange entry results setting applied index but not consistIndex.

suppose that addMember() is called and leader reflects that change.
1. applied index and consistIndex is now misaligned.
2. a new follower node joined.
3. leader sends the snapshot to follower
	where the applied index is the snapshot metadata index.
4. follower node saves the snapshot and database(includes consistIndex) from leader.
5. restarting follower loads snapshot and database.
6. follower checks snapshot metadata index(same as applied index) and database consistIndex,
	finds them don't match, and then panic.

FIXES #7834
2017-05-03 09:22:48 -07:00
4b60243fc5 etcd-2-1-0-bench: Fix an absolute bare link to resource outside of Documentation dir 2017-05-03 08:32:23 -07:00
2c5d79f49f Docs: replace absolute links with relative ones. 2017-05-03 08:32:15 -07:00
424abca6ac version: bump up to 3.1.7+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 08:31:53 -07:00
43b75072bf version: bump up to 3.1.7
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-27 08:25:50 -07:00
78141fae60 clientv3: set current revision to create rev regardless of CreateNotify
Turns out the optimization to ignore setting the init rev for
current revision watches breaks some ordering assumptions. Since
Watch only returns a channel once it gets a response, it should
bind the revision at the time of the first create response.

Was causing TestWatchReconnInit to fail.
2017-04-25 10:54:39 -07:00
3be37f042e integration: add pause/unpause to client bridge
Resetting connections sometimes isn't enough; need to stop/resume
accepting connections for some tests while keeping the member up.
2017-04-25 10:54:15 -07:00
7c896098d2 clientv3/integration: test watch resume with disconnect before first event 2017-04-25 10:53:58 -07:00
30f4e36de4 clientv3: only update initReq.rev == 0 with creation watch revision
Always updating the initReq.rev on watch create will resume from the wrong
revision if initReq is ever nonzero.
2017-04-25 10:53:37 -07:00
557abbe437 ctlv3: use printer for lease command results
Fixes #7783
2017-04-20 10:39:36 -07:00
4b448c209b version: bump up to 3.1.6+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-20 10:39:18 -07:00
e5b7ee2d03 version: bump up to 3.1.6
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-19 08:28:06 -07:00
a4c5731c38 ctlv3: keep lease as integer in fields printer
Output was giving %!d(string=) instead of the expected lease ID
value.
2017-04-19 08:27:52 -07:00
1f558ae678 integration: test auth API response header revision
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 20:06:14 -07:00
df93627bbb etcdserver: fill-in Auth API Header in apply layer
Replacing "etcdserver: fill a response header in auth RPCs"
The revision should be set at the time of "apply",
not in later RPC layer.

Fix https://github.com/coreos/etcd/issues/7691

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 20:06:08 -07:00
a20295c65b auth: fix race on stopping simple token keeper
run goroutine was resetting a field for no reason and without holding a lock.
This patch cleans up the run goroutine management to make the start/stop path
less racey in general.
2017-04-14 16:52:25 -07:00
9f7bb0df3a etcdserver: let Status() not require authentication
The information that can be obtained with the RPC doesn't need to be
protected.

Fix https://github.com/coreos/etcd/issues/7721
2017-04-13 15:56:26 -07:00
6a805e5222 test: do not run extra static checks on release branch
Things are usually already fixed in master branch
but not worth backporting.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 14:44:22 -07:00
38f79fa565 clientv3: fix gofmt warnings
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 14:44:22 -07:00
37a502cc88 integration: test requests with valid auth token but disabled auth
etcd was crashing since auth was assuming a token implies auth is enabled.
2017-04-13 14:44:22 -07:00
9be7fc5320 auth: protect simpleToken with single mutex and check if enabled
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes #7722
2017-04-13 14:44:16 -07:00
288bccd288 pkg/transport: remove port in Certificate.IPAddresses
etcd passes 'url.URL.Host' to 'SelfCert' which contains
client, peer port. 'net.ParseIP("127.0.0.1:2379")' returns
'nil', and the client on this self-cert will see errors
of '127.0.0.1 because it doesn't contain any IP SANs'

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-05 04:30:26 -07:00
8cb5b48f58 clientv3: test dial timeout is respected when using auth 2017-04-04 14:14:23 -07:00
6538217528 clientv3: respect dial timeout when authenticating
Fixes #7627
2017-04-04 14:12:32 -07:00
e983d6b343 version: bump up to 3.1.5+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-04 14:10:15 -07:00
20490caaf0 version: bump up to 3.1.5
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-27 10:20:28 -07:00
e156746959 raft: use rs.req.Entries[0].Data as the key for deletion in advance()
advance() should use rs.req.Entries[0].Data as the context instead of
req.Context for deletion. Since req.Context is never set, there won't be
any context being deleted from pendingReadIndex; results mem leak.

FIXES #7571
2017-03-24 15:51:39 -07:00
d84bf983cc Dockerfile-release: add nsswitch.conf into image
The file '/etc/nsswitch.conf' is created in order to
take in account '/etc/hosts' entries while resolving
domain names.
2017-03-23 15:20:49 -07:00
b44c6bff9d clientv3: use waitgroup to wait for substream goroutine teardown
When a grpc watch stream is torn down, it will join on its logical substream
goroutines by waiting for each to close a channel. This doesn't guarantee
the substream is fully exited, though, but only about to exit and can be
waiting to resume even after Watch.Close finishes. Instead, use a
waitgroup.Done at the very end of the substream defer.

Fixes #7573
2017-03-23 12:26:32 -07:00
8c3c1b4a9c *: use filepath.Join for files 2017-03-23 09:53:56 -07:00
b478387a59 wal: use path/filepath instead of path
Use the path/filepath package instead of the path package. The
path package assumes slash-separated paths, which doesn't work
on Windows. But path/filepath manipulates filename paths in a way
that's compatible across OSes.
2017-03-23 09:50:41 -07:00
dfc1f21f9d version: bump to 3.1.4+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-23 09:49:51 -07:00
41e52ebc22 version: bump to 3.1.4
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-22 09:46:23 -07:00
7bb538d4d4 backend: add FillPercent option 2017-03-21 12:12:32 -07:00
1622782e49 integration: ensure 'StopNotify' on publish error
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-21 12:12:13 -07:00
99b47e0c1e etcdmain: handle StopNotify when ErrStopped aborted publish
Fix https://github.com/coreos/etcd/issues/7512.

If a server starts and aborts due to config error,
it is possible to get stuck in ReadyNotify waits.
This adds select case to get notified on stop channel.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-21 12:10:36 -07:00
350d0cd211 ctlv3: have "protobuf" in output help string instead of "proto"
Fixes #7538
2017-03-20 12:40:25 -07:00
72f37ff79a embed: Clear default initial cluster
NewConfig() should sets initial cluster from name but we should clear it
in the event that another discovery option has been specified.

Fixes #7516
2017-03-18 07:56:18 -07:00
3221454cab etcdserver: remove possibly compacted entry look-up
Fix https://github.com/coreos/etcd/issues/7470.

This patch removes unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger panic
if raft entry at etcdProgress.appliedi got compacted
by subsequent 'MsgSnap' messages--if a follower is
being (in this case, network latency spikes) slow, it
could receive subsequent 'MsgSnap' requests from leader.

etcd server-side 'applyAll' routine and raft's Ready
processing routine becomes asynchronous after raft
entries are persisted. And given that raft Ready routine
takes less time to finish, it is possible that second
'MsgSnap' is being handled, while the slow 'applyAll'
is still processing the first(old) 'MsgSnap'. Then raft
Ready routine can compact the log entries at future
index to 'applyAll'. That is how 'createMergedSnapshotMessage'
tried to look up raft term with outdated etcdProgress.appliedi.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-18 07:56:18 -07:00
4a1bffdbc6 clientv3: close open watch channel if substream is closing on reconnect
If substream is closing but outc is still open while reconnecting, then outc
would only be closed once the watch client would connect or once the watch
client is closed. This was leading to deadlocks in the proxy tests. Instead,
close immediately if the context is canceled.

Fixes #7503
2017-03-18 07:56:18 -07:00
9d9be2bc86 ctlv3: ensure synced member list before printing env vars on member add
In cases of multiple endpoints, it's possible member add would get a its
member list from a member that has not yet recognized the membership
update. Instead, confirm that the member list response is from the
member that acked the member add or from a member that has synced
with the cluster following the member add.

Fixes #7498
2017-03-18 07:56:18 -07:00
e5462f74f1 auth: get rid of deadlocking channel passing scheme in simpleTokenTTL
Cherry-picked from 1b1fabef8f.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-18 07:56:05 -07:00
c68c1d9344 discovery: fix print format
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-17 14:21:57 -07:00
6ed56cd723 auth: nil check AuthInfo when checking admin permissions
If the context does not include auth information, get authinfo will
return a nil auth info and a nil error. This is then passed to
IsAdminPermitted, which would dereference the nil auth info.
2017-03-17 14:21:39 -07:00
a3c6f6bf81 version: bump up to 3.1.3+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-17 14:21:15 -07:00
21fdcc6443 version: bump up to 3.1.3
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-10 09:05:16 -08:00
8d122e7011 etcdmain: SdNotify when gateway, grpc-proxy are ready
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-09 11:35:20 -08:00
ade1d97893 lease: guard 'Lease.itemSet' from concurrent writes
Fix https://github.com/coreos/etcd/issues/7448.

Affected if etcd builds with Go 1.8+.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-08 14:50:06 -08:00
1300189581 gateway: fix the dns discovery method
strip the scheme from the endpoints to have a clean hostname for TCP proxy

Fixes #7452
2017-03-08 14:49:50 -08:00
1971517806 etcdctl: correctly batch revisions in make-mirror
Fixes #7410
2017-03-06 14:55:47 -08:00
d614bb0799 etcdmain: log machine default host after update check
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:55:31 -08:00
059dc91d4c embed: use machine default host only for default value, 0.0.0.0
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:55:24 -08:00
5fdbaee761 version: bump up to 3.1.2+git 2017-02-24 10:34:53 -08:00
714e7ec8db version: bump up to 3.1.2 2017-02-22 10:45:48 -08:00
2cdaf6d661 netutil: use ipv4 host by default
Was non-deterministic.
2017-02-22 10:45:38 -08:00
77a51e0dbf pkg/netutil: name GetDefaultInterfaces consistent 2017-02-22 10:45:29 -08:00
d96d3aa0ed netutil: add dualstack to linux_route
in v3.1.0 netutil couldn't get default interface for ipv6only hosts

Fixes #7219
2017-02-22 10:45:19 -08:00
66e7532f57 pkg/netutil: use native byte ordering for route information
Fixes #7199
2017-02-22 10:45:07 -08:00
3eff360e79 pkg/cpuutil: add cpuutil
A package for unsafe cpu-ish things.
2017-02-22 10:44:59 -08:00
1487071966 integration: add 'TestV3HashRestart' 2017-02-22 10:39:49 -08:00
5d62bba9c7 auth: keep old revision in 'NewAuthStore'
When there's no changes yet (right after auth
store initialization), we should commit old revision.

Fix https://github.com/coreos/etcd/issues/7359.
2017-02-22 10:39:28 -08:00
114e293119 integration: test keepalives for short TTLs 2017-02-22 10:38:38 -08:00
1439955536 clientv3: do not set next keepalive time <= now+TTL 2017-02-22 10:38:28 -08:00
2c8ecc7e13 tcpproxy: don't use range variable in reactivate goroutine
Ends up trying to reactivate only the last endpoint.
2017-02-22 10:38:19 -08:00
7b4d622a7e raft: fix read index request for #7331 2017-02-22 10:38:09 -08:00
db8abbd975 version: bump to 3.1.1+git 2017-02-21 17:00:02 -08:00
ac1c7eba21 version: bump up to 3.1.1 2017-02-16 13:53:25 -08:00
9cc6d4852a travis: update for Go 1.7.5 tests 2017-02-16 13:53:05 -08:00
ff7fa9843d clientv3: fix lease keepalive duration 2017-02-16 13:52:27 -08:00
f66138d403 clientv3: fix lease keepalive duration 2017-02-16 12:41:09 -08:00
8c87916f68 auth: add 'setupAuthStore' to tests 2017-02-14 14:39:51 -08:00
9e81b002c4 auth: correct initialization in NewAuthStore()
Because of my own silly mistake, current NewAuthStore() doesn't
initialize authStore in a correct manner. For example, after recovery
from snapshot, it cannot revive the flag of enabled/disabled. This
commit fixes the problem.

Fix https://github.com/coreos/etcd/issues/7165
2017-02-14 13:49:19 -08:00
4962c5cff7 auth: add a test case for recoverying from snapshot
Conflicts:
	auth/store_test.go
2017-02-14 13:48:17 -08:00
e5bf25a3b6 e2e: add cases for defrag and snapshot with authentication 2017-02-14 13:43:11 -08:00
98c60e8faa auth, etcdserver: let maintenance services require root role
This commit lets maintenance services require root privilege. It also
moves AuthInfoFromCtx() from etcdserver to auth pkg for cleaning purpose.
2017-02-14 13:42:31 -08:00
3ac3fa6f3d travis: disable email notifications
Was spamming security@coreos.com
2017-02-14 12:55:02 -08:00
eaa8b9e155 clientv3: test closing client cancels blocking dials 2017-02-14 11:31:51 -08:00
ea2aae464d clientv3: use DialContext
Fixes #7216
2017-02-14 11:31:43 -08:00
776739ebc2 roadmap: update roadmap 2017-01-20 14:21:08 -08:00
a7a8a47ba0 README: remove ACI, update Go version 2017-01-20 14:21:00 -08:00
379f7ae10e op-guide: change grpc-proxy from 'pre' to alpha' 2017-01-20 13:25:21 -08:00
ead2d95914 version: bump to v3.1.0+git 2017-01-20 13:25:04 -08:00
8ba2897a21 version: bump to v3.1.0 2017-01-20 12:42:12 -08:00
bc31e27cb9 documentation: update build documentation 2017-01-20 11:20:54 -08:00
fce20a0b0b test: passed the test script arguments as the test function parameters 2017-01-20 11:15:01 -08:00
f10363fecd etcdctlv3: snapshot restore works with lease key 2017-01-20 10:06:49 -08:00
a7ec6c88fd pkg/flags: fixed prefix checking of the env variables 2017-01-20 09:57:00 -08:00
62c591d223 integration: test STM apply on concurrent deletion 2017-01-20 09:56:42 -08:00
5676226867 clientv3/concurrency: fix rev comparison on concurrent key deletion 2017-01-20 09:56:36 -08:00
898b9e608f Documentation: fix typo s/endpoint-health/endpoint health/ 2017-01-19 20:48:01 -08:00
b84be6b11b NEWS: fix date for v3.1 release 2017-01-19 17:56:35 -08:00
7a12d65528 Documentation: update experimental_apis for v3.1 release 2017-01-19 12:43:32 -08:00
53ac04b118 vendor: update 'golang.org/x/net' 2017-01-18 13:11:48 -08:00
fbcd5375b3 glide: update 'golang.org/x/net' 2017-01-18 13:11:13 -08:00
c2e8d06eec grpcproxy, etcdmain, integration: add close channel to kv proxy
ccache launches goroutines that need to be explicitly stopped.

Fixes #7158
2017-01-18 13:11:03 -08:00
6c8f1986c8 etcdserver: use ReqTimeout for linearized read
Fixes #7136
2017-01-17 17:31:31 -08:00
be9ae300c6 pkg/report: add nil checking for getTimeSeries 2017-01-17 13:23:57 -08:00
9ba3632614 Documentation: document upgrading to v3.1 2017-01-17 13:23:19 -08:00
c2d8b5a9e8 ctlv3: print cluster info after adding new member 2017-01-17 10:23:02 -08:00
0c88795a19 Merge pull request #7151 from gyuho/travis
travis: use Go 1.7.4, drop old env var
2017-01-13 12:55:32 -08:00
21e3418553 travis: use Go 1.7.4, drop old env var
We don't use Go 1.5.x anymore
2017-01-13 11:34:05 -08:00
bb797c1ee9 Merge pull request #7147 from gyuho/pkg/report
pkg/report: add 'Stats' to expose report raw data
2017-01-13 11:17:57 -08:00
304606ab0b Merge pull request #7139 from heyitsanthony/proxy-rlock
grpcproxy/cache: acquire read lock on Get instead of write lock
2017-01-13 11:15:13 -08:00
74bad576ed pkg/report: add 'Stats' to expose report raw data 2017-01-13 10:26:00 -08:00
7dfe503f1c Merge pull request #7148 from heyitsanthony/fix-lease-overlap
clientv3: don't reset stream on keepaliveonce or revoke failure
2017-01-13 10:05:02 -08:00
af51f87ad2 vendor: remove groupcache, add ccache 2017-01-13 10:02:04 -08:00
9fa6c95054 grpcproxy: use ccache for key cache
groupcache needs a write lock and has no way to expire keys; ccache can
do this, though.

Also removes the key count metric, since there's no way to efficiently
calculate it using ccache.
2017-01-13 10:00:57 -08:00
5e3b20e70c clientv3: don't reset stream on keepaliveonce or revoke failure
Would cause the keepalive loop to cancel out.

Fixes #7082
2017-01-13 09:05:23 -08:00
c89eae790d Merge pull request #7110 from mitake/reauth
etcdserver, clientv3: handle a case of expired auth token
2017-01-13 11:57:25 +09:00
432bda4dec Merge pull request #7146 from fanminshi/clientv3_balancer_uses_one_connection
clientv3: fix balancer test logic
2017-01-12 13:51:03 -08:00
6d443ba3f9 clienv3: fix balancer test logic 2017-01-12 13:07:44 -08:00
6ce03389c8 Merge pull request #7138 from gyuho/NEWS
NEWS: add v3.1.0, v3.0.16 + minor fixes
2017-01-12 11:33:13 -08:00
34136a69c8 Merge pull request #7145 from heyitsanthony/warn-ca-ignore
transport: warn on user-provided CA
2017-01-12 11:14:28 -08:00
c23d666328 NEWS: add v3.1.0, v3.0.16 + minor fixes 2017-01-12 11:07:27 -08:00
da8fd18d8e transport: warn on user-provided CA
ServerName is ignored for a user-provided CA for backwards compatibility. This
breaks PKI, so warn it is deprecated.
2017-01-12 09:10:05 -08:00
824277cb3a Merge pull request #7119 from sinsharat/add_load_test_tool
tools: Add etcd 3.0 load test tool refernece
2017-01-11 22:17:57 -08:00
c512839382 tools: Add etcd 3.0 load test tool refernece 2017-01-12 11:35:32 +05:30
d431b64d97 etcdserver, clientv3: handle a case of expired auth token
This commit adds a mechanism of handling a case of expired auth token
to clientv3. If a server returns an error code
grpc.codes.Unauthenticated, newRetryWrapper() tries to get a new token
and use it as an option of PerRPCCredential.

Fixes https://github.com/coreos/etcd/issues/7012
2017-01-12 11:49:02 +09:00
0df543dbb3 Merge pull request #7141 from heyitsanthony/rate-limit-range
benchmark: option to rate limit range benchmark
2017-01-11 15:44:33 -08:00
6e730af65a benchmark: option to rate limit range benchmark 2017-01-11 14:36:46 -08:00
43dd751c47 Merge pull request #7137 from heyitsanthony/display-docs
documentation: display docs.md in github browser
2017-01-11 11:29:29 -08:00
6f801d2ae8 documentation: display docs.md in github browser 2017-01-11 10:37:42 -08:00
925d1d74ce Merge pull request #7133 from gyuho/bench
pkg/report: support 99.9-percentile, change column name
2017-01-10 18:25:03 -08:00
e44d3abc77 pkg/report: support 99.9-percentile, change column name 2017-01-10 18:22:47 -08:00
88bdd8a5d9 Merge pull request #7120 from sttts/sttts-update-ugorji-2
Update ugorji/go with embedded interface support
2017-01-10 13:11:56 -08:00
f0fa5ec507 Merge pull request #7128 from heyitsanthony/etcdctl-make-rootrole
etcdctl: create root role on auth enable if it does not yet exist
2017-01-10 12:22:02 -08:00
b32a8010a7 Merge pull request #7121 from hhkbp2/add-test-case
raft: add RawNode test case for #6866
2017-01-09 23:37:23 -08:00
522232212d Merge pull request #7127 from heyitsanthony/fix-auth-spin
auth: reject empty user name when checking op permissions
2017-01-09 19:11:18 -08:00
16135165c2 raft: add RawNode test case for #6866 2017-01-10 10:55:57 +08:00
d20f23c795 etcdctl: create root role on auth enable if it does not yet exist
Kind of tedious to add the root role when enabling auth; can just add
it automatically.
2017-01-09 16:18:13 -08:00
c39a59c0be auth: reject empty user name when checking op permissions
Passing AuthInfo{} to permission checking was causing an infinite loop
because it would always return an old revision error.

Fixes #7124
2017-01-09 15:53:36 -08:00
5278ea5ed0 integration: add grpc auth testing 2017-01-09 15:53:36 -08:00
8adfc06084 Merge pull request #7118 from hhkbp2/fix-test-case
raft: fix test cases for #7042
2017-01-09 10:34:46 -08:00
4a245a632a vendor: update ugorji/go 2017-01-09 12:13:50 +01:00
7bb768ba34 raft: fix test case for #7042 2017-01-09 16:52:02 +08:00
f99c76cb47 Merge pull request #7113 from heyitsanthony/testutil-bufsize
testutil: increase size of buffer for stack dump
2017-01-06 18:16:42 -08:00
6ab8dcb679 testutil: increase size of buffer for stack dump
Too many goroutines to fit all stack traces in 8kb.
2017-01-06 17:14:42 -08:00
bc2d47118d Merge pull request #7016 from fanminshi/faq_add_meaning_of_etcd
why: add origin of the term etcd
2017-01-06 16:13:34 -06:00
953b0c6ba2 why: add origin of the term etcd
explain the meaning behind the term etcd.
2017-01-06 16:12:20 -06:00
628e83ecc7 Merge pull request #7106 from gyuho/go1.8
integration: use only digits in unix ports
2017-01-06 13:04:35 -08:00
998f8bf291 Merge pull request #7112 from heyitsanthony/expect-debug
expect: EXPECT_DEBUG environment variable
2017-01-06 11:52:26 -08:00
af5b8190d2 Merge pull request #7111 from heyitsanthony/e2e-ctl-trace
e2e: dump stacks on ctlTest timeout
2017-01-06 11:28:56 -08:00
cf382dbe60 expect: EXPECT_DEBUG environment variable
Dump process output to stdout when EXPECT_DEBUG != "".
2017-01-06 11:09:06 -08:00
acfa601075 e2e: dump stack on ctlTest timeout
Figure out which process is blocking for Elect/Lock test timeouts.
2017-01-06 02:03:55 -08:00
6825ffe1a4 integration: use only digits in unix ports
Fix https://github.com/coreos/etcd/issues/6959.
2017-01-05 12:34:54 -08:00
a42b399f4e Merge pull request #7094 from heyitsanthony/fix-duplicate-grant
auth: use quorum get for GetUser/GetRole for mutable operations
2017-01-05 11:28:33 -08:00
5feb4e1027 Merge pull request #7103 from heyitsanthony/proxy-watch-close
grpcproxy: tear down watch when client context is done
2017-01-04 19:04:08 -08:00
fd72ecfe92 Merge pull request #7087 from sinsharat/make_etcd-runner_command_compliant
etcd-runner: make command compliant
2017-01-04 16:33:19 -08:00
e179225f28 grpcproxy: tear down watch when client context is done
If client closes but all watch streams are not canceled, the outstanding
watch will wait until it is canceled, causing watch server to potentially
wait forever to close.

Fixes #7102
2017-01-04 16:23:27 -08:00
154f268031 Merge pull request #7001 from heyitsanthony/etcdctl-doc
etcdctl: tighten up output, reorganize README.md
2017-01-04 13:44:49 -08:00
10d3b81c39 Merge pull request #7093 from gyuho/member
etcdserver: expose ErrMemberNotEnoughStarted
2017-01-04 12:09:29 -08:00
f9f691ef1f auth: use quorum get for GetUser/GetRole for mutable operations
GetUser would not propagate to the minority node, causing TestCtlV2GetRoleUser to
run CreateUser instead of UpdateUser. Instead, use quorum get to fetch the
current state of auth.

Fixes #7069
2017-01-04 11:55:07 -08:00
729dcd51ce Merge pull request #7090 from vimalk78/fix-comactor-resume-leadr-change#7040
etcdserver: resume compactor only if leader
2017-01-04 10:47:44 -08:00
559a82f66e Merge pull request #7097 from heyitsanthony/benchmark-verbose
benchmark: enable grpc error logging on stderr
2017-01-04 10:32:07 -08:00
40ae83beab Merge pull request #7099 from overvenus/patch-1
docs: fix recovery example in recovery.md
2017-01-04 10:16:48 -08:00
37501e2a5d Merge pull request #7092 from xiang90/fix_raft
raft: use status to test node stop
2017-01-04 09:13:11 -08:00
7aeddf6cd7 docs: fix recovery example in recovery.md 2017-01-04 19:41:15 +08:00
d0f301adb7 etcd-runner:add flags in watcher for hardcoded values 2017-01-04 15:17:53 +05:30
b8444d4d35 benchmark: enable grpc error logging on stderr
Lets you see connection errors (e.g., if tls is misconfigured)
2017-01-04 00:26:43 -08:00
5fac6b8d15 etcdserver: resume compactor only if leader 2017-01-04 05:01:14 +05:30
2b5f9e1c6b etcdserver: expose ErrNotEnoughStartedMembers
Fix https://github.com/coreos/etcd/issues/7072.
2017-01-03 15:23:06 -08:00
fc8cd44c72 raft: use status to test node stop
n.Tick() is async. It can be racy when running with n.Stop().

n.Status() is sync and  has a feedback mechnism internally. So there wont be
any race between n.Status() and n.Stop() call.
2017-01-03 15:18:48 -08:00
61064a7be3 Merge pull request #7085 from gyuho/raft-example-snapshot
raftexample: load snapshot when opening WAL
2017-01-03 10:34:13 -08:00
5cb6dd268b etcd-runner: make command compliant 2017-01-03 14:43:58 +05:30
0af1679b61 raftexample: load snapshot when opening WAL
Fix https://github.com/coreos/etcd/issues/7056.
Previously we don't load snapshot when replaying WAL.
2016-12-30 17:28:57 -08:00
24601ca24b Merge pull request #7084 from heyitsanthony/watch-proxy-leak
integration: wait for watch proxy to finish on client close
2016-12-30 12:51:31 -08:00
75441390b6 integration: defer clus.Terminate in watch tests
Common pattern was defer cancel(), but clus.Terminate() at the end of
the test. This appears to lead to a deadlock that is only released
once the context times out, causing inflated test times.
2016-12-30 12:34:04 -08:00
9b5eb1ae5a grpcproxy, etcdmain, integration: return done channel with WatchServer
Makes it possible to synchronously close the watch server.

Fixes #7078
2016-12-30 12:09:48 -08:00
29e14dde0c Merge pull request #7081 from gyuho/timeout-rafthttp
rafthttp: bump up timeout in pipeline test
2016-12-30 10:14:12 -08:00
cbb6ede69d Merge pull request #7067 from fanminshi/rework_coverage_unit_integration
coverage: rework coverage for unit and integration tests
2016-12-30 10:13:07 -08:00
d25f9feb19 rafthttp: bump up timeout in pipeline test
Fix https://github.com/coreos/etcd/issues/6283.

The timeout is too short. It could take more than 10ms
to send when the buffer gets full after 'pipelineBufSize' of
requests.
2016-12-30 09:46:16 -08:00
74e7614759 testutil: whitelist thread created by go cover 2016-12-29 17:19:27 -08:00
d9a3472894 coverage: rework code coverage for unit and integration tests 2016-12-29 17:19:03 -08:00
0dce29ae57 Merge pull request #7077 from fanminshi/consistent_naming
etcdserver: consistent naming in raftReadyHandler
2016-12-29 14:37:46 -08:00
8242049a33 Merge pull request #7076 from fanminshi/fix_e2e_test
e2e: unset ETCDCTL_API env var before running e2e tests
2016-12-29 14:37:25 -08:00
734dd75565 Merge pull request #7075 from gyuho/version-pull
e2e: poll '/version' in release upgrade tests
2016-12-29 11:29:45 -08:00
2a1bae0c2a etcdserver: consistent naming in raftReadyHandler 2016-12-29 11:27:16 -08:00
b741452d03 e2e: unset ETCDCTL_API env var before running u2e tests
existing ETCDCTL_API env var causes e2e to fail some of its tests.  ETCDCTL_API should not be set before e2e tests start.
the tests themselves should set ETCDCTL_API properly.
2016-12-29 11:21:15 -08:00
4e1010c1b9 e2e: poll '/version' in release upgrade tests
Fix https://github.com/coreos/etcd/issues/7065.
2016-12-29 10:52:40 -08:00
67c75606db Merge pull request #7070 from heyitsanthony/fix-lease-race
lease: use atomics for accessing lease expiry
2016-12-28 16:30:08 -08:00
b5cde6b321 lease: use atomics for accessing lease expiry
Demote was racing on expiry when LeaseTimeToLive called Remaining. Replace
with intrinsics since the ordering isn't important, but torn writes are
bad.
2016-12-28 15:44:14 -08:00
1643ed5667 Merge pull request #7071 from heyitsanthony/bump-integration-timeout
test: bump grpcproxy pass timeout to 15m
2016-12-28 15:41:00 -08:00
f876ccb055 test: bump grpcproxy pass timeout to 15m
integration tests have a 15m timeout elsewhere. The lease stress tests
seem to have pushed the running time over 10m on proxy CI, causing
failures from timeout.
2016-12-28 14:56:57 -08:00
12d930b40f Merge pull request #7068 from heyitsanthony/fix-v2-health
v2http: submit QGET in health endpoint if no progress
2016-12-28 14:30:31 -08:00
3519a9784e Merge pull request #7039 from mitake/benchmark-dialtimeout
benchmark: a new option for configuring dial timeout
2016-12-28 13:12:11 -08:00
9690220cd1 Merge pull request #7064 from heyitsanthony/fix-health-perms
etcdctl: treat permission denied as healthy endpoint
2016-12-28 13:04:55 -08:00
e2463569e7 v2http: submit QGET in health endpoint if no progress
Removing the periodic SYNC calls broke the health endpoint since the
raft index stops updating. Instead, don't bother monitoring the
raft index; issue a QGET directly to get a consensus response.

Fixes #6985
2016-12-28 12:20:56 -08:00
46062efa78 e2e: test cluster-health 2016-12-28 12:20:55 -08:00
e63059ec31 Merge pull request #7030 from crandles/grpc-histograms
etcdmain: add '--metrics' option
2016-12-28 12:03:53 -08:00
36b2d3f5eb etcdmain: add --metrics flag for exposing histogram metrics
this adds a new flag, --metrics, that can be used to enable extensive (histogram) metrics.

Fixes #7024
2016-12-28 13:04:52 -05:00
00e00f16bb ctlv3: consider permission denied error to be healthy for endpoints
Relaxes the permission expectations for endpoint health by noting:
* permission denial on linearized reads is always through consensus
* endpoint health means consensus with the cluster through the endpoint

So, there's no need to require permission on a health check key in order
to know whether the endpoint is healthy.

Fixes #7057
2016-12-28 09:13:27 -08:00
b940e0d514 Merge pull request #7042 from petermattis/pmattis/resume-after-heartbeat-resp
raft: resume paused followers on receipt of MsgHeartbeatResp
2016-12-27 21:15:53 -08:00
a662ddefbb benchmark: a new option for configuring dial timeout
Current benchmark doesn't have an option for configuring dial timeout
of gRPC. This commit adds --dial-timeout for the purpose. It is useful
for stopping long sticking benchmarks.
2016-12-28 14:07:43 +09:00
407afc69ed e2e: check etcdctl endpoint health is healthy if denied permission to key 2016-12-27 14:49:52 -08:00
c00084812c Merge pull request #7054 from gyuho/err
etcd-tester: remove unused err var from maxRev
2016-12-27 12:36:48 -08:00
db8b15bf8f etcd-tester: remove unused err var from maxRev 2016-12-27 12:16:43 -08:00
89b18ff1af Merge pull request #7015 from fanminshi/fix_lease_expired_too_soon
lease: force leader to apply its pending committed index for lease op…
2016-12-27 11:26:15 -08:00
2faf72f47c etcdserver: rework update committed index logic 2016-12-27 10:11:40 -08:00
17873f7be8 Merge pull request #7008 from heyitsanthony/fix-dns
retry on resolution failure for advertised peer DNS check
2016-12-27 10:03:01 -08:00
d9b9821551 Merge pull request #7060 from hhkbp2/fix-pre-vote-tests
raft: fix pre-vote tests
2016-12-26 17:42:36 -08:00
920b155f17 raft: fix pre-vote tests 2016-12-26 14:31:59 +08:00
7b7feb46fc leasehttp: buffer error channel to prevent goroutine leak 2016-12-22 14:25:01 -08:00
fef4a79528 lease: force leader to apply its pending committed index for lease operations
suppose a lease granting request from a follower goes through and followed by a lease look up or renewal, the leader might not apply the lease grant request locally. So the leader might not find the lease from the lease look up or renewal request which will result lease not found error. To fix this issue, we force the leader to apply its pending commited index before looking up lease.

FIX #6978
2016-12-22 14:24:38 -08:00
1a8e3cad9a Merge pull request #7053 from gyuho/typo
etcd-tester: fix typo, add endpoint in logs
2016-12-22 13:12:38 -08:00
591bb5e7f6 etcd-tester: fix typo, add endpoint in logs 2016-12-22 12:51:27 -08:00
acbf0fa452 Merge pull request #7041 from m1093782566/raft-safe
raft: make memory storage set method thread safe
2016-12-20 09:14:27 -08:00
e625400f1d raft: resume paused followers on receipt of MsgHeartbeatResp
Previously, paused followers were resumed upon sending a MsgHearbeat.

Fixes #7037
2016-12-20 08:22:09 -05:00
8151d4d0bc raft: make memory storage set method thread safe 2016-12-20 18:48:52 +08:00
d62ce55584 Merge pull request #7027 from gyuho/default-host
embed: only override default advertised client URL if the client listen URL is 0.0.0.0
2016-12-16 18:53:11 -08:00
e58287f026 embed: only override default advertised client URL if the client listen URL is 0.0.0.0 2016-12-16 18:31:04 -08:00
af3451be26 Merge pull request #7018 from gyuho/why
Documentation: add 'why.md'
2016-12-16 15:54:49 -08:00
bef87cc953 Documentation: add 'why.md' 2016-12-16 15:54:03 -08:00
f95f7a3027 Merge pull request #7028 from gyuho/faq
Documentation: add FAQs on membership operation
2016-12-16 15:37:21 -08:00
2f0e82a31e Documentation: add FAQs on membership operation
Copy Anthony's answer from:
https://github.com/coreos/etcd/issues/6103
https://github.com/coreos/etcd/issues/6114
2016-12-16 15:13:40 -08:00
780d2f2a59 etcdctl: tighten up output, reorganize README.md
Documentation was far too repetitive, making it a chore to read and
make changes. All commands are now organized by functionality and all
repetitive bits about return values and output are in a generalized
subsections.

etcdctl's output handling was missing a lot of commands. Similarly,
in many cases an output format could be given but fail to report
an error as expected.
2016-12-16 13:54:20 -08:00
531c3061c1 Merge pull request #7023 from heyitsanthony/lease-freeze
clientv3: fix lease "freezing" on unhealthy cluster
2016-12-16 11:38:22 -08:00
a375e91c66 clientv3: don't reset keepalive stream on grant failure
Was triggering cancelation errors on outstanding KeepAlives if Grant
had to retry.
2016-12-16 10:36:51 -08:00
46bd842db9 clientv3/integration: test lease grant/keepalive with/without failures 2016-12-16 10:36:51 -08:00
87b1d9571f v3api, rpctypes: add ErrTimeoutDueToConnectionLost
Lack of GRPC code was causing this to look like a halting error to the client.
2016-12-16 10:25:35 -08:00
d9e928de7a Merge pull request #7020 from heyitsanthony/etcdctl-migrate-warn
etcdctl: warn when backend takes too long to open on migrate
2016-12-16 09:51:34 -08:00
109577351b Merge pull request #7022 from hongchaodeng/master
docs: explicitly set ETCDCTL_API=3 in recovery.md
2016-12-15 20:39:19 -08:00
fa733e1e9c docs: explicitly set ETCDCTL_API=3 in recovery.md 2016-12-15 20:10:30 -08:00
e71ff361a4 etcdctl: warn when backend takes too long to open on migrate 2016-12-15 18:57:57 -08:00
52e3dc5eb9 Documentation: minor fix nodes -> node 2016-12-15 21:27:52 -05:00
93e303ec71 Merge pull request #7017 from gyuho/faq
dev-guide: add limit.md
2016-12-15 15:45:23 -08:00
a1e572b460 dev-guide: add limit.md 2016-12-15 15:44:21 -08:00
5aeee917a7 Merge pull request #7006 from heyitsanthony/clusterid-split
Documentation: FAQ entry for cluster ID mismatches
2016-12-15 12:43:17 -08:00
14c851c863 Documentation: FAQ entry for cluster ID mismatches 2016-12-15 11:27:24 -08:00
86a43849fb Merge pull request #7010 from dennwc/keepalive-exit-err
clientv3: ensure KeepAlive channel is closed or error is returned
2016-12-15 08:06:36 -08:00
35fd5dc9fc Merge pull request #6903 from mitake/auth-member
protect membership change RPCs with auth
2016-12-15 08:04:31 -08:00
b126e31132 clientv3: better error message for keep alive loop halt 2016-12-15 16:06:27 +02:00
d46b753186 e2e: test cases of protecting membership change with auth 2016-12-15 22:54:20 +09:00
86d7390804 auth, etcdserver: protect membership change operations with auth
This commit protects membership change operations with auth. Only
users that have root role can issue the operations.

Implements https://github.com/coreos/etcd/issues/6899
2016-12-15 22:54:20 +09:00
5183ce0118 clientv3: add test for keep alive loop exit case 2016-12-15 03:02:44 +02:00
e0bcd4d516 clientv3: return error from KeepAlive if corresponding loop exits
after recvKeepAliveLoop exits client might call KeepAlive adding request channel that will not be closed
this fix makes sure that recvKeepAliveLoop is running before adding request to lessor's list and returns error otherwise

Fixes #6922
2016-12-15 03:02:35 +02:00
d8513adf1d Merge pull request #7007 from heyitsanthony/lease-close
clientv3: close Lease on client Close
2016-12-14 16:06:32 -08:00
26a3e9a740 membership: retry for 30s on advertise url check 2016-12-14 15:56:22 -08:00
29c30b2387 etcdserver: retry for 30s on advertise url check 2016-12-14 15:56:22 -08:00
13b05aeff8 netutil: ctx-ize URLStringsEqual
Handles the case where the DNS entry will only be set up after etcd
starts.
2016-12-14 15:46:30 -08:00
246fb29d8a clientv3: close Lease on client Close
Fixes #6987
2016-12-14 12:11:17 -08:00
a9f72ee0d4 Merge pull request #7005 from heyitsanthony/fix-pprof
embed: deep copy user handlers
2016-12-14 12:05:37 -08:00
8f88632218 Merge pull request #6965 from gyuho/faq
Documentation: add more FAQs (follower, leader, sys-require)
2016-12-14 11:51:34 -08:00
626df4d77c Documentation: add more FAQs (follower, leader, sys-require) 2016-12-14 11:36:07 -08:00
cc931a2319 embed: deep copy user handlers
Shallow copy of user handlers leads to a nil map assignment when
enabling pprof. Since the map is being modified, it should probably
be deep copied into the server context, which fixes the crash.
2016-12-14 10:17:32 -08:00
4ca78aa89f Merge pull request #7004 from fbarbeira/patch-3
op-guide/clustering: fix typo
2016-12-14 09:52:37 -08:00
972ef3c92e op-guide/clustering: fix typo 2016-12-14 18:51:30 +01:00
1e60f88786 Merge pull request #6999 from leonliao/patch-1
Documentation: use port 2379 in local cluster guide
2016-12-14 09:29:20 -08:00
cb9277f339 Documentation: use port 2379 in local cluster guide
The port in endpoints should be 2379, instead of 12379.
2016-12-14 15:09:21 +08:00
cdde0368ad Merge pull request #6997 from gyuho/range
auth: improve 'removeSubsetRangePerms' to O(n)
2016-12-13 16:14:37 -08:00
a53175949e auth: improve 'removeSubsetRangePerms' to O(n) 2016-12-13 15:43:23 -08:00
454f1da2f2 Merge pull request #6996 from xiang90/hardware
doc: add hardware section
2016-12-13 12:53:52 -08:00
e3d8ef4cea doc: add hardware section 2016-12-13 12:42:47 -08:00
1a8e78cd55 Merge pull request #6994 from gyuho/etcd-tester-fix-leak
etcd-tester: cancel lease stream; fix OOM panic
2016-12-13 10:44:54 -08:00
301abddc72 etcd-tester: cancel lease stream; fix OOM panic
It was never closing lease keep-alive streams, leaking memory.
Fix OOM panics in etcd-tester (after 1K rounds).
2016-12-13 09:56:30 -08:00
cc37beff35 Merge pull request #6995 from gyuho/etcd-tester-pprof
etcd-tester: add 'enable-pprof' option
2016-12-13 08:02:47 -08:00
4e831810c9 Merge pull request #6993 from cloudaice/build-bug
build: remove dir use -r flag
2016-12-13 07:57:31 -08:00
7d16e7d27e etcd-tester: add 'enable-pprof' option 2016-12-13 05:03:27 -08:00
b294ab13a4 build: remove dir use -r flag 2016-12-13 16:08:50 +08:00
797d826117 Merge pull request #6979 from heyitsanthony/fields-fmt
etcdctl: "fields" output formats
2016-12-12 15:17:12 -08:00
5f3140987e etcdctl: "fields" output formats
Writes out fields from responses in the format "FieldName" : FieldValue. If
FieldValue is a string, it is formatted with %q.
2016-12-12 13:21:20 -08:00
be740dc436 Merge pull request #6975 from gyuho/gopath
*: fix 'gosimple', 'gounused' checks
2016-12-12 11:53:45 -08:00
5b7582365e Merge pull request #6990 from xiang90/faq_m
doc: add faq about missing heartbeat
2016-12-12 11:38:10 -08:00
468187de31 doc: add faq about missing heartbeat 2016-12-12 11:31:17 -08:00
7e74b3f846 grpcproxy: remove unused field 'wbs *watchBroadcasts' 2016-12-12 10:07:14 -08:00
eb8646a381 v3rpc: remove unused 'splitMethodName' function 2016-12-12 10:07:14 -08:00
3512f114e4 e2e: remove unused 'ctlV3GetFailPerm' 2016-12-12 10:07:14 -08:00
b8e09bf849 tools: simplify boolean comparison, remove unused 2016-12-12 10:07:14 -08:00
0c5d1d5641 raft: simplify boolean comparison, remove unused 2016-12-12 10:07:14 -08:00
f3cb93015c integration: simplify boolean comparison in resp.Created 2016-12-12 10:07:14 -08:00
55307d48ac auth: fix gosimple errors 2016-12-12 10:07:14 -08:00
6ec4b9c26a test: exclude '_home' for gosimple, unused 2016-12-12 10:07:14 -08:00
0a15c1b9c6 Merge pull request #6988 from xiang90/faq_m
doc: add faq about apply warning logging
2016-12-12 10:06:32 -08:00
6969369a32 doc: add faq about apply warning logging 2016-12-12 09:58:42 -08:00
20dca1eb80 Merge pull request #6977 from heyitsanthony/move-prof
etcdserver, embed, v2http: move pprof setup to embed
2016-12-09 13:30:19 -08:00
cf60588b27 Merge pull request #6974 from heyitsanthony/interacting-fixup
Documentation: update get examples to be clearer about ranges
2016-12-09 13:08:40 -08:00
2c06def8ca etcdserver, embed, v2http: move pprof setup to embed
Seems like a better place for prof setup since it's not specific to v2.
2016-12-09 12:37:35 -08:00
cb75c40a8b Merge pull request #6973 from sinsharat/make_contributing_url_based
github: make contribution link non-relative
2016-12-09 12:28:07 -08:00
46e63cc14a Merge pull request #6972 from heyitsanthony/bug-report-link
github: make bug reporting link non-relative
2016-12-09 11:15:29 -08:00
d2a6bbd9c6 Documentation: update get examples to be clearer about ranges
Fixes #6966
2016-12-09 10:54:38 -08:00
01c8b25284 github: make contribution link non-relative 2016-12-10 00:03:47 +05:30
f8b480cd6f github: make bug reporting link non-relative
Works when accessed through code browser, blank if accessed via issues/
2016-12-09 10:18:38 -08:00
1e92b7929c Merge pull request #6967 from heyitsanthony/glide-versions
vendor: use version tags if possible
2016-12-08 16:09:41 -08:00
de58a9c733 scripts: use glide update if repo exists in glide.lock 2016-12-08 14:26:29 -08:00
f095334788 vendor: use versions when possible in glide.yaml
Now using tags instead of SHAs
2016-12-08 14:26:08 -08:00
367f513674 Merge pull request #6961 from heyitsanthony/roadmap
ROADMAP: update for 3.2
2016-12-08 13:30:31 -08:00
b713113094 Merge pull request #6962 from gyuho/mispell
grpcproxy: fix minor typo
2016-12-07 18:55:09 -08:00
fcbfff6a00 Merge pull request #6958 from xiang90/reduce_sync
etcdserver: only send v2 sync if ttl keys exist
2016-12-07 18:38:02 -08:00
a98de7efa7 grpcproxy: fix minor typo 2016-12-07 17:08:46 -08:00
69cc9fdd17 Merge pull request #6956 from gyuho/faq
Documentation: add more FAQ questions
2016-12-07 16:25:47 -08:00
7c0ae91d78 Documentation: add more FAQ questions 2016-12-07 16:25:04 -08:00
09252c4e07 ROADMAP: update for 3.2 2016-12-07 16:12:59 -08:00
2f96a68a20 etcdserver: do not send v2 sync if ttl keys do not exist 2016-12-07 14:48:15 -08:00
da3b71b531 Merge pull request #6929 from heyitsanthony/ctx-lease-renew
etcdserver: use context for Renew
2016-12-07 00:05:14 -08:00
96626d0a23 Merge pull request #6957 from coreos/philips-patch-1
Documentation: add blox and chain as users
2016-12-06 20:23:27 -08:00
1bee237acf Documentation: add blox and chain as users 2016-12-06 20:20:40 -08:00
c4e5081562 Merge pull request #6943 from m1093782566/fix-store-test-comments
store: fix store_test.go comments
2016-12-06 16:54:36 -08:00
529806dba1 Merge pull request #6935 from bdarnell/election-test
raft: Fix election "logs converge" test
2016-12-06 16:45:39 -08:00
be1f36d97c v3rpc, etcdserver, leasehttp: ctxize Renew with request timeout
Would retry a few times before returning a not primary error that
the client should never see. Instead, use proper timeouts and
then return a request timeout error on failure.

Fixes #6922
2016-12-06 14:09:57 -08:00
f6042890b7 integration: use RequireLeader for TestV3LeaseFailover
Giving Renew() the default request timeout causes TestV3LeaseFailover
to miss its timing constraints. Since it only needs to wait until the
leader recognizes the leader is lost, use RequireLeader to cancel the
keepalive stream before the request times out.
2016-12-06 14:09:57 -08:00
fdd89df1eb clientv3/integration: test lease keepalive works following quorum loss 2016-12-06 14:09:57 -08:00
cfd10b4bbf Merge pull request #6949 from xiang90/faq
doc: initial faq
2016-12-06 10:08:09 -08:00
58150937c0 doc: initial faq 2016-12-06 08:48:57 -08:00
1b0ffdaff0 Merge pull request #6945 from sttts/sttts-update-ugorji
Update ugorji
2016-12-06 08:05:13 -08:00
9c364efef6 client: update generated ugorji codec 2016-12-06 07:53:47 +01:00
b21731c022 vendor: update ugorji/go 2016-12-06 07:53:47 +01:00
9603d5e31f store: fix store_test.go comments 2016-12-06 09:31:59 +08:00
994e0d2182 Merge pull request #6950 from gyuho/fix-readstatec-deadlock
etcdserver: time out when readStateC is blocking
2016-12-05 16:37:47 -08:00
cbee2b74a3 Merge pull request #6948 from heyitsanthony/fix-metric-deadlock
grpcproxy: fix deadlock in watchbroadcast
2016-12-05 16:17:26 -08:00
3fd1d951f8 etcdserver: time out when readStateC is blocking
Otherwise, it will block forever when the server is overloaded.

Fix https://github.com/coreos/etcd/issues/6891.
2016-12-05 15:34:46 -08:00
91ff6f30b5 grpcproxy: fix deadlock in watchbroadcast
Calling empty() in watchbroadcast methods was trying to
lock the rwmutex when it was already held.

Fixes #6937
2016-12-05 15:06:44 -08:00
2509e7ad2c Merge pull request #6947 from heyitsanthony/grpc-stat-race
grpcproxy: lock store when getting size
2016-12-05 14:30:00 -08:00
8fefd1f471 Merge pull request #6942 from eiipii/eiipiiVersion2ScalaClient
eiipii/etcdhttpclient library added to documentation on external clients
2016-12-05 14:12:47 -08:00
f62ed3d642 Documentation: link added to libraries-and-tools.md with a new v2 Scala
Client
2016-12-05 22:55:17 +01:00
b9b14b15d6 Merge pull request #6946 from heyitsanthony/fix-e2e-getrole
etcdctl: remove GetUser check before mutable commands
2016-12-05 13:34:52 -08:00
62398954e4 grpcproxy: lock store when getting size
Fixes data race in proxy integration tests.
2016-12-05 13:29:57 -08:00
5559a026d7 etcdctl: remove GetUser check before mutable commands
etcdctl was checking if the user exists before applying mutable calls;
if etcdctl contacts a minority member, the member may not know the user
exists on the cluster yet, causing command failure when it should succeed.

If the user does not exist, it will be picked up once the command goes
through raft.

Fixes #6932
2016-12-05 12:12:06 -08:00
2b6ad93036 Merge pull request #6936 from xiang90/put_rate
banchmark: add rate limit
2016-12-05 12:01:15 -08:00
e62e9ce193 benchmark: add rate limit 2016-12-05 09:54:30 -08:00
40f0193c4c Merge pull request #6938 from bdarnell/ispaused
raft: Export Progress.IsPaused
2016-12-03 21:51:09 -08:00
f60a5d6025 raft: Export Progress.IsPaused
CockroachDB would like to use this method for monitoring.
2016-12-04 13:14:08 +08:00
340ba8353c raft: Fix election "logs converge" test
The "logs converge" case in TestLeaderElectionPreVote was incorrectly
passing because some nodes were not actually using the preVoteConfig.
This test case was more complex than its siblings and it was not
verifying what it wanted to verify, so pull it out into a separate test
where everything can be tested more explicitly.

Fixes #6895
2016-12-03 17:29:15 +08:00
d844440ffb Merge pull request #6930 from xiang90/grpc_metrics
grpcproxy: add cache related metrics
2016-12-02 18:30:49 -08:00
0cb680800e grpcproxy: add cache related metrics 2016-12-02 15:29:42 -08:00
1f954dc9f4 Merge pull request #6926 from xiang90/metrics
grpcproxy: add richer metrics for watch
2016-12-02 14:13:43 -08:00
a686c994cd grpcproxy: add richer metrics for watch 2016-12-02 11:13:30 -08:00
f61b4ae5ad Merge pull request #6921 from heyitsanthony/fix-watch-prevkv-test-leak
integration: cancel Watch when TestV3WatchWithPrevKV exits
2016-12-01 15:25:00 -08:00
76bb33781f integration: cancel Watch when TestV3WatchWithPrevKV exits
Missing ctx cancel was causing goroutine leaks for the proxy tests.
2016-12-01 15:08:18 -08:00
9647012cb1 Merge pull request #6920 from endocode/dongsu/sdnotify-go-systemd
vendor: bump go-systemd to v14 to avoid build error
2016-12-01 10:39:40 -08:00
b9e9c9483b Merge pull request #6885 from fanminshi/refractor_lease_checker
etcd-tester: refactor lease checker
2016-12-01 10:11:15 -08:00
5e351956b9 vendor: bump go-systemd to v14 to avoid build error
Bump go-systemd to v14 (48702e0d, 2016-11-14).
Also adjust caller of daemon.SdNotify() to avoid build error, which can
be seen especially when running "go get github.com/coreos/etcd".
2016-12-01 13:26:46 +01:00
5d60482357 Merge pull request #6911 from m1093782566/fix-get-sorted
store: check sorted order in TestStoreGetSorted
2016-11-30 19:49:55 -08:00
4e52b80590 Merge pull request #6916 from heyitsanthony/fix-coalesce-bcast-race
grpcproxy: fix race between coalesce and bcast on nextrev
2016-11-30 19:49:10 -08:00
5f2b5e8b9d store: check sorted order in TestStoreGetSorted 2016-12-01 10:36:23 +08:00
394ab43587 etcd-tester: refactor lease checker
Move few checking logic from lease stresser to lease checker and change connection logic for lease stresser and checker
2016-11-30 17:29:58 -08:00
60908c64a6 grpcproxy: fix race between coalesce and bcast on nextrev
coalesce was locking the target coalesce broadcast object but not the source
broadcast object resulting in a data race on the source's nextrev.
2016-11-30 16:50:29 -08:00
98cd3fddc9 Merge pull request #6907 from heyitsanthony/fix-quota-proxy-failfast
integration: use Range to wait for reboot in quota tests
2016-11-30 16:49:54 -08:00
f1e0525c81 integration: use Range to wait for reboot in quota tests
Proxy client layer ignores call options so Put is always FailFast;
this can lead to connection errors when trying to issue the Put
following restarting the client's target server.
2016-11-30 13:56:30 -08:00
7079bf9a75 Merge pull request #6574 from vimalk78/auth-simpletoken-not-removed#6554
auth/simple_token.go : token not removed when etcdctl session closes …
2016-11-30 11:33:23 -08:00
8eec86f7fb Merge pull request #6888 from fanminshi/use_monotonic_time_for_lease
Use monotonic time in lease
2016-11-29 13:39:01 -08:00
e7f4010cca lease: Use monotonic time in lease
lease uses monotimer to calculate its expiration. In this way, changing system time won't affect in lease expiration.

FIX #6700
2016-11-29 12:31:00 -08:00
cac30beed5 Merge pull request #6906 from heyitsanthony/fix-watchclose-race
grpcproxy: fix race between watch ranges delete() and broadcasts empty()
2016-11-28 16:26:03 -08:00
d680b8b5fb grpcproxy: fix race between watch ranges delete() and broadcasts empty()
Checking empty() wasn't grabbing the broadcasts lock so the race detector
flags it as a data race with coalesce(). Instead, just return the number
of remaining watches following delete() and get rid of empty().
2016-11-28 15:53:41 -08:00
a076510cc1 Merge pull request #6905 from heyitsanthony/client-readme
client: update README about health monitoring
2016-11-28 13:10:57 -08:00
8aa03a5959 Merge pull request #6884 from gyuho/tls
etcdmain: handle TLS in grpc-proxy listener
2016-11-28 12:28:56 -08:00
ad16b63cce client: update README about health monitoring 2016-11-28 12:28:33 -08:00
dfe853ebff auth: add a timeout mechanism to simple token 2016-11-28 17:21:13 +05:30
c31b1ab8d1 Merge pull request #6896 from gyuho/endpoints
clientv3: return copy of endpoints, not pointer
2016-11-23 11:51:48 -08:00
a08103c088 clientv3: return copy of endpoints, not pointer
Fix https://github.com/coreos/etcd/issues/6892.
2016-11-23 11:33:54 -08:00
aea9c6668f Merge pull request #6890 from gyuho/doc
Documentation/op-guide: add notes about 'datasource' in Prometheus
2016-11-22 10:43:30 -08:00
ede51b10f8 op-guide: add notes about Prometheus data source in Grafana 2016-11-22 10:34:41 -08:00
ec5f9bce63 Merge pull request #6886 from fanminshi/fix_dial_grpc
functional-tester: add withBlock() to grpc dial
2016-11-21 11:33:31 -08:00
f7c721b746 Merge pull request #6867 from fanminshi/fix_checking_timeout
etcd-tester: limit max retry backoff delay
2016-11-21 11:20:32 -08:00
2ccba33dd1 functional-tester: add withBlock() to grpc dial
grpc dail withTimeout() only works if withBlock() option is present.
2016-11-21 11:15:12 -08:00
2ac1c4c9ed etcd-tester:limit max retry backoff delay
grpc uses expoential retry if a connection is lost. grpc will sleep base on exponential delay.
if delay is too large, it slows down tester.
2016-11-21 10:58:55 -08:00
ff96769b55 etcdmain: handle TLS in grpc-proxy listener 2016-11-21 10:39:34 -08:00
0326d6fdd3 Merge pull request #6877 from coreos/fix_test
etcd-tester: do not resolve localhost
2016-11-21 09:52:31 -08:00
69470b5e5f Merge pull request #6878 from absolute8511/fix-raftexample-test
raftexample: confState should be saved after apply
2016-11-21 09:51:10 -08:00
d7c98a4695 Merge pull request #6879 from xiang90/raft_test
raft: fix TestNodeProposeAddDuplicateNode
2016-11-20 22:19:44 -08:00
f2eb8560ed raft: fix TestNodeProposeAddDuplicateNode
Only send signal after applying conf change.
Or deadlock might happen if raft node receives
ready without conf change when the test server
is slow.
2016-11-20 21:59:31 -08:00
859142033f Merge pull request #6866 from absolute8511/master
raft: add node should reset the pendingConf state
2016-11-20 21:34:37 -08:00
e6d1ebcc1d raft: use the channel instead of sleep to make test case reliable 2016-11-21 13:30:15 +08:00
bc6f5ad53e raft: fix test case for data race 2016-11-21 10:30:36 +08:00
62bd5477b9 raft: fix test case, should wait config propose applied 2016-11-21 10:10:34 +08:00
16e3ab0f11 raft: test case to check the duplicate add node propose 2016-11-20 16:58:11 +08:00
e8d06d8e4d raftexample: confState should be saved after apply 2016-11-20 16:51:33 +08:00
b1178469be etcd-tester: do not resolve localhost 2016-11-19 18:38:26 -08:00
7e7c7e157e Merge pull request #6873 from heyitsanthony/proxy-v3-watch-canceled-sync
grpcproxy: fix deadlock on watch broadcasts stop
2016-11-18 22:34:35 -08:00
bb4884e957 Merge pull request #6861 from gyuho/grpc-proxy-metrics
etcdmain: add '/metrics' HTTP/1 path to grpc-proxy
2016-11-18 20:03:52 -08:00
a39509ee5b etcdmain: add '/metrics' HTTP/1 path to grpc-proxy 2016-11-18 19:40:06 -08:00
7618fdd1d6 grpcproxy: fix deadlock on watch broadcasts stop
Holding the WatchBroadcasts lock and waiting on donec was
causing a deadlock with the coalesce loop. Was causing
TestV3WatchSyncCancel to hang.
2016-11-18 16:55:26 -08:00
2acf0806fb Merge pull request #6869 from sinsharat/mvcc_remove_unused_restore_method
mvcc: remove unused restore method
2016-11-18 15:52:45 -08:00
c1581732fd Merge pull request #6872 from heyitsanthony/srv-alert
discovery: warn on scheme mismatch
2016-11-18 13:41:34 -08:00
428cb21a3f Merge pull request #6864 from heyitsanthony/watch-doc
Documentation: add grpc gateway watch example
2016-11-18 13:30:16 -08:00
74ae67b835 discovery: warn on scheme mismatch 2016-11-18 13:12:14 -08:00
b7cc698444 version: bump up v3.1.0-rc.1+git 2016-11-18 11:41:29 -08:00
ccf154e706 Documentation: add grpc gateway watch example
Shows how to use watch via grpc gateway.
2016-11-18 11:37:35 -08:00
6d9168a2ec integration: don't expect recv to stop on CloseSend in waitResponse 2016-11-18 11:37:35 -08:00
3d5ba43211 version: bump up v3.1.0-rc.1 2016-11-18 11:16:01 -08:00
7da3019f42 Merge pull request #6862 from gyuho/network-interface
pkg/netutil: get default interface for tc commands
2016-11-18 10:11:59 -08:00
43078d3ced mvcc: remove unused restore method 2016-11-18 23:04:39 +05:30
097cdbd0e4 pkg/netutil: get default interface for tc commands
Fix https://github.com/coreos/etcd/issues/6841.
2016-11-17 22:49:17 -08:00
68b04b7067 Merge pull request #6846 from sinsharat/mvcc_store_restore_timeout_fix
mvcc: store.restore taking too long triggering snapshot cycle fix
2016-11-17 22:06:43 -08:00
456569f45d e2e: add test for v3 watch over grpc gateway 2016-11-17 15:49:58 -08:00
9a20743190 v3rpc: don't close watcher if client closes send
grpc-gateway will CloseSend but still want to receive updates.
2016-11-17 15:33:37 -08:00
4401d88546 raft: add node should reset the pendingConf state
After add node conf proposed twice with the same node id, the pending state is not reset because
the addNode returned without setting the pending state at the second
time and the pending state will always be true unless other conf changed. During this we
can not add any new node because the propose will be ignored since the
pending state is true.
2016-11-17 15:50:13 +08:00
aa2b5aec1b mvcc : Added benchmark for store.resotre 2016-11-17 04:01:15 +05:30
f014cca644 mvcc: TestStoreRestore fix 2016-11-16 16:58:42 +05:30
95fb41a923 mvcc: store.restore taking too long triggering snapshot cycle fix 2016-11-16 16:31:20 +05:30
377f19b003 Merge pull request #6857 from LK4D4/non_block_status
raft: return empty status if node is stopped
2016-11-15 16:44:50 -08:00
7afc490c95 raft: return empty status if node is stopped
If the node is stopped, then Status can hang forever because there is no
event loop to answer. So, just return empty status to avoid deadlocks.

Fix #6855

Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-11-15 15:45:23 -08:00
e55b8485dd Merge pull request #6856 from heyitsanthony/proxy-lease-fix
grpcproxy: copy range request before storing in cache
2016-11-15 15:43:00 -08:00
1358a9d460 grpcproxy: copy range request before storing in cache
Reused Range requests would have Serialized overwritten with 'true'.

Was failing on TestV3LeaseSwitch.
2016-11-15 14:35:00 -08:00
7c8f13aed7 Merge pull request #6852 from heyitsanthony/fix-proxy-dbarrier
grpcproxy: watch next revision should be start revision when not 0
2016-11-15 09:21:18 -08:00
98a7c642d4 grpcproxy: watch next revision should be start revision when not 0
The create header revision is the current etcd revision. For watches with
rev=0, the next revision is hdr.rev+1. For watches with rev=n, the next
revision should be n.

Fixes TestDoubleBarrier timeouts.
2016-11-14 16:46:02 -08:00
677606da7d Merge pull request #6851 from gyuho/metrics
v3rpc: replace grpc metrics w/ go-grpc-prometheus
2016-11-14 16:09:17 -08:00
7cac755df2 op-guide: update gRPC requests metrics 2016-11-14 15:20:16 -08:00
5e810e30cc v3rpc: replace grpc metrics w/ go-grpc-prometheus
And disable histogram
2016-11-14 15:20:09 -08:00
d073512def Merge pull request #6849 from heyitsanthony/proxy-fix-watch-create
grpcproxy: don't send extra watch create events
2016-11-14 12:13:40 -08:00
90ea3fbadc grpcproxy: do not resend create event after leader loss
Only set CreateNotify if no watch responses have been received.
2016-11-14 10:43:06 -08:00
e40da39143 grpcproxy: only coalesce watchers that have received create response
Current watchers may have nextrev=0; check response count instead.
2016-11-14 09:19:02 -08:00
a2e86c1371 Merge pull request #6842 from heyitsanthony/watch-prevkv
grpcproxy: support prevKV watcher
2016-11-11 16:32:51 -08:00
45bba11f12 Merge pull request #6844 from gyuho/grafana
op-guide: add screenshot to sample Grafana dashboard
2016-11-11 16:30:02 -08:00
625366875d op-guide: add screenshot to sample Grafana dashboard 2016-11-11 16:21:15 -08:00
70fd684843 Merge pull request #6843 from gyuho/docs
Documentation/op-guide: add 'monitoring' guide
2016-11-11 16:08:32 -08:00
6d83590434 Documentation/op-guide: add 'monitoring' guide 2016-11-11 15:22:07 -08:00
6604306398 grpcproxy: support prevKV watcher
Makes all server watchers PrevKV, discards if client watcher is not PrevKV.
2016-11-11 14:22:06 -08:00
3c97e7a475 Merge pull request #6800 from sinsharat/add_benchmark_watch_latency
benchmark: added watch-latency
2016-11-11 12:41:26 -08:00
e5b6324771 benchmark: added watch-latency 2016-11-12 01:08:35 +05:30
b4726a9501 Merge pull request #6822 from heyitsanthony/watch-bcast
grpcproxy: rework watcher organization
2016-11-11 10:50:27 -08:00
5af4de0930 Merge pull request #6840 from gyuho/vendor
*: clean up vendor
2016-11-11 10:15:18 -08:00
395cf7de51 grpcproxy: reject invalid watch ranges 2016-11-11 10:14:35 -08:00
ec459c2185 grpcproxy: rework watcher organization
The single watcher / group watcher distinction limited and
complicated watcher coalescing more than necessary. Reworked:

Each server watcher is represented by a WatchBroadcast, each
client "Watcher" attaches to some WatchBroadcast. WatchBroadcasts
hold all WatchBroadcast instances for a range. WatchRanges holds
all WatchBroadcasts for the proxy.

WatchProxyStreams represent a grpc watch stream between the proxy and
a client. When a client requests a new watcher through its grpc stream,
the ProxyStream will allocate a Watcher and WatchRanges assigns it to
some WatchBroadcast based on its range.

Coalescing is done by WatchBroadcasts when it receives an update
notification from a WatchBroadcast.

Supports leader failure detection so watches on a bad member
can migrate to other members. Coincidentally, Fixes #6303.
2016-11-11 10:14:35 -08:00
4d5a12a248 Merge pull request #6839 from xiang90/ctl_v
etcdctl: etcdctl v3 should print out its API version
2016-11-11 10:00:14 -08:00
4b417da1be Merge pull request #6837 from purpleidea/feat/consturls
embed: Make immutable defaults constant
2016-11-11 09:53:55 -08:00
38ce362629 vendor: clean up, remove unnecessary deps 2016-11-11 09:51:07 -08:00
1f7e88d851 glide: rerun updatedep.sh to clean up 2016-11-11 09:50:46 -08:00
1ef243e436 etcdctl: etcdctl v3 should print out its API version 2016-11-11 09:33:20 -08:00
745cd730a7 embed: Make immutable defaults constant
This changes the two immutable defaults into constants which allows
packages embedding etcd to import them as const! If they are variables,
then you'll fail with "const initializer foo is not a constant".
2016-11-11 07:34:45 -05:00
952eb4fade Merge pull request #6833 from gyuho/news
NEWS: update with v3.0.15
2016-11-10 13:34:16 -08:00
7c0035637d NEWS: update with v3.0.15 2016-11-10 13:29:07 -08:00
ef024049df Merge pull request #6832 from gyuho/vendor
*: update all grpc-related dependencies
2016-11-10 13:28:07 -08:00
b8b72f80f9 *: revendor, update proto files 2016-11-10 12:02:00 -08:00
baa4e4ee56 scripts/genproto: update gogo/protobuf, grpc-gateway 2016-11-10 12:02:00 -08:00
8631f47568 glide: update all grpc-related dependencies 2016-11-10 12:01:45 -08:00
0a8e28524b Merge pull request #6779 from xiang90/watch_clean
etcd-runner: clean up watcher runner
2016-11-10 09:59:08 -08:00
0b78ef8de1 Merge pull request #6831 from xiang90/grpc_proxy_doc
doc: add gRPC proxy start doc
2016-11-10 09:34:38 -08:00
b16c93a885 doc: add gRPC proxy start doc 2016-11-10 09:20:13 -08:00
523a859ad9 etcd-runner: clean up watcher runner 2016-11-10 08:56:19 -08:00
1a25b2ff3e Merge pull request #6781 from gyuho/vendor
scripts/updatedep: work around 'testify/assert', remove 'etcd-top'
2016-11-09 16:17:27 -08:00
55d25f6f4d tools: remove 'etcd-top'
Travis CI breaks because of cgo dependencies on 'etcd-top'.
This can leave outside of project.
2016-11-09 15:59:47 -08:00
5b8300f08b store: type-assert int64 for assert tests 2016-11-09 15:59:47 -08:00
859ac6dfd8 vendor: rerun 'updatedep.sh' script, clean up 2016-11-09 15:59:47 -08:00
0f68810505 glide: remove legacy packages from godep
And remove all legacy packages in glide.yaml on sub-dependency.
They were added when we migrated from godep. glide will handle
it automatically with glide.lock file.
2016-11-09 15:59:47 -08:00
4cf5b76d18 scripts/updatedep: work around 'testify/assert'
'glide vc --no-tests' flag removes 'testify/assert' deps
in v2 client. Until we deprecate v2 tests, just copy the
necessary files as workaround.

And remove '--skip-tests' flags in case we add dependencies
in test files.
2016-11-09 15:59:34 -08:00
ab6b175a2a Merge pull request #6828 from fanminshi/add_not_equal_to_compare
etcdserver, clientv3: add "!=" to txn
2016-11-09 15:27:08 -08:00
c2fd42b556 etcdserver, clientv3: add "!=" to txn
adding != to compare is a requested functionality from a etcd user

FIX #6719
2016-11-09 14:28:36 -08:00
4a1e89150b Merge pull request #6827 from heyitsanthony/proxy-txn-invalidate
grpcproxy: update cache based on txn response
2016-11-09 13:16:48 -08:00
3ed63af51a Merge pull request #6826 from feisan/dev-kvkv
clientv3/naming: support OpOption when adding an endpoint
2016-11-09 13:03:56 -08:00
a4dcceb8aa grpcproxy: update cache based on txn response
Fixes more hangs in TestSTMConflict.
2016-11-09 12:11:38 -08:00
c20d31adc5 clientv3/naming: support OpOption when adding an endpoint
if we want to add an endpoint with lease, we need this option.
for example:

    resp, err := cli.Grant(context.TODO(), 5)
    if err != nil {
        log.Fatal(err)
    }

    err = r.Update(context.TODO(), serviceName, naming.Update{Op:naming.Add, Addr: exposedAddr}, clientv3.WithLease(resp.ID))
    if err != nil {
        log.Fatalf(err)
    }
2016-11-09 15:30:17 -04:00
9c7a0a68e5 Merge pull request #6825 from gyuho/new
etcdserver: increase maxGapBetweenApplyAndCommitIndex
2016-11-09 10:32:51 -08:00
c817df1d32 etcdserver: increase maxGapBetweenApplyAndCommitIndex
This exists to prevent sending too many requests that
would lead into applier falling behind Raft accepting-proposal.

Based on recent benchmarks, etcd was able to process high workloads
(2 million writes with 1K concurrent clients).

The limit 1000 is too conservative to test those high workloads.
2016-11-09 09:44:11 -08:00
dbb692e50f Merge pull request #6820 from gyuho/watcher
mvcc: return -1 for wrong watcher range key >= end
2016-11-08 17:36:19 -08:00
0f5d9f00ad Merge pull request #6808 from fanminshi/functional-tester-compaction-deadline-fix
etcd-tester: increase compaction timeout limit
2016-11-08 17:18:40 -08:00
9dd75a946f clientv3, ctlv3: document range end requirement 2016-11-08 17:02:32 -08:00
396a71ee9e integration: test wrong watcher range 2016-11-08 17:02:32 -08:00
425acb28c4 mvcc: return -1 for wrong watcher range key >= end
Fix https://github.com/coreos/etcd/issues/6819.
2016-11-08 17:02:28 -08:00
107d7b663c etcd-tester: changed compaction timeout calculation
functional tester sometime experiences timeout during compaction phase. I changed the timeout calculation base on number of entries created and deleted.

FIX #6805
2016-11-08 17:00:04 -08:00
a93d8dfe62 Merge pull request #6821 from gyuho/manual
*: fix minor typos, styles
2016-11-08 15:39:26 -08:00
510676fea9 Merge pull request #6816 from heyitsanthony/fix-disconn-cancel
clientv3: let watchers cancel when reconnecting
2016-11-08 13:27:50 -08:00
2955d58776 clientv3/integration: fix minor typos, consistent formatting 2016-11-08 12:37:33 -08:00
754daf918b clustering.md: update minor grammar 2016-11-08 12:34:43 -08:00
1a969ffc52 Merge pull request #6812 from feisan/dev-kvkv
Documentation: fixed  typo
2016-11-08 12:33:06 -08:00
1aeeb38459 clientv3: let watchers cancel when reconnecting 2016-11-08 12:02:17 -08:00
7b5e5eadb1 integration: test canceling a watcher on disconnected stream 2016-11-08 12:02:17 -08:00
b2f8d8c397 Merge pull request #6817 from heyitsanthony/build-tag-etcdtop
etcd-top: make build require -tags pcap
2016-11-08 08:12:22 -08:00
57fb2a2b35 vendor: unvendor gopcap so travis CI works 2016-11-07 16:17:52 -08:00
2af31f99c3 etcd-top: make build require -tags pcap
Fixes travis.
2016-11-07 15:54:40 -08:00
97ac128fef Documentation: fixed typo 2016-11-07 19:14:34 -04:00
c9cc1efb67 Merge pull request #6815 from bdarnell/transfer-non-member
raft: Check promotable() in MsgTimeoutNow handling
2016-11-07 10:33:57 -08:00
2f34547d39 raft: Check promotable() in MsgTimeoutNow handling
If MsgTimeoutNow arrived after a node was removed, the node could start
and win an election, then panic in becomeLeader (see
cockroachdb/cockroach#8535)
2016-11-07 20:02:21 +08:00
ecd4803ccc Merge pull request #6809 from hongchaodeng/doc
readme: add 'run etcd on k8s' section
2016-11-04 22:59:42 -07:00
011c452b65 readme: add 'run etcd on k8s' section 2016-11-04 20:28:14 -07:00
352d4fa3fa Merge pull request #6804 from heyitsanthony/stm-conflict
grpcproxy: invalidate comparison keys after txn
2016-11-04 17:04:16 -07:00
476ff67047 Merge pull request #6807 from xiang90/fix_raft_test
rafttest: make raft test reliable
2016-11-04 16:31:32 -07:00
e5987dea37 rafttest: make raft test reliable 2016-11-04 15:55:17 -07:00
91360e1495 Merge pull request #6806 from gyuho/metrics
v3rpc: add 'active' gRPC streamsGauge
2016-11-04 12:28:55 -07:00
67082e5bd1 v3rpc: add gRPC active streamsGauge 2016-11-04 11:09:20 -07:00
bf08a6142c grpcproxy: invalidate comparison keys after txn
If the txn comparison block makes claims about a key's current
state, then it may say a key has been updated. Future range/txn
operations may expect this update to eventually be propagated through
the cluster and show up in serialized requests. To avoid spinning
forever on txn/serialized range loops, invalidate the comparison keys.
2016-11-04 09:46:43 -07:00
27459425fa Merge pull request #6799 from gyuho/log-output
etcdmain: configurable 'etcd' binary log-output
2016-11-03 14:52:53 -07:00
a0d206c51f Merge pull request #6801 from xiang90/fix_snap_test
etcdserver: make snaptest fail fast
2016-11-03 14:49:42 -07:00
6a0a0a7ea1 etcdserver: make snaptest fail fast 2016-11-03 14:44:08 -07:00
6ffd7e3ed1 etcdmain: configurable 'etcd' binary log-output
Fix https://github.com/coreos/etcd/issues/5449.
2016-11-03 14:18:12 -07:00
aa526cd53d Merge pull request #6798 from gyuho/ctl-doc
etcdctl/ctlv3: clarify 'user add' argument (user:password)
2016-11-03 11:06:40 -07:00
31a6efbc13 etcdctl/ctlv3: clarify 'user add' argument (user:password) 2016-11-03 10:47:45 -07:00
f82aac2fc6 Merge pull request #6797 from fanminshi/lease_checker_println_fix
etcd-tester: fix lease checker logging format.
2016-11-03 10:17:54 -07:00
b7ab5c6384 Merge pull request #6788 from fanminshi/lease_http_eof_fix
etcd-tester: add retry logic on retriving lease info
2016-11-03 10:14:36 -07:00
6968028020 etcd-tester: fix lease checker logging format.
lease checker used a wrong print format for a variable. this change fixes it.
2016-11-03 10:11:00 -07:00
649fe7f2af etcd-tester: add retry logic on retriving lease info
getting lease and keys info through raw rpcs rarely experience error such as EOF. This is considered as a failure and causes tester to clean up.
however, they are just transient problem with temporary connection issue which should not be considered as a testing failure. so we add retry logic in case of transient failure.

FIX #6754
2016-11-03 10:05:06 -07:00
b28b38fb6d Merge pull request #6793 from timothysc/no-ttl
Add a no-ttl flag to etcdctl migrate to discard keys on transform.
2016-11-03 09:00:53 -07:00
c5ac02164d Merge pull request #6794 from xiang90/fix_migration
ctlv3: fix migration
2016-11-03 08:43:30 -07:00
97e96feb1d ctlv3: Add a no-ttl flag to etcdctl migrate to discard keys on transform. 2016-11-03 10:41:54 -05:00
4d2ec2fec1 Merge pull request #6792 from gyuho/leasehttp
leasehttp: use graceful close, add tests, remove TODO
2016-11-02 22:55:06 -07:00
bbc1cdafef Merge pull request #6791 from gyuho/grpc-leader
etcdserver: translate EOF to ErrNoLeader for renew, timetolive
2016-11-02 22:54:46 -07:00
cc304ac03c etcdserver: translate EOF to ErrNoLeader for renew, timetolive
Address https://github.com/coreos/etcd/issues/6754.

In case there are network errors or unexpected EOF errors
in TimeToLive http requests to leader, we translate that into
ErrNoLeader, and expects the client to retry its request.
2016-11-02 22:22:05 -07:00
2fb2b463a3 Merge pull request #6786 from mitake/empty-user
auth, etcdserver: forbid adding a user with empty name
2016-11-02 22:10:58 -07:00
f85701a46f auth, etcdserver: forbid adding a user with empty name 2016-11-03 13:45:39 +09:00
2ba42990ec ctlv3: fix migration 2016-11-02 20:00:07 -07:00
c931f4d164 leasehttp: use graceful close, add tests, remove TODO 2016-11-02 16:33:26 -07:00
378257161f Merge pull request #6789 from heyitsanthony/grpcproxy-close-send
grpcproxy: reliably track rid in watchergroups
2016-11-02 15:29:08 -07:00
fe755b6250 Merge pull request #6748 from sinsharat/client_metric_add_tests
clientv3: added test for client metrics
2016-11-02 15:00:08 -07:00
844378f0a7 Merge pull request #6790 from sinsharat/clientv3_metrics_doc_update
clientv3: updated doc for metric support
2016-11-02 14:56:04 -07:00
195570b621 clientv3: updated doc for metric support 2016-11-03 03:22:59 +05:30
8ec4215279 grpcproxy: reliably track rid in watchergroups
Couldn't find watcher group from rid on server stream close, leading to
the watcher group sending on a closed channel.

Also got rid of send closing the watcher stream if the buffer is full,
this could lead to a send after close while broadcasting to all receivers.
Instead, if a send times out then the server stream is canceled.

Fixes #6739
2016-11-02 14:42:02 -07:00
13acad85b3 clientv3: added test for client metrics 2016-11-03 00:38:29 +05:30
7d777a4a64 Merge pull request #6784 from xiang90/lock_warning
etcdserver: print out warning when waiting for file lock
2016-11-02 10:34:52 -07:00
5b7728f3cb Merge pull request #5994 from gyuho/v2-error
etcdctl/ctlv2: error handling with JSON
2016-11-01 21:51:16 -07:00
9b470ef4c0 etcdctl/ctlv2: error handling with JSON 2016-11-01 20:59:13 -07:00
c33d04fb54 etcdserver: print out warning when waiting for file lock 2016-11-01 17:55:16 -07:00
71bad561e8 Merge pull request #6782 from xiang90/v2store
store: do not modify key during scanning
2016-11-01 15:41:03 -07:00
43045500b2 store: do not modify key during scanning 2016-11-01 14:35:53 -07:00
71a533fec3 Merge pull request #6771 from fanminshi/refactor_short_lived_lease_logic
etcd-tester: refactor checking short lived lease logic
2016-11-01 14:17:43 -07:00
4f60f1b71f Merge pull request #6708 from bluepeppers/leader-sync-deadlock
client: Prevent deadlocks in Sync
2016-11-01 14:11:21 -07:00
8a03c95dd4 etcd-tester: refactor checking short lived lease logic
move the logic of waiting lease expired from stresser to checker
2016-11-01 14:06:22 -07:00
b30dc10812 Merge pull request #6770 from heyitsanthony/fix-grpc
grpcproxy: add SetHeader support to ServerStream
2016-11-01 13:55:07 -07:00
7ef17d3e97 grpcproxy: add SetHeader support to ServerStream
Fixes #6726
2016-11-01 13:28:02 -07:00
4575353693 Merge pull request #6768 from gyuho/wtwt
clientv3/integration: close active connection to get ErrClientConnClosing
2016-11-01 12:37:04 -07:00
0684d8c4c6 clientv3/integration: close active connection to get ErrClientConnClosing
because clientv3.Close won't trigger it any more

clientv3.Close just closes watch client
instead of closing grpc connection
2016-11-01 11:13:33 -07:00
94c804b81a Merge pull request #6766 from fanminshi/stabilization-logic-refractoring
functional-tester: remove stablilization limit
2016-11-01 10:43:00 -07:00
de008c8a4a client: prevent deadlock in Sync 2016-11-01 17:26:53 +00:00
c781f30ed5 functional-tester: remove stablilization limit
This change removes the waiting needed to ensure the cluster to be stable.

FIX #6760
2016-11-01 10:01:59 -07:00
db83736b7b Merge pull request #6774 from mitake/linearizable-password-checking
etcdserver: linearizable password checking at the API layer
2016-11-01 07:51:27 -07:00
fdf433024f etcdserver: linearizable password checking at the API layer
For avoiding a schedule that can cause an inconsistent auth store [1],
password checking must be done in a linearizable manner.

Fixes https://github.com/coreos/etcd/issues/6675 and https://github.com/coreos/etcd/issues/6683

[1] https://github.com/coreos/etcd/issues/6675#issuecomment-255006389
2016-11-01 00:02:33 -07:00
136c02da71 Merge pull request #6738 from gyuho/raft-cleanup
etcdserver: move 'EtcdServer.send' to raft.go
2016-10-31 15:15:08 -07:00
b64de4707d Merge pull request #6724 from johnbazan/ctlv3_add_user_with_password_inline
etcdctl: allow to add a user within one command line
2016-10-31 14:31:09 -07:00
d51a7dba43 etcdctl: Adding e2e tests for userAddTest 2016-10-31 18:14:29 -03:00
73b4a58ac0 etcdctl: allow to add a user within one command line
This makes the "user add usr:pwd" feature available for ctlv3
without asking for the password in a new prompt.
2016-10-31 18:14:19 -03:00
4969a0e9e7 Merge pull request #6758 from heyitsanthony/move-checker
etcd-tester: refactor stresser / checker management
2016-10-31 13:59:54 -07:00
308f2a1695 etcd-tester: refactor stresser/checker organization
The checkers and stressers should be composable without special cases; this
patch tries to address that while refactoring out some old cruft.

Namely,
* Single stresser/checker for a tester; built from composition
* Composite stresser via comma-separated list of stressers
* Split stressers into separate files
* Removed v2 only flags and special cases
* Rate limiter shared among key stresser and leases stresser
* Composite checker is now concurrent
* Stresser can return a Checker to check its invariants
* Each lease checker only operates on a single lease stresser
2016-10-31 13:59:04 -07:00
72fc5f7d1b Merge pull request #6765 from xiang90/s
etcd-runner: move string generation to pkg/stringutil
2016-10-31 12:37:18 -07:00
9f0ee53e86 etcd-runner: move string generation to pkg/stringutil 2016-10-31 12:20:02 -07:00
30d37b2165 Merge pull request #6763 from gyuho/spell
*: fix minor typos
2016-10-31 10:43:17 -07:00
5bd00ab1f6 *: fix minor typos 2016-10-31 09:47:15 -07:00
a1a2d2b1e7 Merge pull request #6762 from gyuho/doc
op-guide: 'strict-reconfig-check' true by default
2016-10-31 09:44:57 -07:00
7e06a95942 Merge pull request #6759 from xiang90/tester
etcd-runner: refactor code structure and flag cleanup
2016-10-31 09:07:18 -07:00
4a42c72b5e op-guide: 'strict-reconfig-check' true by default 2016-10-31 07:59:33 -07:00
e5c3978725 etcd-runner: refactor code structure and flag cleanup 2016-10-30 18:45:16 -07:00
86c4a74139 etcd-tester: move stresser and checker to tester
These really belong in tester code; the stressers and
checkers are higher order operations that are orchestrated
by the tester. They're not really cluster primitives.
2016-10-29 10:57:17 -07:00
4a08678ce1 Merge pull request #6749 from gyuho/raft-prevote
raft: do not attach term to MsgReadIndex
2016-10-28 22:29:08 -07:00
cb5c92f69b raft: do not attach term to MsgReadIndex
Fix https://github.com/coreos/etcd/issues/6744.

MsgReadIndex, as MsgProp, is to be forwarded to leader.
So we should treat it as local message.
2016-10-28 22:12:25 -07:00
0345226759 Merge pull request #6751 from heyitsanthony/fix-require-leader-test
integration: put key on watch target member for TestWatchWithRequireLeader
2016-10-28 23:24:00 -04:00
bc3e056b4a Merge pull request #6755 from fanminshi/log-statement-fix
functional-tester: fix log statement
2016-10-28 14:40:11 -07:00
34c906be55 functional-tester: fix log statement
simple fix for wrongly printed statement.
2016-10-28 14:27:09 -07:00
d8ea9d22b6 integration: put key on watch target member for TestWatchWithRequireLeader
It's possible the put will not propagate to all members before removing quorum,
causing watches on the key to wait forever.

Fixes #6386
2016-10-28 13:12:26 -04:00
a0360a83c9 Merge pull request #6745 from heyitsanthony/private-recipe
contrib/recipes: unexport and clean up keys.go
2016-10-28 12:44:58 -04:00
8f718e2e5a contrib/recipes: unexport and clean up keys.go
Fixes #6731
2016-10-28 11:41:13 -04:00
c6cd63dc35 Merge pull request #6747 from sinsharat/client_metric_add_example
clientv3: added example for client metrics
2016-10-27 16:42:18 -07:00
a1bfb31219 clientv3: added example for client metrics 2016-10-28 04:30:17 +05:30
ea05711522 Merge pull request #6746 from fanminshi/tester-recovery-error-fix
functional-tester: always clean up if tester encouters an error
2016-10-27 15:22:07 -07:00
7f5a7d1da5 functional-tester: always clean up if tester encouters an error
The current tester doesn't not clean up if any of the failure injection/recovery fails. if tester fails to recover a dead node, tester hangs in the next round because the tester will keep waiting until cluster becomes healthy which is impossible since a node is down. To fix this issue, we will always clean up if any error happens during each round so that cluster will be healthy for next round.

FIX #6743
2016-10-27 15:07:58 -07:00
89107a49fa Merge pull request #6741 from sinsharat/clientv3_add_client_side_metrics
clientv3: added client side metrics support
2016-10-27 13:22:10 -07:00
1b36162659 Merge pull request #6647 from gyuho/watch
clientv3: send create event over outc
2016-10-27 11:45:22 -07:00
0a3d45a307 clientv3: send create event over outc 2016-10-27 11:11:16 -07:00
8fd1dd7862 clientv3: added client side metrics support 2016-10-27 22:47:45 +05:30
c99a9f4075 clientv3: added entries for go-grpc-prometheus for build 2016-10-27 22:44:19 +05:30
2c974abcb9 clientv3: added go-grpc-prometheus for client meterics 2016-10-27 22:36:05 +05:30
6ec03d3f7c etcdserver: move 'EtcdServer.send' to raft.go
Clear 'TODO'
2016-10-26 16:26:00 -07:00
8825392da2 Merge pull request #6714 from fanminshi/short_term_lease_check
functional-tester: add short lived leases checking
2016-10-26 14:50:55 -07:00
8d9e2623e1 functional-tester: add short lived leases checking
lease stresser now generates short lived leases that will expire before invariant checking.
this addition verifies that the expired leases are indeed being deleted on the sever side.
2016-10-26 14:46:57 -07:00
d7c21e6837 Merge pull request #6737 from fanminshi/lease_expired_fix
functional-tester: increase lease TTL
2016-10-26 10:34:38 -07:00
1dc60bb97e functional-tester: increase lease TTL
increasing lease TTL ensure that lease doesn't expire during hashes stabilization period.
I observed that it can take a long time for etcd cluster to become stable.
2016-10-26 10:32:52 -07:00
c58ae95429 Merge pull request #6732 from JoshRosso/etcdctl-specify-ttl-unit
etcdctl: add ttl unit to flag description
2016-10-25 18:08:02 -07:00
e489229153 etcdctl: add ttl unit to flag description
Add the ttl unit (seconds) to --ttl description for etcdctl's mk, mkdir, set,
setdir, update, and updatedir commands.
2016-10-25 17:12:15 -07:00
12e4dfa9c4 Merge pull request #6715 from fanminshi/lease_hash_fix
Lease hash fix
2016-10-25 16:28:11 -07:00
b398233f4f Merge pull request #6730 from gyuho/round-prefix
etcd-runner: fix typo in round prefix
2016-10-25 16:24:27 -07:00
99539ff031 Merge pull request #6727 from sinsharat/etcd-runner_add_client_timeout_flag
etcd-runner: Added connection timeout flag for client
2016-10-25 16:09:10 -07:00
12488d4a70 etcd-runner: fix typo in round prefix 2016-10-25 15:59:44 -07:00
bb97adda0d functional-tester: add retries to hash checking
allows hashes to converge through retrying.
2016-10-25 14:27:36 -07:00
6e7e346c93 etcd-runner: Added client connection timeout flag 2016-10-26 01:58:22 +05:30
d7bc15300b Merge pull request #6624 from bdarnell/pre-vote
raft: Implement the PreVote RPC described in thesis section 9.6
2016-10-25 13:18:22 -07:00
ddc94aaf9e Merge pull request #6725 from gyuho/btree
vendor: backport 'google/btree' changes
2016-10-25 12:34:52 -07:00
4ba63237ce Merge pull request #6705 from gyuho/dump-db
etcd-dump-db: initial commit
2016-10-25 11:43:42 -07:00
cd618323d0 vendor: backport 'google/btree' changes 2016-10-25 10:53:35 -07:00
924ece6ae7 etcd-dump-db: initial commit 2016-10-25 10:18:51 -07:00
4e1d3f0f52 mvcc: expose 'backend.IgnoreKey' 2016-10-25 10:07:08 -07:00
18739e766a Merge pull request #6721 from sinsharat/etcd_runner_remove_unused_code
etcd-runner: remove unused code and change name for randClient
2016-10-25 10:01:20 -07:00
efeceef0ca Merge pull request #6703 from FedericoCeratto/patch-1
Add etcd client for Nim
2016-10-25 09:28:08 -07:00
8d5e969f12 raft: Separate test methods for vote and pre-vote tests 2016-10-25 23:31:44 +09:00
69ae49a1dd etcd-runner: remove unused code and change name for randClient 2016-10-25 19:16:27 +05:30
f2cff42cb8 libraries-and-tools: add Nim client 2016-10-25 12:20:42 +01:00
8e5f34fd97 Merge pull request #6709 from yudai/error_url
clientv3: Fix URL to rpc errors
2016-10-24 18:52:46 -07:00
e7e29ba249 clientv3: Fix URL to rpc errors 2016-10-24 16:38:05 -07:00
c549978b8e Merge pull request #6712 from doodles526/change_boom_to_hey
hack/benchmark: change boom to hey
2016-10-24 14:51:58 -07:00
4eefdaa4bb hack/benchmark: change boom to hey
boom has moved to hey, due to a conflicting binary name with another
project
2016-10-24 13:28:53 -07:00
0bb2384547 Merge pull request #6707 from mkumatag/ppc64le_support
Add ppc64le travis builds
2016-10-24 12:33:21 -07:00
77be124391 Merge pull request #6691 from xiang90/fix_retry
clientv3: do not retry on mutable operations
2016-10-24 10:32:34 -07:00
0642b4e61e travis: Add ppc64le travis builds 2016-10-24 22:46:14 +05:30
a1afb21e33 Merge pull request #6710 from fasaxc/report-cluster-id
client: Return the server's cluster ID as part of the Response
2016-10-24 09:46:57 -07:00
06e2ce116c Merge pull request #6704 from heyitsanthony/proxy-broadcast-race
grpcproxy: fix race on watcher revision
2016-10-24 09:17:29 -07:00
ae99c91903 Merge pull request #6698 from heyitsanthony/session-close
concurrency: terminate session.Close if revoke takes longer than TTL
2016-10-24 09:14:09 -07:00
43df091067 client: Return the server's cluster ID as part of the Response
This allows the client to spot if the cluster ID changes, which
would indicate that the cluster has been rebuilt and watches may be
out of sync.

Helps work around #6652.
2016-10-24 14:51:00 +01:00
22aa710c1f raft: Improve comments and formatting for PreVote change 2016-10-24 22:29:33 +09:00
81f151eed2 clientv3: fix retry logic
1. Balancer should setup gRPC error code correctly for retry.

2. We should not mask context error.
2016-10-22 22:15:43 -07:00
92c987f75d Merge pull request #6695 from sinsharat/watch_runner_respect_rounds
etcd-runner: watcher runner respect rounds
2016-10-21 20:28:29 -07:00
90146d863c etcd-runner: watcher runner respect rounds 2016-10-22 05:00:10 +05:30
cb3a5eaff1 Merge pull request #6702 from heyitsanthony/minmax-proxy
grpcproxy: respect {min,max}{create,mod} revision
2016-10-21 16:29:40 -07:00
a5c93840b4 Merge pull request #6701 from heyitsanthony/compact-resume
integration: account for unsynced server in TestWatchResumeCompacted
2016-10-21 16:29:27 -07:00
f38a5d19a8 concurrency: add WithContext option to sessions
Makes it possible to cancel session requests without having to
close the entire client.
2016-10-21 16:26:59 -07:00
1e330a90c7 concurrency: terminate session.Close if revoke takes longer than TTL
Fixes #6681
2016-10-21 16:21:01 -07:00
bd1985d84b grpcproxy: fix race on watcher revision
Was racing between broadcast setting the watchgroup revision
and joining single watchers.
2016-10-21 16:09:39 -07:00
65eb3038fe grpcproxy: respect {min,max}{create,mod} revision
Mutexes were breaking in proxy integration tests.
2016-10-21 15:02:00 -07:00
8f3abda5b8 integration: account for unsynced server in TestWatchResumeCompacted
The watch's etcd server is shutdown to keep the watch in a retry state as
keys are put and compacted on the cluster. When the server restarts,
there is a window where the compact hasn't been applied which may cause
the watch to receive all events instead of only a compaction error.

Fixes #6535
2016-10-21 13:42:10 -07:00
21e65eec08 Merge pull request #6692 from fanminshi/lease_expire_compact_fix
functional-tester: add rate limiter to lease stresser
2016-10-21 12:47:27 -07:00
d582fdcc1b functional-tester: add rate limiter to lease stresser
too many leases created can cause compaction to timeout. adding a rate limiter limits number of leases and attched keys.
2016-10-21 12:34:49 -07:00
1ad038d02e Merge pull request #6662 from gyuho/db-panic
backend: skip *bolt.DB.Size call when nil
2016-10-21 11:38:54 -07:00
ef9d55800f integration: test inflight Hash call on nil db 2016-10-21 11:02:54 -07:00
994e8e4f40 mvcc: test inflight Hash to trigger Size on nil db 2016-10-21 11:02:09 -07:00
7d30326968 backend: skip *bolt.DB.Size call when nil
Fix https://github.com/coreos/etcdlabs/issues/30.
2016-10-21 11:01:23 -07:00
791aeb39a6 Merge pull request #6653 from gyuho/acbuild
acbuild: add symlinks to /usr/local/bin/etcd*
2016-10-21 10:48:08 -07:00
60c0a5503e Merge pull request #6636 from heyitsanthony/watch-resume-close
clientv3: only receive from closing streams in Watcher close
2016-10-21 10:06:03 -07:00
b72a413b71 Merge pull request #6697 from gyuho/fmt
*: fix gofmt issues with go tip
2016-10-20 17:04:39 -07:00
0626ee048e rafthttp: fix gofmt issues with go tip 2016-10-20 16:32:56 -07:00
46716fe9fb mvcc: fix gofmt issues from Go tip 2016-10-20 16:32:47 -07:00
161eb2c457 Merge pull request #6696 from fanminshi/lease_expire_fix
functional-tester: modify lease renew logic
2016-10-20 16:25:04 -07:00
c100e40715 clientv3: only receive from closing streams in Watcher close
Was overcounting the number of expected closing messages; the resuming
list may have nil entries. Also the full client wasn't closing the watcher
client, only canceling its context, so client closes weren't joining with
the watcher shutdown.

Fixes #6605
2016-10-20 15:33:11 -07:00
a66c25121b integration: stress closing while resuming watchers 2016-10-20 15:33:11 -07:00
a25d4ac821 functional-tester: modify lease renew logic
only renew a lease if the lease is present.
2016-10-20 15:27:46 -07:00
a2cfb56581 Merge pull request #6689 from fanminshi/function-tester-ensure-etcd-fullly-restarted
functional-tester: add logic to ensure etcd node is alive after fault recovery returns
2016-10-20 13:30:48 -07:00
0f1eb14374 Merge pull request #6694 from gyuho/travis
travis: test with Go 1.7.3
2016-10-20 12:04:01 -07:00
dd1920883c Merge pull request #6693 from philips/fix-nb
Documentation: admin guide remove NB
2016-10-20 11:58:32 -07:00
9e6912fe82 travis: test with Go 1.7.3
Go 1.7.3 released.
2016-10-20 11:56:10 -07:00
e719d8641e Merge pull request #6688 from gyuho/compact-rev
e2e: compact with latest rev in alarm test
2016-10-20 11:54:04 -07:00
970abbb60a Documentation: admin guide remove NB
I have no idea what NB means but just change it to Note
2016-10-20 11:47:41 -07:00
9bfbc12d7d e2e: compact with latest rev in alarm test
Fix https://github.com/coreos/etcd/issues/6677.
2016-10-20 11:06:30 -07:00
a47797fdf1 Merge pull request #6690 from hongchaodeng/f
etcdctl: fix migrate in outputing client.Node to json
2016-10-20 10:50:58 -07:00
9205a242b9 clientv3: do not retry on mutable operations 2016-10-20 10:48:10 -07:00
94ea82c00d functional-tester: add logic to ensure etcd node is alive after fault recovery returns
failure recovery needs to wait etcd node to become alive before returning

FIX #6654
2016-10-20 10:31:08 -07:00
b3f0eeabe4 etcdctl: fix migrate in outputing client.Node to json
Using printf will try to parse the string and replace special
characters. In migrate code, we want to just output the raw
json string of client.Node.
For example,
    Printf("%\\") => %!\(MISSING)
    Print("%\\") => %\
Thus, we should use print instead.
2016-10-20 10:03:45 -07:00
6b1b13eabb Merge pull request #6687 from sinsharat/build_add_option_for_binary_stripping
build: add option to enable binaries stripping for windows
2016-10-19 13:25:25 -07:00
4dab78e72c Merge pull request #6680 from sinsharat/etcd_runner_make_run_watcher_fail_safe
etcd-runner: make run watcher fail safe
2016-10-19 13:24:14 -07:00
751a8d5b04 build: add option to enable binaries stripping for windows 2016-10-20 00:52:57 +05:30
50523e22d8 etcd-runner: make run watcher fail safe 2016-10-20 00:23:35 +05:30
e95b571e7c Merge pull request #6684 from gyuho/build-with-strip
release: build binary without symbols for debug
2016-10-19 10:33:50 -07:00
0bd9179835 release: build binary without symbols for debug 2016-10-19 09:45:10 -07:00
28a29d9ecd Merge pull request #6676 from nekto0n/build_args
build: add option to enable binaries stripping
2016-10-19 09:33:22 -07:00
46d4ff823f Merge pull request #6678 from manishrjain/master
raft: Add dgraph to the list of users
2016-10-19 09:31:07 -07:00
401ef96ace Merge pull request #6682 from sinsharat/update_txn_interactive_cmd_output
etcdctlv3: update txn interactive command output
2016-10-19 09:26:40 -07:00
00837b0736 etcdctlv3: update txn interactive command output 2016-10-19 19:55:09 +05:30
cf93a74aa8 raft: Refactor vote handling
Move all vote handling from the per-state step functions to the
top-level Step(). This wasn't necessary before because MsgVote would
cause us to become a follower, but MsgPreVote needs to be handled
without changing the node's current state.
2016-10-19 19:35:21 +08:00
73cae7abd0 raft: Implement the PreVote RPC described in thesis section 9.6
This prevents disruption when a node that has been partitioned
away rejoins the cluster.

Fixes #6522
2016-10-19 19:35:20 +08:00
ca87a13b18 raft: More realistic terms in tests
Some tests were starting nodes with a non-empty log but a term of zero,
which cannot happen in the real world. This was affecting the final term
being tested in TestLeaderElection.
2016-10-19 19:35:20 +08:00
10cead3139 test: Ignore gopath.proto in test script 2016-10-19 19:35:20 +08:00
255670106f raft: Add dgraph to the list of users
Because Dgraph is a notable user of RAFT.
2016-10-19 17:26:51 +11:00
c6ebc13b43 build: build unstripped binaries by default 2016-10-19 11:15:38 +05:00
11c38fb1eb Merge pull request #6661 from manishrjain/startnode
Update README to explain starting a single node cluster and joining it.
2016-10-18 21:03:28 -07:00
e69c2fd382 raft: update README to explain starting a single node cluster and joining it
this PR helps clients of RAFT set up the cluster correctly, when they're
starting with a single node cluster.
2016-10-19 14:09:48 +11:00
c9b7fc46ff Merge pull request #6672 from gyuho/etcdctl-sort-by
*: sort by ASCEND on missing sort order
2016-10-18 17:07:38 -07:00
f550af7ef4 integration: test sort ASCEND by default in range 2016-10-18 16:50:30 -07:00
4de2128344 clientv3/integration: test missing sort order get 2016-10-18 16:29:22 -07:00
3a6d4b7f12 e2e: test sort ASCEND when sort target is missing 2016-10-18 16:29:22 -07:00
1cd6fefd49 etcdserver: set sort ASCEND for empty sort order
when target is not key
2016-10-18 16:29:19 -07:00
20bdb315f5 Merge pull request #6670 from fanminshi/lease_stressor
functional-tester: add lease stresser
2016-10-18 14:38:42 -07:00
ab2b58a80f functional-tester: add lease stresser
Add lease stresser to test lease code under stress and etcd failures

resolve #6380
2016-10-18 14:20:26 -07:00
ed75d93625 Merge pull request #6666 from fanminshi/function-tester-refractor
functional-tester: move checker logic to cluster
2016-10-18 11:44:37 -07:00
7d86d1050e functional-tester: move checker logic to cluster
I move the checker logic from tester to cluster so that stressers and checkers can be initialized at the same time.
this is useful because some checker depends on stressers.
2016-10-18 11:17:40 -07:00
5c60478953 Merge pull request #6656 from yudai/balancer_fast_fail
clientv3: make balancer respect FastFail
2016-10-17 15:04:04 -07:00
6a33f0ffd5 clientv3: make balancer respect FastFail
The simpleBalancer.Get() blocks grpc.Invoke() even when the Invoke() is called
with the FailFast option. Therefore currently any requests with the
FastFail option actually doesn't fail fast. They get blocked when there is
no endpoints available.
Get() method needs to respect the BlockingWait option when
picks up an endpoint address from the list and fail immediately when the option is
enabled and no endpoint is available.
2016-10-17 14:11:51 -07:00
24c284160b Merge pull request #6635 from sinsharat/etcd_runner_add_watcher_runner
etcd-runner:added watch runner
2016-10-17 11:02:06 -07:00
8297322176 etcd-runner:added watch runner 2016-10-17 23:04:33 +05:30
7022d2d00c Merge pull request #6660 from gyuho/delete-all-keys
etcdctl/ctlv3: support del all keys with '--prefix'
2016-10-17 09:57:19 -07:00
75a65e1a70 e2e: add test cases for del all keys 2016-10-17 09:34:21 -07:00
fac20b228d ctlv3: support del all keys by '--prefix' 2016-10-17 09:33:59 -07:00
5457c029d7 Merge pull request #6640 from mitake/bcrypt-async
auth, etcdserver: check password at API layer
2016-10-17 09:24:34 -07:00
39e9b1f75a auth, etcdserver: check password at API layer
The cost of bcrypt password checking is quite high (almost 100ms on a
modern machine) so executing it in apply loop will be
problematic. This commit exclude the checking mechanism to the API
layer. The password checking is validated with the OCC like way
similar to the auth of serializable get.

This commit also removes a unit test of Authenticate RPC from
auth/store_test.go. It is because the RPC now accepts an auth request
unconditionally and delegates the checking functionality to
authStore.CheckPassword() (so a unit test for CheckPassword() is
added). The combination of the two functionalities can be tested by
e2e (e.g. TestCtlV3AuthWriteKey).

Fixes https://github.com/coreos/etcd/issues/6530
2016-10-17 14:18:21 +09:00
cc96f91156 Merge pull request #6659 from kragniz/python-client
Add link to python-etcd3
2016-10-16 15:20:37 -07:00
1e29715185 Documentation: add link to python-etcd3 2016-10-16 20:55:53 +01:00
e1547a775b Merge pull request #6646 from MartyMacGyver/windows_build_cleanup
build: Windows build cleanup
2016-10-14 18:57:36 -07:00
698a789644 Merge pull request #6655 from kragniz/range_end-docs
etcdserver: document DeleteRangeRequest prefixes
2016-10-14 15:00:24 -07:00
f2b953d4f7 version: bump up to v3.1.0-rc.0+git 2016-10-14 14:53:13 -07:00
ce6276a2e8 etcdserver: document DeleteRangeRequest prefixes
There was missing info about deleting prefixes in the proto docs for
DeleteRangeRequest.

Closes #6641.
2016-10-14 21:39:03 +01:00
522be31192 acbuild: add symlinks to /usr/local/bin/etcd*
And uses latest acbuild (v0.4.0, --to-dir flag is deprecated).

For https://github.com/coreos/etcd/issues/6057.
2016-10-14 10:35:26 -07:00
296427fc78 build: Windows build cleanup
Remove spurious warnings and prompts
Make normal output quieter
Add filesys check (warn and exit on FAT* systems)

Fixes #5866
2016-10-14 02:07:53 -07:00
052e314372 build: Windows build cleanup
Remove spurious warnings and prompts; make output more informative

Fixes #5866
2016-10-13 13:05:24 -07:00
840 changed files with 70769 additions and 20147 deletions

View File

@ -5,4 +5,4 @@ A good bug report has some very specific qualities, so please read over our shor
To ask a question, go ahead and ignore this.
[report_bugs]: ../Documentation/reporting_bugs.md
[report_bugs]: https://github.com/coreos/etcd/blob/master/Documentation/reporting_bugs.md

View File

@ -2,4 +2,4 @@
Please read our [contribution workflow][contributing] before submitting a pull request.
[contributing]: ../CONTRIBUTING.md#contribution-flow
[contributing]: https://github.com/coreos/etcd/blob/master/CONTRIBUTING.md#contribution-flow

16
.semaphore.sh Executable file
View File

@ -0,0 +1,16 @@
#!/usr/bin/env bash
TEST_SUFFIX=$(date +%s | base64 | head -c 15)
TEST_OPTS="PASSES='build unit release integration_e2e functional' MANUAL_VER=v3.1.12"
if [ "$TEST_ARCH" == "386" ]; then
TEST_OPTS="GOARCH=386 PASSES='build unit integration_e2e'"
fi
docker run \
--rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd \
gcr.io/etcd-development/etcd-test:go1.8.7 \
/bin/bash -c "${TEST_OPTS} ./test 2>&1 | tee test-${TEST_SUFFIX}.log"
! egrep "(--- FAIL:|panic: test timed out|appears to have leaked)" -B50 -A10 test-${TEST_SUFFIX}.log

View File

@ -1,60 +1,89 @@
dist: trusty
language: go
go_import_path: github.com/coreos/etcd
sudo: false
sudo: required
services: docker
go:
- 1.7.1
- tip
- "1.8.7"
- tip
notifications:
on_success: never
on_failure: never
env:
global:
- GO15VENDOREXPERIMENT=1
matrix:
- TARGET=amd64
- TARGET=arm64
- TARGET=arm
- TARGET=386
- TARGET=amd64
- TARGET=amd64-go-tip
- TARGET=darwin-amd64
- TARGET=windows-amd64
- TARGET=arm64
- TARGET=arm
- TARGET=386
- TARGET=ppc64le
matrix:
fast_finish: true
allow_failures:
- go: tip
- go: tip
env: TARGET=amd64-go-tip
exclude:
- go: "1.8.7"
env: TARGET=amd64-go-tip
- go: tip
env: TARGET=amd64
- go: tip
env: TARGET=darwin-amd64
- go: tip
env: TARGET=windows-amd64
- go: tip
env: TARGET=arm
- go: tip
env: TARGET=arm64
- go: tip
env: TARGET=386
addons:
apt:
packages:
- libpcap-dev
- libaspell-dev
- libhunspell-dev
- go: tip
env: TARGET=ppc64le
before_install:
- go get -v github.com/chzchzchz/goword
- go get -v honnef.co/go/simple/cmd/gosimple
- go get -v honnef.co/go/unused/cmd/unused
- if [[ $TRAVIS_GO_VERSION == 1.* ]]; then docker pull gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION}; fi
# disable godep restore override
install:
- pushd cmd/etcd && go get -t -v ./... && popd
- pushd cmd/etcd && go get -t -v ./... && popd
script:
- echo "TRAVIS_GO_VERSION=${TRAVIS_GO_VERSION}"
- >
case "${TARGET}" in
amd64)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 ./test"
;;
amd64-go-tip)
GOARCH=amd64 ./test
;;
darwin-amd64)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GO_BUILD_FLAGS='-a -v' GOOS=darwin GOARCH=amd64 ./build"
;;
windows-amd64)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GO_BUILD_FLAGS='-a -v' GOOS=windows GOARCH=amd64 ./build"
;;
386)
GOARCH=386 PASSES="build unit" ./test
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=386 PASSES='build unit' ./test"
;;
*)
# test building out of gopath
GO_BUILD_FLAGS="-a -v" GOPATH="" GOARCH="${TARGET}" ./build
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GO_BUILD_FLAGS='-a -v' GOARCH='${TARGET}' ./build"
;;
esac

View File

@ -5,6 +5,12 @@ ADD etcdctl /usr/local/bin/
RUN mkdir -p /var/etcd/
RUN mkdir -p /var/lib/etcd/
# Alpine Linux doesn't use pam, which means that there is no /etc/nsswitch.conf,
# but Golang relies on /etc/nsswitch.conf to check the order of DNS resolving
# (see https://github.com/golang/go/commit/9dee7771f561cf6aee081c0af6658cc81fac3918)
# To fix this we just create /etc/nsswitch.conf and add the following line:
RUN echo 'hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4' >> /etc/nsswitch.conf
EXPOSE 2379 2380
# Define default command.

57
Dockerfile-test Normal file
View File

@ -0,0 +1,57 @@
FROM ubuntu:16.10
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
RUN apt-get -y update \
&& apt-get -y install \
build-essential \
gcc \
apt-utils \
pkg-config \
software-properties-common \
apt-transport-https \
libssl-dev \
sudo \
bash \
curl \
wget \
tar \
git \
netcat \
libaspell-dev \
libhunspell-dev \
hunspell-en-us \
aspell-en \
shellcheck \
&& apt-get -y update \
&& apt-get -y upgrade \
&& apt-get -y autoremove \
&& apt-get -y autoclean
ENV GOROOT /usr/local/go
ENV GOPATH /go
ENV PATH ${GOPATH}/bin:${GOROOT}/bin:${PATH}
ENV GO_VERSION REPLACE_ME_GO_VERSION
ENV GO_DOWNLOAD_URL https://storage.googleapis.com/golang
RUN rm -rf ${GOROOT} \
&& curl -s ${GO_DOWNLOAD_URL}/go${GO_VERSION}.linux-amd64.tar.gz | tar -v -C /usr/local/ -xz \
&& mkdir -p ${GOPATH}/src ${GOPATH}/bin \
&& go version
RUN mkdir -p ${GOPATH}/src/github.com/coreos/etcd
WORKDIR ${GOPATH}/src/github.com/coreos/etcd
ADD ./scripts/install-marker.sh /tmp/install-marker.sh
RUN go get -v -u -tags spell github.com/chzchzchz/goword \
&& go get -v -u github.com/coreos/license-bill-of-materials \
&& go get -v -u honnef.co/go/tools/cmd/gosimple \
&& go get -v -u honnef.co/go/tools/cmd/unused \
&& go get -v -u honnef.co/go/tools/cmd/staticcheck \
&& go get -v -u github.com/wadey/gocovmerge \
&& go get -v -u github.com/gordonklaus/ineffassign \
&& /tmp/install-marker.sh amd64 \
&& rm -f /tmp/install-marker.sh \
&& curl -s https://codecov.io/bash >/codecov \
&& chmod 700 /codecov

1
Documentation/README.md Symbolic link
View File

@ -0,0 +1 @@
docs.md

View File

@ -14,7 +14,7 @@ GCE n1-highcpu-2 machine type
## Testing
Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
Bootstrap another machine and use the [hey HTTP benchmark tool][hey] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
## Performance
@ -48,5 +48,5 @@ Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send r
| 256 | 64 | all servers | 1033 | 121.5 |
| 256 | 256 | all servers | 3061 | 119.3 |
[boom]: https://github.com/rakyll/boom
[hack-benchmark]: /hack/benchmark/
[hey]: https://github.com/rakyll/hey
[hack-benchmark]: https://github.com/coreos/etcd/tree/master/hack/benchmark

View File

@ -24,7 +24,7 @@ Go OS/Arch: linux/amd64
## Testing
Bootstrap another machine, outside of the etcd cluster, and run the [`boom` HTTP benchmark tool](https://github.com/rakyll/boom) with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions](../../hack/benchmark/) for the patch and the steps to reproduce our procedures.
Bootstrap another machine, outside of the etcd cluster, and run the [`hey` HTTP benchmark tool](https://github.com/rakyll/hey) with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions](../../hack/benchmark/) for the patch and the steps to reproduce our procedures.
The performance is calulated through results of 100 benchmark rounds.
@ -66,4 +66,4 @@ The performance is calulated through results of 100 benchmark rounds.
- Write QPS to cluster leaders seems to be increased by a small margin. This is because the main loop and entry apply loops were decoupled in the etcd raft logic, eliminating several blocks between them.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.

View File

@ -24,7 +24,7 @@ Also, we use 3 etcd 2.1.0 alpha-stage members to form cluster to get base perfor
## Testing
Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
Bootstrap another machine and use the [hey HTTP benchmark tool][hey] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
## Performance
@ -66,7 +66,7 @@ Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send r
- write QPS to all servers is increased by 30~80% because follower could receive latest commit index earlier and commit proposals faster.
[boom]: https://github.com/rakyll/boom
[hey]: https://github.com/rakyll/hey
[c7146bd5]: https://github.com/coreos/etcd/commits/c7146bd5f2c73716091262edc638401bb8229144
[etcd-2.1-benchmark]: etcd-2-1-0-alpha-benchmarks.md
[hack-benchmark]: /hack/benchmark/
[hack-benchmark]: ../../hack/benchmark/

View File

@ -39,4 +39,4 @@ The performance is nearly the same as the one with empty server handler.
The performance with empty server handler is not affected by one put. So the
performance downgrade should be caused by storage package.
[etcd-v3-benchmark]: /tools/benchmark/
[etcd-v3-benchmark]: ../../tools/benchmark/

View File

@ -8,6 +8,8 @@ etcd v3 uses [gRPC][grpc] for its messaging protocol. The etcd project includes
The gateway accepts a [JSON mapping][json-mapping] for etcd's [protocol buffer][api-ref] message definitions. Note that `key` and `value` fields are defined as byte arrays and therefore must be base64 encoded in JSON.
Use `curl` to put and get a key:
```bash
<<COMMENT
https://www.base64encode.org/
@ -17,11 +19,24 @@ COMMENT
curl -L http://localhost:2379/v3alpha/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"}}
curl -L http://localhost:2379/v3alpha/kv/range \
-X POST -d '{"key": "Zm9v"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}],"count":"1"}
```
Use `curl` to watch a key:
```bash
curl http://localhost:2379/v3alpha/watch \
-X POST -d '{"create_request": {"key":"Zm9v"} }' &
# {"result":{"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"1","raft_term":"2"},"created":true}}
curl -L http://localhost:2379/v3alpha/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}' >/dev/null 2>&1
# {"result":{"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"2"},"events":[{"kv":{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}}]}}
```
## Swagger

View File

@ -427,7 +427,7 @@ Empty field.
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the first key to delete in the range. | bytes |
| range_end | range_end is the key following the last key to delete for the range [key, range_end). If range_end is not given, the range is defined to contain only the key argument. If range_end is '\0', the range is all keys greater than or equal to the key argument. | bytes |
| range_end | range_end is the key following the last key to delete for the range [key, range_end). If range_end is not given, the range is defined to contain only the key argument. If range_end is one bit larger than the given key, then the range is all the all keys with the prefix (the given key). If range_end is '\0', the range is all keys greater than or equal to the key argument. | bytes |
| prev_kv | If prev_kv is set, etcd gets the previous key-value pairs before deleting it. The previous key-value pairs will be returned in the delte response. | bool |
@ -762,7 +762,7 @@ From google paxosdb paper: Our implementation hinges around a powerful primitive
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the key to register for watching. | bytes |
| range_end | range_end is the end of the range [key, range_end) to watch. If range_end is not given, only the key argument is watched. If range_end is equal to '\0', all keys greater than or equal to the key argument are watched. | bytes |
| range_end | range_end is the end of the range [key, range_end) to watch. If range_end is not given, only the key argument is watched. If range_end is equal to '\0', all keys greater than or equal to the key argument are watched. If the range_end is one bit larger than the given key, then all keys with the prefix (the given key) will be watched. | bytes |
| start_revision | start_revision is an optional revision to watch from (inclusive). No start_revision is "now". | int64 |
| progress_notify | progress_notify is set so that the etcd server will periodically send a WatchResponse with no events to the new watcher if there are no recent events. It is useful when clients wish to recover a disconnected watcher starting from a recent known revision. The etcd server may decide how often it will send notifications based on current load. | bool |
| filters | filter out put event. filter out delete event. filters filter the events at server side before it sends back to the watcher. | (slice of) FilterType |

View File

@ -978,7 +978,8 @@
"enum": [
"EQUAL",
"GREATER",
"LESS"
"LESS",
"NOT_EQUAL"
],
"default": "EQUAL"
},
@ -1518,7 +1519,7 @@
"range_end": {
"type": "string",
"format": "byte",
"description": "range_end is the key following the last key to delete for the range [key, range_end).\nIf range_end is not given, the range is defined to contain only the key argument.\nIf range_end is '\\0', the range is all keys greater than or equal to the key argument."
"description": "range_end is the key following the last key to delete for the range [key, range_end).\nIf range_end is not given, the range is defined to contain only the key argument.\nIf range_end is one bit larger than the given key, then the range is all\nthe all keys with the prefix (the given key).\nIf range_end is '\\0', the range is all keys greater than or equal to the key argument."
}
}
},

View File

@ -1,8 +1,11 @@
# Experimental APIs and features
For the most part, the etcd project is stable, but we are still moving fast! We believe in the release fast philosophy. We want to get early feedback on features still in development and stabilizing. Thus, there are, and will be more, experimental features and APIs. We plan to improve these features based on the early feedback from the community, or abandon them if there is little interest, in the next few releases. If you are running a production system, please do not rely on any experimental features or APIs.
For the most part, the etcd project is stable, but we are still moving fast! We believe in the release fast philosophy. We want to get early feedback on features still in development and stabilizing. Thus, there are, and will be more, experimental features and APIs. We plan to improve these features based on the early feedback from the community, or abandon them if there is little interest, in the next few releases. Please do not rely on any experimental features or APIs in production environment.
## The current experimental API/features are:
- v3 auth API: expect to be stable in 3.1 release
- etcd gateway: expect to be stable in 3.1 release
- [gateway][gateway]: beta, to be stable in 3.2 release
- [gRPC proxy][grpc-proxy]: alpha, to be stable in 3.2 release
[gateway]: ../op-guide/gateway.md
[grpc-proxy]: ../op-guide/grpc_proxy.md

View File

@ -9,7 +9,7 @@ The etcd client provides a gRPC resolver for resolving gRPC endpoints with an et
```go
import (
"github.com/coreos/etcd/clientv3"
etcdnaming "github.com/coroes/etcd/clientv3/naming"
etcdnaming "github.com/coreos/etcd/clientv3/naming"
"google.golang.org/grpc"
)

View File

@ -51,6 +51,7 @@ Suppose the etcd cluster has stored the following keys:
```bash
foo = bar
foo1 = bar1
foo2 = bar2
foo3 = bar3
```
@ -77,22 +78,38 @@ $ etcdctl get foo --print-value-only
bar
```
Here is the command to range over the keys from `foo` to `foo9`:
Here is the command to range over the keys from `foo` to `foo3`:
```bash
$ etcdctl get foo foo9
$ etcdctl get foo foo3
foo
bar
foo1
bar1
foo2
bar2
```
Note that `foo3` is excluded since the range is over the half-open interval `[foo, foo3)`, excluding `foo3`.
Here is the command to range over all keys prefixed with `foo`:
```bash
$ etcdctl get --prefix foo
foo
bar
foo1
bar1
foo2
bar2
foo3
bar3
```
Here is the command to range over the keys from `foo` to `foo9` limiting the number of results to 2:
Here is the command to range over all keys prefixed with `foo`, limiting the number of results to 2:
```bash
$ etcdctl get foo foo9 --limit 2
$ etcdctl get --prefix --limit=2 foo
foo
bar
foo1
@ -116,29 +133,29 @@ foo1 = bar1_new # revision = 5
Here are an example to access the past versions of keys:
```bash
$ etcdctl get foo foo9 # access the most recent versions of keys
$ etcdctl get --prefix foo # access the most recent versions of keys
foo
bar_new
foo1
bar1_new
$ etcdctl get --rev=4 foo foo9 # access the versions of keys at revision 4
$ etcdctl get --prefix --rev=4 foo # access the versions of keys at revision 4
foo
bar_new
foo1
bar1
$ etcdctl get --rev=3 foo foo9 # access the versions of keys at revision 3
$ etcdctl get --prefix --rev=3 foo # access the versions of keys at revision 3
foo
bar
foo1
bar1
$ etcdctl get --rev=2 foo foo9 # access the versions of keys at revision 2
$ etcdctl get --prefix --rev=2 foo # access the versions of keys at revision 2
foo
bar
$ etcdctl get --rev=1 foo foo9 # access the versions of keys at revision 1
$ etcdctl get --prefix --rev=1 foo # access the versions of keys at revision 1
```
## Read keys which are greater than or equal to the byte value of the specified key
@ -454,4 +471,5 @@ lease 694d5765fc71500b granted with TTL(500s), remaining(132s), attached keys([z
# if the lease has expired or does not exist it will give the below response:
Error: etcdserver: requested lease not found
```
```

View File

@ -0,0 +1,10 @@
# System limits
## Request size limit
etcd is designed to handle small key value pairs typical for metadata. Larger requests will work, but may increase the latency of other requests. For the time being, etcd guarantees to support RPC requests with up to 1MB of data. In the future, the size limit may be loosened or made it configurable.
## Storage size limit
The default storage size limit is 2GB, configurable with `--quota-backend-bytes` flag; supports up to 8GB.

View File

@ -45,7 +45,7 @@ To interact with the started cluster by using etcdctl:
# use API version 3
$ export ETCDCTL_API=3
$ etcdctl --write-out=table --endpoints=localhost:12379 member list
$ etcdctl --write-out=table --endpoints=localhost:2379 member list
+------------------+---------+--------+------------------------+------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+--------+------------------------+------------------------+

View File

@ -3,7 +3,7 @@
etcd uses the [capnslog][capnslog] library for logging application output categorized into *levels*. A log message's level is determined according to these conventions:
* Error: Data has been lost, a request has failed for a bad reason, or a required resource has been lost
* Examples:
* Examples:
* A failure to allocate disk space for WAL
* Warning: (Hopefully) Temporary conditions that may cause errors, but may work fine. A replica disappearing (that may reconnect) is a warning.
@ -26,4 +26,4 @@ etcd uses the [capnslog][capnslog] library for logging application output catego
* Send a normal message to a remote peer
* Write a log entry to disk
[capnslog]: [https://github.com/coreos/pkg/tree/master/capnslog]
[capnslog]: https://github.com/coreos/pkg/tree/master/capnslog

View File

@ -10,29 +10,44 @@ The easiest way to get etcd is to use one of the pre-built release binaries whic
## Build the latest version
For those wanting to try the very latest version, build etcd from the `master` branch.
[Go](https://golang.org/) version 1.6+ (with HTTP2 support) is required to build the latest version of etcd.
etcd vendors its dependency for official release binaries, while making vendoring optional to avoid import conflicts.
[`build` script][build-script] would automatically include the vendored dependencies from [`cmd`][cmd-directory] directory.
For those wanting to try the very latest version, build etcd from the `master` branch. [Go](https://golang.org/) version 1.7+ is required to build the latest version of etcd. To ensure etcd is built against well-tested libraries, etcd vendors its dependencies for official release binaries. However, etcd's vendoring is also optional to avoid potential import conflicts when embedding the etcd server or using the etcd client.
Here are the commands to build an etcd binary from the `master` branch:
First, confirm go 1.7+ is installed:
```
```sh
# go is required
$ go version
go version go1.6 darwin/amd64
go version go1.7.3 darwin/amd64
# GOPATH should be set correctly
$ echo $GOPATH
/Users/example/go
```
$ mkdir -p $GOPATH/src/github.com/coreos
$ cd $GOPATH/src/github.com/coreos
To build `etcd` from the `master` branch without a `GOPATH` using the official `build` script:
```sh
$ git clone https://github.com/coreos/etcd.git
$ cd etcd
$ ./build
$ ./bin/etcd
...
```
To build a vendored `etcd` from the `master` branch via `go get`:
```sh
# GOPATH should be set
$ echo $GOPATH
/Users/example/go
$ go get github.com/coreos/etcd/cmd/etcd
$ $GOPATH/bin/etcd
```
To build `etcd` from the `master` branch without vendoring (may not build due to upstream conflicts):
```sh
# GOPATH should be set
$ echo $GOPATH
/Users/example/go
$ go get github.com/coreos/etcd
$ $GOPATH/bin/etcd
```
## Test the installation

View File

@ -17,6 +17,7 @@ The easiest way to get started using etcd as a distributed key-value store is to
- [gRPC naming and discovery][grpc_naming]
- [Embedding etcd][embed_etcd]
- [Experimental features and APIs][experimental]
- [System limits][system-limit]
## Operating etcd clusters
@ -26,9 +27,10 @@ Administrators who need to create reliable and scalable key-value stores for the
- [Setting up etcd gateways][gateway]
- [Setting up etcd gRPC proxy (pre-alpha)][grpc_proxy]
- [Run etcd clusters inside containers][container]
- [Hardware recommendations][hardware]
- [Configuration][conf]
- [Security][security]
- Monitoring
- [Monitoring][monitoring]
- [Maintenance][maintenance]
- [Understand failures][failures]
- [Disaster recovery][recovery]
@ -40,7 +42,7 @@ Administrators who need to create reliable and scalable key-value stores for the
To learn more about the concepts and internals behind etcd, read the following pages:
- Why etcd (TODO)
- [Why etcd][why] (TODO)
- [Understand data model][data_model]
- [Understand APIs][understand_apis]
- [Glossary][glossary]
@ -50,13 +52,19 @@ To learn more about the concepts and internals behind etcd, read the following p
- [Migrate applications from using API v2 to API v3][v2_migration]
- [Updating v2.3 to v3.0][v3_upgrade]
- [Updating v3.0 to v3.1][v31_upgrade]
## Troubleshooting
## Frequently Asked Questions (FAQ)
Answers to [common questions] about etcd.
[api_ref]: dev-guide/api_reference_v3.md
[api_grpc_gateway]: dev-guide/api_grpc_gateway.md
[clustering]: op-guide/clustering.md
[conf]: op-guide/configuration.md
[system-limit]: dev-guide/limit.md
[common questions]: faq.md
[why]: learning/why.md
[data_model]: learning/data_model.md
[demo]: demo.md
[download_build]: dl_build.md
@ -66,12 +74,14 @@ To learn more about the concepts and internals behind etcd, read the following p
[gateway]: op-guide/gateway.md
[glossary]: learning/glossary.md
[grpc_proxy]: op-guide/grpc_proxy.md
[hardware]: op-guide/hardware.md
[interacting]: dev-guide/interacting_v3.md
[local_cluster]: dev-guide/local_cluster.md
[performance]: op-guide/performance.md
[recovery]: op-guide/recovery.md
[maintenance]: op-guide/maintenance.md
[security]: op-guide/security.md
[monitoring]: op-guide/monitoring.md
[v2_migration]: op-guide/v2-migration.md
[container]: op-guide/container.md
[understand_apis]: learning/api.md
@ -79,3 +89,4 @@ To learn more about the concepts and internals behind etcd, read the following p
[supported_platform]: op-guide/supported-platform.md
[experimental]: dev-guide/experimental_apis.md
[v3_upgrade]: upgrades/upgrade_3_0.md
[v31_upgrade]: upgrades/upgrade_3_1.md

128
Documentation/faq.md Normal file
View File

@ -0,0 +1,128 @@
## Frequently Asked Questions (FAQ)
### etcd, general
#### Do clients have to send requests to the etcd leader?
[Raft][raft] is leader-based; the leader handles all client requests which need cluster consensus. However, the client does not need to know which node is the leader. Any request that requires consensus sent to a follower is automatically forwarded to the leader. Requests that do not require consensus (e.g., serialized reads) can be processed by any cluster member.
### Configuration
#### What is the difference between advertise-urls and listen-urls?
`listen-urls` specifies the local addresses etcd server binds to for accepting incoming connections. To listen on a port for all interfaces, specify `0.0.0.0` as the listen IP address.
`advertise-urls` specifies the addresses etcd clients or other etcd members should use to contact the etcd server. The advertise addresses must be reachable from the remote machines. Do not advertise addresses like `localhost` or `0.0.0.0` for a production setup since these addresses are unreachable from remote machines.
### Deployment
#### System requirements
Since etcd writes data to disk, SSD is highly recommended. To prevent performance degradation or unintentionally overloading the key-value store, etcd enforces a 2GB default storage size quota, configurable up to 8GB. To avoid swapping or running out of memory, the machine should have at least as much RAM to cover the quota. At CoreOS, an etcd cluster is usually deployed on dedicated CoreOS Container Linux machines with dual-core processors, 2GB of RAM, and 80GB of SSD *at the very least*. **Note that performance is intrinsically workload dependent; please test before production deployment**. See [hardware][hardware-setup] for more recommendations.
Most stable production environment is Linux operating system with amd64 architecture; see [supported platform][supported-platform] for more.
#### Why an odd number of cluster members?
An etcd cluster needs a majority of nodes, a quorum, to agree on updates to the cluster state. For a cluster with n members, quorum is (n/2)+1. For any odd-sized cluster, adding one node will always increase the number of nodes necessary for quorum. Although adding a node to an odd-sized cluster appears better since there are more machines, the fault tolerance is worse since exactly the same number of nodes may fail without losing quorum but there are more nodes that can fail. If the cluster is in a state where it can't tolerate any more failures, adding a node before removing nodes is dangerous because if the new node fails to register with the cluster (e.g., the address is misconfigured), quorum will be permanently lost.
#### What is maximum cluster size?
Theoretically, there is no hard limit. However, an etcd cluster probably should have no more than seven nodes. [Google Chubby lock service][chubby], similar to etcd and widely deployed within Google for many years, suggests running five nodes. A 5-member etcd cluster can tolerate two member failures, which is enough in most cases. Although larger clusters provide better fault tolerance, the write performance suffers because data must be replicated across more machines.
#### What is failure tolerance?
An etcd cluster operates so long as a member quorum can be established. If quorum is lost through transient network failures (e.g., partitions), etcd automatically and safely resumes once the network recovers and restores quorum; Raft enforces cluster consistency. For power loss, etcd persists the Raft log to disk; etcd replays the log to the point of failure and resumes cluster participation. For permanent hardware failure, the node may be removed from the cluster through [runtime reconfiguration][runtime reconfiguration].
It is recommended to have an odd number of members in a cluster. An odd-size cluster tolerates the same number of failures as an even-size cluster but with fewer nodes. The difference can be seen by comparing even and odd sized clusters:
| Cluster Size | Majority | Failure Tolerance |
|:-:|:-:|:-:|
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | 2 |
| 6 | 4 | 2 |
| 7 | 4 | 3 |
| 8 | 5 | 3 |
| 9 | 5 | 4 |
Adding a member to bring the size of cluster up to an even number doesn't buy additional fault tolerance. Likewise, during a network partition, an odd number of members guarantees that there will always be a majority partition that can continue to operate and be the source of truth when the partition ends.
#### Does etcd work in cross-region or cross data center deployments?
Deploying etcd across regions improves etcd's fault tolerance since members are in separate failure domains. The cost is higher consensus request latency from crossing data center boundaries. Since etcd relies on a member quorum for consensus, the latency from crossing data centers will be somewhat pronounced because at least a majority of cluster members must respond to consensus requests. Additionally, cluster data must be replicated across all peers, so there will be bandwidth cost as well.
With longer latencies, the default etcd configuration may cause frequent elections or heartbeat timeouts. See [tuning] for adjusting timeouts for high latency deployments.
### Operation
#### How to backup a etcd cluster?
etcdctl provides a `snapshot` command to create backups. See [backup][backup] for more details.
#### Should I add a member before removing an unhealthy member?
When replacing an etcd node, it's important to remove the member first and then add its replacement.
etcd employs distributed consensus based on a quorum model; (n+1)/2 members, a majority, must agree on a proposal before it can be committed to the cluster. These proposals include key-value updates and membership changes. This model totally avoids any possibility of split brain inconsistency. The downside is permanent quorum loss is catastrophic.
How this applies to membership: If a 3-member cluster has 1 downed member, it can still make forward progress because the quorum is 2 and 2 members are still live. However, adding a new member to a 3-member cluster will increase the quorum to 3 because 3 votes are required for a majority of 4 members. Since the quorum increased, this extra member buys nothing in terms of fault tolerance; the cluster is still one node failure away from being unrecoverable.
Additionally, that new member is risky because it may turn out to be misconfigured or incapable of joining the cluster. In that case, there's no way to recover quorum because the cluster has two members down and two members up, but needs three votes to change membership to undo the botched membership addition. etcd will by default reject member add attempts that could take down the cluster in this manner.
On the other hand, if the downed member is removed from cluster membership first, the number of members becomes 2 and the quorum remains at 2. Following that removal by adding a new member will also keep the quorum steady at 2. So, even if the new node can't be brought up, it's still possible to remove the new member through quorum on the remaining live members.
#### Why won't etcd accept my membership changes?
etcd sets `strict-reconfig-check` in order to reject reconfiguration requests that would cause quorum loss. Abandoning quorum is really risky (especially when the cluster is already unhealthy). Although it may be tempting to disable quorum checking if there's quorum loss to add a new member, this could lead to full fledged cluster inconsistency. For many applications, this will make the problem even worse ("disk geometry corruption" being a candidate for most terrifying).
### Performance
#### How should I benchmark etcd?
Try the [benchmark] tool. Current [benchmark results][benchmark-result] are available for comparison.
#### What does the etcd warning "apply entries took too long" mean?
After a majority of etcd members agree to commit a request, each etcd server applies the request to its data store and persists the result to disk. Even with a slow mechanical disk or a virtualized network disk, such as Amazons EBS or Googles PD, applying a request should normally take fewer than 50 milliseconds. If the average apply duration exceeds 100 milliseconds, etcd will warn that entries are taking too long to apply.
Usually this issue is caused by a slow disk. The disk could be experiencing contention among etcd and other applications, or the disk is too simply slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [backend_commit_duration_seconds][backend_commit_metrics] (p99 duration should be less than 25ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using faster disk will typically solve the problem.
The second most common cause is CPU starvation. If monitoring of the machines CPU usage shows heavy utilization, there may not be enough compute capacity for etcd. Moving etcd to dedicated machine, increasing process resource isolation cgroups, or renicing the etcd server process into a higher priority can usually solve the problem.
Expensive user requests which access too many keys (e.g., fetching the entire keyspace) can also cause long apply latencies. Accessing fewer than a several hundred keys per request, however, should always be performant.
If none of the above suggestions clear the warnings, please [open an issue][new_issue] with detailed logging, monitoring, metrics and optionally workload information.
#### What does the etcd warning "failed to send out heartbeat on time" mean?
etcd uses a leader-based consensus protocol for consistent data replication and log execution. Cluster members elect a single leader, all other members become followers. The elected leader must periodically send heartbeats to its followers to maintain its leadership. Followers infer leader failure if no heartbeats are received within an election interval and trigger an election. If a leader doesnt send its heartbeats in time but is still running, the election is spurious and likely caused by insufficient resources. To catch these soft failures, if the leader skips two heartbeat intervals, etcd will warn it failed to send a heartbeat on time.
Usually this issue is caused by a slow disk. Before the leader sends heartbeats attached with metadata, it may need to persist the metadata to disk. The disk could be experiencing contention among etcd and other applications, or the disk is too simply slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [wal_fsync_duration_seconds][wal_fsync_duration_seconds] (p99 duration should be less than 10ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using faster disk will typically solve the problem.
The second most common cause is CPU starvation. If monitoring of the machines CPU usage shows heavy utilization, there may not be enough compute capacity for etcd. Moving etcd to dedicated machine, increasing process resource isolation with cgroups, or renicing the etcd server process into a higher priority can usually solve the problem.
A slow network can also cause this issue. If network metrics among the etcd machines shows long latencies or high drop rate, there may not be enough network capacity for etcd. Moving etcd members to a less congested network will typically solve the problem. However, if the etcd cluster is deployed across data centers, long latency between members is expected. For such deployments, tune the `heartbeat-interval` configuration to roughly match the round trip time between the machines, and the `election-timeout` configuration to be at least 5 * `heartbeat-interval`. See [tuning documentation][tuning] for detailed information.
If none of the above suggestions clear the warnings, please [open an issue][new_issue] with detailed logging, monitoring, metrics and optionally workload information.
#### What does the etcd warning "request ignored (cluster ID mismatch)" mean?
Every new etcd cluster generates a new cluster ID based on the initial cluster configuration and a user-provided unique `initial-cluster-token` value. By having unique cluster ID's, etcd is protected from cross-cluster interaction which could corrupt the cluster.
Usually this warning happens after tearing down an old cluster, then reusing some of the peer addresses for the new cluster. If any etcd process from the old cluster is still running it will try to contact the new cluster. The new cluster will recognize a cluster ID mismatch, then ignore the request and emit this warning. This warning is often cleared by ensuring peer addresses among distinct clusters are disjoint.
[hardware-setup]: ./op-guide/hardware.md
[supported-platform]: ./op-guide/supported-platform.md
[wal_fsync_duration_seconds]: ./metrics.md#disk
[tuning]: ./tuning.md
[new_issue]: https://github.com/coreos/etcd/issues/new
[backend_commit_metrics]: ./metrics.md#disk
[raft]: https://raft.github.io/raft.pdf
[backup]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/recovery.md#snapshotting-the-keyspace
[chubby]: http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf
[runtime reconfiguration]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/runtime-configuration.md
[benchmark]: https://github.com/coreos/etcd/tree/master/tools/benchmark
[benchmark-result]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/performance.md

View File

@ -0,0 +1,21 @@
# Why etcd
The name "etcd" originated from two ideas, the unix "/etc" folder and "d"istibuted systems. The "/etc" folder is a place to store configuration data for a single system whereas etcd stores configuration information for large scale distributed systems. Hence, a "d"istributed "/etc" is "etcd".
etcd stores metadata in a consistent and fault-tolerant way. Distributed systems use etcd as a consistent key-value store for configuration management, service discovery, and coordinating distributed work. Common distributed patterns using etcd include leader election, [distributed locks][etcd-concurrency], and monitoring machine liveness.
## Use cases
- Container Linux by CoreOS: Application running on [Container Linux][container-linux] gets automatic, zero-downtime Linux kernel updates. Container Linux uses [locksmith] to coordinate updates. locksmith implements a distributed semaphore over etcd to ensure only a subset of a cluster is rebooting at any given time.
- [Kubernetes][kubernetes] stores configuration data into etcd for service discovery and cluster management; etcd's consistency is crucial for correctly scheduling and operating services. The Kubernetes API server persists cluster state into etcd. It uses etcd's watch API to monitor the cluster and roll out critical configuration changes.
## Features and system comparisons
TODO
[etcd-concurrency]: https://godoc.org/github.com/coreos/etcd/clientv3/concurrency
[container-linux]: https://coreos.com/why
[locksmith]: https://github.com/coreos/locksmith
[kubernetes]: http://kubernetes.io/docs/whatisk8s

View File

@ -14,6 +14,7 @@
- [etcdtool](https://github.com/mickep76/etcdtool) - Export/Import/Edit etcd directory as JSON/YAML/TOML and Validate directory using JSON schema
- [etcd-rest](https://github.com/mickep76/etcd-rest) - Create generic REST API in Go using etcd as a backend with validation using JSON schema
- [etcdsh](https://github.com/kamilhark/etcdsh) - A command line client with support of command history and tab completion. Supports v2
- [etcdloadtest](https://github.com/sinsharat/etcdloadtest) - A command line load test client for etcd version 3.0 and above.
**Go libraries**
@ -34,9 +35,11 @@
**Scala libraries**
- [maciej/etcd-client](https://github.com/maciej/etcd-client) - Supports v2. Akka HTTP-based fully async client
- [eiipii/etcdhttpclient](https://bitbucket.org/eiipii/etcdhttpclient) - Supports v2. Async HTTP client based on Netty and Scala Futures.
**Python libraries**
- [kragniz/python-etcd3](https://github.com/kragniz/python-etcd3) - Work in progress client for v3
- [jplana/python-etcd](https://github.com/jplana/python-etcd) - Supports v2
- [russellhaering/txetcd](https://github.com/russellhaering/txetcd) - a Twisted Python library
- [cholcombe973/autodock](https://github.com/cholcombe973/autodock) - A docker deployment automation tool
@ -93,6 +96,10 @@
- [ropensci/etseed](https://github.com/ropensci/etseed)
**Nim libraries**
- [etcd_client](https://github.com/FedericoCeratto/nim-etcd-client)
**Tcl libraries**
- [efrecon/etcd-tcl](https://github.com/efrecon/etcd-tcl) - Supports v2, except wait.
@ -117,7 +124,9 @@
**Projects using etcd**
- [binocarlos/yoda](https://github.com/binocarlos/yoda) - etcd + ZeroMQ
- [blox/blox](https://github.com/blox/blox) - a collection of open source projects for container management and orchestration with AWS ECS
- [calavera/active-proxy](https://github.com/calavera/active-proxy) - HTTP Proxy configured with etcd
- [chain/chain](https://github.com/chain/chain) - software designed to operate and connect to highly scalable permissioned blockchain networks
- [derekchiang/etcdplus](https://github.com/derekchiang/etcdplus) - A set of distributed synchronization primitives built upon etcd
- [go-discover](https://github.com/flynn/go-discover) - service discovery in Go
- [gleicon/goreman](https://github.com/gleicon/goreman/tree/etcd) - Branch of the Go Foreman clone with etcd support

View File

@ -82,30 +82,7 @@ All these metrics are prefixed with `etcd_network_`
### gRPC requests
These metrics describe the requests served by a specific etcd member: total received requests, total failed requests, and processing latency. They are useful for tracking user-generated traffic hitting the etcd cluster.
All these metrics are prefixed with `etcd_grpc_`
| Name | Description | Type |
|--------------------------------|-------------------------------------------------------------------------------------|------------------------|
| requests_total | Total number of received requests | Counter(method) |
| requests_failed_total | Total number of failed requests.   | Counter(method,error) |
| unary_requests_duration_seconds | Bucketed handling duration of the requests. | Histogram(method) |
Example Prometheus queries that may be useful from these metrics (across all etcd members):
* `sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[1m]) by (grpc_method) / sum(rate(etcd_grpc_total{job="etcd"})[1m]) by (grpc_method)`
Shows the fraction of events that failed by gRPC method across all members, across a time window of `1m`.
* `sum(rate(etcd_grpc_requests_total{job="etcd",grpc_method="PUT"})[1m]) by (grpc_method)`
Shows the rate of PUT requests across all members, across a time window of `1m`.
* `histogram_quantile(0.9, sum(rate(etcd_grpc_unary_requests_duration_seconds{job="etcd",grpc_method="PUT"}[5m]) ) by (le))`
Show the 0.90-tile latency (in seconds) of PUT request handling across all members, with a window of `5m`.
These metrics are exposed via [go-grpc-prometheus][go-grpc-prometheus].
## etcd_debugging namespace metrics
@ -136,3 +113,4 @@ Heavy file descriptor (`process_open_fds`) usage (i.e., near the process's file
[prometheus-getting-started]: http://prometheus.io/docs/introduction/getting_started/
[prometheus-naming]: http://prometheus.io/docs/practices/naming/
[v2-http-metrics]: v2/metrics.md#http-requests
[go-grpc-prometheus]: https://github.com/grpc-ecosystem/go-grpc-prometheus

View File

@ -83,7 +83,7 @@ A cluster using self-signed certificates both encrypts traffic and authenticates
On each machine, etcd would be started with these flags:
```
$ etcd --name infra0 --initial-advertise-peer-urls http://10.0.1.10:2380 \
$ etcd --name infra0 --initial-advertise-peer-urls https://10.0.1.10:2380 \
--listen-peer-urls https://10.0.1.10:2380 \
--listen-client-urls https://10.0.1.10:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://10.0.1.10:2379 \
@ -126,7 +126,7 @@ $ etcd --name infra2 --initial-advertise-peer-urls https://10.0.1.12:2380 \
If the cluster needs encrypted communication but does not require authenticated connections, etcd can be configured to automatically generate its keys. On initialization, each member creates its own set of keys based on its advertised IP addresses and hosts.
On each machine, etcd would be started with these flag:
On each machine, etcd would be started with these flags:
```
$ etcd --name infra0 --initial-advertise-peer-urls https://10.0.1.10:2380 \
@ -205,7 +205,7 @@ exit 1
## Discovery
In a number of cases, the IPs of the cluster peers may not be known ahead of time. This is common when utilizing cloud providers or when the network uses DHCP. In these cases, rather than specifying a static configuration, use an existing etcd cluster to bootstrap a new one. We call this process "discovery".
In a number of cases, the IPs of the cluster peers may not be known ahead of time. This is common when utilizing cloud providers or when the network uses DHCP. In these cases, rather than specifying a static configuration, use an existing etcd cluster to bootstrap a new one. This process is called "discovery".
There two methods that can be used for discovery:
@ -214,17 +214,17 @@ There two methods that can be used for discovery:
### etcd discovery
To better understand the design about discovery service protocol, we suggest reading the discovery service protocol [documentation][discovery-proto].
To better understand the design of the discovery service protocol, we suggest reading the discovery service protocol [documentation][discovery-proto].
#### Lifetime of a discovery URL
A discovery URL identifies a unique etcd cluster. Instead of reusing a discovery URL, always create discovery URLs for new clusters.
A discovery URL identifies a unique etcd cluster. Instead of reusing an existing discovery URL, each etcd instance shares a new discovery URL to bootstrap the new cluster.
Moreover, discovery URLs should ONLY be used for the initial bootstrapping of a cluster. To change cluster membership after the cluster is already running, see the [runtime reconfiguration][runtime-conf] guide.
#### Custom etcd discovery service
Discovery uses an existing cluster to bootstrap itself. If using a private etcd cluster, can create a URL like so:
Discovery uses an existing cluster to bootstrap itself. If using a private etcd cluster, create a URL like so:
```
$ curl -X PUT https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/_config/size -d value=3
@ -271,7 +271,7 @@ $ curl https://discovery.etcd.io/new?size=3
https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
This will create the cluster with an initial expected size of 3 members. If no size is specified, a default of 3 is used.
This will create the cluster with an initial size of 3 members. If no size is specified, a default of 3 is used.
```
ETCD_DISCOVERY=https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
@ -281,7 +281,7 @@ ETCD_DISCOVERY=https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573d
--discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
**Each member must have a different name flag specified. `Hostname` or `machine-id` can be a good choice. Or discovery will fail due to duplicated name.**
**Each member must have a different name flag specified or else discovery will fail due to duplicated names. `Hostname` or `machine-id` can be a good choice. **
Now we start etcd with those relevant flags for each member:
@ -475,5 +475,5 @@ To setup an etcd cluster with proxies of v2 API, please read the the [clustering
[proxy]: https://github.com/coreos/etcd/blob/release-2.3/Documentation/proxy.md
[clustering_etcd2]: https://github.com/coreos/etcd/blob/release-2.3/Documentation/clustering.md
[security-guide]: security.md
[tls-setup]: /hack/tls-setup
[tls-setup]: ../../hack/tls-setup
[gateway]: gateway.md

View File

@ -247,7 +247,7 @@ The security flags help to [build a secure etcd cluster][security].
+ env variable: ETCD_DEBUG
### --log-package-levels
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ default: none (INFO for all packages)
+ env variable: ETCD_LOG_PACKAGE_LEVELS
@ -279,10 +279,14 @@ Follow the instructions when using these flags.
+ Enable runtime profiling data via HTTP server. Address is at client URL + "/debug/pprof/"
+ default: false
### --metrics
+ Set level of detail for exported metrics, specify 'extensive' to include histogram metrics.
+ default: basic
[build-cluster]: clustering.md#static
[reconfig]: runtime-configuration.md
[discovery]: clustering.md#discovery
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
[proxy]: ../v2/proxy.md
[restore]: ../v2/admin_guide.md#restoring-a-backup
[security]: security.md

View File

@ -57,7 +57,7 @@ sudo rkt run --net=default:IP=${NODE3} coreos.com/etcd:v3.0.6 -- -name=node3 -ad
Verify the cluster is healthy and can be reached.
```
ETCDCTL_API=3 etcdctl --endpoints=http://172.16.28.21:2379,http://172.16.28.22:2379,http://172.16.28.23:2379 endpoint-health
ETCDCTL_API=3 etcdctl --endpoints=http://172.16.28.21:2379,http://172.16.28.22:2379,http://172.16.28.23:2379 endpoint health
```
### DNS

Binary file not shown.

After

Width:  |  Height:  |  Size: 96 KiB

File diff suppressed because it is too large Load Diff

View File

@ -1,6 +1,6 @@
# gRPC proxy
*This is a pre-alpha feature, we are looking for early feedback.*
*This is an alpha feature, we are looking for early feedback.*
The gRPC proxy is a stateless etcd reverse proxy operating at the gRPC layer (L7). The proxy is designed to reduce the total processing load on the core etcd cluster. For horizontal scalability, it coalesces watch and lease API requests. To protect the cluster against abusive clients, it caches key range requests.
@ -36,9 +36,9 @@ watch key A ^ ^ watch key A |
To effectively coalesce multiple client watchers into a single watcher, the gRPC proxy coalesces new `c-watchers` into an existing `s-watcher` when possible. This coalesced `s-watcher` may be out of sync with the etcd server due to network delays or buffered undelivered events. When the watch revision is unspecified, the gRPC proxy will not guarantee the `c-watcher` will start watching from the most recent store revision. For example, if a client watches from an etcd server with revision 1000, that watcher will begin at revision 1000. If a client watches from the gRPC proxy, may begin watching from revision 990.
Similar limitations apply to cancellation. When the watcher is cancelled, the etcd servers revision may be greater than the cancellation response revision.
Similar limitations apply to cancellation. When the watcher is cancelled, the etcd servers revision may be greater than the cancellation response revision.
These two limitations should not cause problems for most use cases. In the future, there may be additional options to force the watcher to bypass the gRPC proxy for more accurate revision responses.
These two limitations should not cause problems for most use cases. In the future, there may be additional options to force the watcher to bypass the gRPC proxy for more accurate revision responses.
## Scalable lease API
@ -47,3 +47,32 @@ TODO
## Abusive clients protection
The gRPC proxy caches responses for requests when it does not break consistency requirements. This can protect the etcd server from abusive clients in tight for loops.
## Start etcd gRPC proxy
Consider an etcd cluster with the following static endpoints:
|Name|Address|Hostname|
|------|---------|------------------|
|infra0|10.0.1.10|infra0.example.com|
|infra1|10.0.1.11|infra1.example.com|
|infra2|10.0.1.12|infra2.example.com|
Start the etcd gRPC proxy to use these static endpoints with the command:
```bash
$ etcd grpc-proxy start --endpoints=infra0.example.com,infra1.example.com,infra2.example.com --listen-addr=127.0.0.1:2379
```
The etcd gRPC proxy starts and listens on port 8080. It forwards client requests to one of the three endpoints provided above.
Sending requests through the proxy:
```bash
$ ETCDCTL_API=3 ./etcdctl --endpoints=127.0.0.1:2379 put foo bar
OK
$ ETCDCTL_API=3 ./etcdctl --endpoints=127.0.0.1:2379 get foo
foo
bar
```

View File

@ -0,0 +1,93 @@
# Hardware recommendations
etcd usually runs well with limited resources for development or testing purposes; its common to develop with etcd on a laptop or a cheap cloud machine. However, when running etcd clusters in production, some hardware guidelines are useful for proper administration. These suggestions are not hard rules; they serve as a good starting point for a robust production deployment. As always, deployments should be tested with simulated workloads before running in production.
## CPUs
Few etcd deployments require a lot of CPU capacity. Typical clusters need two to four cores to run smoothly.
Heavily loaded etcd deployments, serving thousands of clients or tens of thousands of requests per second, tend to be CPU bound since etcd can serve requests from memory. Such heavy deployments usually need eight to sixteen dedicated cores.
## Memory
etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 8GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly.
## Disks
Fast disks are the most critical factor for etcd deployment performance and stability.
A slow disk will increase etcd request latency and potentially hurt cluster stability. Since etcds consensus protocol depends on persistently storing metadata to a log, a majority of etcd cluster members must write every request down to disk. Additionally, etcd will also incrementally checkpoint its state to disk so it can truncate this log. If these writes take too long, heartbeats may time out and trigger an election, undermining the stability of the cluster.
etcd is very sensitive to disk write latency. Typically 50 sequential IOPS (e.g., a 7200 RPM disk) is required. For heavily loaded clusters, 500 sequential IOPS (e.g., a typical local SSD or a high performance virtualized block device) is recommended. Note that most cloud providers publish concurrent IOPS rather than sequential IOPS; the published concurrent IOPS can be 10x greater than the sequential IOPS. To measure actual sequential IOPS, we suggest using a disk benchmarking tool such as [diskbench][diskbench] or [fio][fio].
etcd requires only modest disk bandwidth but more disk bandwidth buys faster recovery times when a failed member has to catch up with the cluster. Typically 10MB/s will recover 100MB data within 15 seconds. For large clusters, 100MB/s or higher is suggested for recovering 1GB data within 15 seconds.
When possible, back etcds storage with a SSD. A SSD usually provides lower write latencies and with less variance than a spinning disk, thus improving the stability and reliability of etcd. If using spinning disk, get the fastest disks possible (15,000 RPM). Using RAID 0 is also an effective way to increase disk speed, for both spinning disks and SSD. With at least three cluster members, mirroring and/or parity variants of RAID are unnecessary; etcd's consistent replication already gets high availability.
## Network
Multi-member etcd deployments benefit from a fast and reliable network. In order for etcd to be both consistent and partition tolerant, an unreliable network with partitioning outages will lead to poor availability. Low latency ensures etcd members can communicate fast. High bandwidth can reduce the time to recover a failed etcd member. 1GbE is sufficient for common etcd deployments. For large etcd clusters, a 10GbE network will reduce mean time to recovery.
Deploy etcd members within a single data center when possible to avoid latency overheads and lessen the possibility of partitioning events. If a failure domain in another data center is required, choose a data center closer to the existing one. Please also read the [tuning][tuning] documentation for more information on cross data center deployment.
## Example hardware configurations
Here are a few example hardware setups on AWS and GCE environments. As mentioned before, but must be stressed regardless, administrators should test an etcd deployment with a simulated workload before putting it into production.
Note that these configurations assume these machines are totally dedicated to etcd. Running other applications along with etcd on these machines may cause resource contentions and lead to cluster instability.
### Small cluster
A small cluster serves fewer than 100 clients, fewer than 200 of requests per second, and stores no more than 100MB of data.
Example application workload: A 50-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|--------|------|----------------|
| AWS | m4.large | 2 | 8 | 3600 | 56.25 |
| GCE | n1-standard-1 + 50GB PD SSD | 2 | 7.5 | 1500 | 25 |
### Medium cluster
A medium cluster serves fewer than 500 clients, fewer than 1,000 of requests per second, and stores no more than 500MB of data.
Example application workload: A 250-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|--------|------|----------------|
| AWS | m4.xlarge | 4 | 16 | 6000 | 93.75 |
| GCE | n1-standard-4 + 150GB PD SSD | 4 | 15 | 4500 | 75 |
### Large cluster
A large cluster serves fewer than 1,500 clients, fewer than 10,000 of requests per second, and stores no more than 1GB of data.
Example application workload: A 1,000-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|--------|------|----------------|
| AWS | m4.2xlarge | 8 | 32 | 8000 | 125 |
| GCE | n1-standard-8 + 250GB PD SSD | 8 | 30 | 7500 | 125 |
### xLarge cluster
An xLarge cluster serves more than 1,500 clients, more than 10,000 of requests per second, and stores more than 1GB data.
Example application workload: A 3,000 node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|--------|------|----------------|
| AWS | m4.4xlarge | 16 | 64 | 16,000 | 250 |
| GCE | n1-standard-16 + 500GB PD SSD | 16 | 60 | 15,000 | 250 |
[diskbench]: https://github.com/ongardie/diskbenchmark
[fio]: https://github.com/axboe/fio
[tuning]: ../tuning.md

View File

@ -0,0 +1,82 @@
# Monitoring etcd
Each etcd server exports metrics under the `/metrics` path on its client port.
The metrics can be fetched with `curl`:
```sh
$ curl -L http://localhost:2379/metrics
# HELP etcd_debugging_mvcc_keys_total Total number of keys.
# TYPE etcd_debugging_mvcc_keys_total gauge
etcd_debugging_mvcc_keys_total 0
# HELP etcd_debugging_mvcc_pending_events_total Total number of pending events to be sent.
# TYPE etcd_debugging_mvcc_pending_events_total gauge
etcd_debugging_mvcc_pending_events_total 0
...
```
## Prometheus
Running a [Prometheus][prometheus] monitoring service is the easiest way to ingest and record etcd's metrics.
First, install Prometheus:
```sh
PROMETHEUS_VERSION="1.3.1"
wget https://github.com/prometheus/prometheus/releases/download/v$PROMETHEUS_VERSION/prometheus-$PROMETHEUS_VERSION.linux-amd64.tar.gz -O /tmp/prometheus-$PROMETHEUS_VERSION.linux-amd64.tar.gz
tar -xvzf /tmp/prometheus-$PROMETHEUS_VERSION.linux-amd64.tar.gz --directory /tmp/ --strip-components=1
/tmp/prometheus -version
```
Set Prometheus's scraper to target the etcd cluster endpoints:
```sh
cat > /tmp/test-etcd.yaml <<EOF
global:
scrape_interval: 10s
scrape_configs:
- job_name: test-etcd
static_configs:
- targets: ['10.240.0.32:2379','10.240.0.33:2379','10.240.0.34:2379']
EOF
cat /tmp/test-etcd.yaml
```
Set up the Prometheus handler:
```sh
nohup /tmp/prometheus \
-config.file /tmp/test-etcd.yaml \
-web.listen-address ":9090" \
-storage.local.path "test-etcd.data" >> /tmp/test-etcd.log 2>&1 &
```
Now Prometheus will scrape etcd metrics every 10 seconds.
## Grafana
[Grafana][grafana] has built-in Prometheus support; just add a Prometheus data source:
```
Name: test-etcd
Type: Prometheus
Url: http://localhost:9090
Access: proxy
```
Then import the default [etcd dashboard template][template] and customize. For instance, if Prometheus data source name is `my-etcd`, the `datasource` field values in JSON also need to be `my-etcd`.
See the [demo][demo].
Sample dashboard:
![](./etcd-sample-grafana.png)
[prometheus]: https://prometheus.io/
[grafana]: http://grafana.org/
[template]: ./grafana.json
[demo]: http://dash.etcd.io/dashboard/db/test-etcd

View File

@ -11,7 +11,7 @@ To recover from disastrous failure, etcd v3 provides snapshot and restore facili
Recovering a cluster first needs a snapshot of the keyspace from an etcd member. A snapshot may either be taken from a live member with the `etcdctl snapshot save` command or by copying the `member/snap/db` file from an etcd data directory. For example, the following command snapshots the keyspace served by `$ENDPOINT` to the file `snapshot.db`:
```sh
$ etcdctl --endpoints $ENDPOINT snapshot save snapshot.db
$ ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save snapshot.db
```
### Restoring a cluster
@ -23,19 +23,19 @@ Snapshot integrity may be optionally verified at restore time. If the snapshot i
A restore initializes a new member of a new cluster, with a fresh cluster configuration using `etcd`'s cluster configuration flags, but preserves the contents of the etcd keyspace. Continuing from the previous example, the following creates new etcd data directories (`m1.etcd`, `m2.etcd`, `m3.etcd`) for a three member cluster:
```sh
$ etcdctl snapshot restore snapshot.db \
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
--name m1 \
--initial-cluster m1=http:/host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-advertise-peer-urls http://host1:2380
$ etcdctl snapshot restore snapshot.db \
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
--name m2 \
--initial-cluster m1=http:/host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-advertise-peer-urls http://host2:2380
$ etcdctl snapshot restore snapshot.db \
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
--name m3 \
--initial-cluster m1=http:/host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-advertise-peer-urls http://host3:2380
```

View File

@ -169,7 +169,7 @@ As described in the above, the best practice of adding new members is to configu
For avoiding this problem, etcd provides an option `-strict-reconfig-check`. If this option is passed to etcd, etcd rejects reconfiguration requests if the number of started members will be less than a quorum of the reconfigured cluster.
It is recommended to enable this option. However, it is disabled by default because of keeping compatibility.
It is enabled by default.
[add member]: #add-a-new-member
[cluster-reconf]: #cluster-reconfiguration-operations

View File

@ -219,6 +219,6 @@ Make sure to sign the certificates with a Subject Name the member's public IP ad
The certificate needs to be signed for the member's FQDN in its Subject Name, use Subject Alternative Names (short IP SANs) to add the IP address. The `etcd-ca` tool provides `--domain=` option for its `new-cert` command, and openssl can make [it][alt-name] too.
[cfssl]: https://github.com/cloudflare/cfssl
[tls-setup]: /hack/tls-setup
[tls-setup]: ../../hack/tls-setup
[tls-guide]: https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md
[alt-name]: http://wiki.cacert.org/FAQ/subjectAltName

View File

@ -50,7 +50,7 @@ Radius Intelligence uses Kubernetes running CoreOS to containerize and scale int
## Vonage
- *Application*: system configuration for microservices, scheduling, locks (future - service discovery)
- *Application*: system configuration for microservices, scheduling, locks (future - service discovery)
- *Launched*: August 2015
- *Cluster Size*: 2 clusters of 5 members in 2 DCs, n local proxies 1-to-1 with microservice, (ssl and SRV look up)
- *Order of Data Size*: kilobytes
@ -60,3 +60,148 @@ Radius Intelligence uses Kubernetes running CoreOS to containerize and scale int
[teamcity]: https://www.jetbrains.com/teamcity/
[raoofm]:https://github.com/raoofm
## Qiniu Cloud
- *Application*: system configuration for microservices, distributed locks
- *Launched*: Jan. 2016
- *Cluster Size*: 3 members each with several clusters
- *Order of Data Size*: kilobytes
- *Operator*: Pandora, chenchao@qiniu.com
- *Environment*: Baremetal
- *Backups*: None, all data can be recreated if necessary
## QingCloud
- *Application*: [QingCloud][qingcloud] appcenter cluster for service discovery as [metad][metad] backend.
- *Launched*: December 2016
- *Cluster Size*: 1 cluster of 3 members per user.
- *Order of Data Size*: kilobytes
- *Operator*: [yunify][yunify]
- *Environment*: QingCloud IaaS
- *Backups*: None, all data can be recreated if necessary.
[metad]:https://github.com/yunify/metad
[yunify]:https://github.com/yunify
[qingcloud]:https://qingcloud.com/
## Yandex
- *Application*: system configuration for services, service discovery
- *Launched*: March 2016
- *Cluster Size*: 3 clusters of 5 members
- *Order of Data Size*: several gigabytes
- *Operator*: Yandex; [nekto0n][nekto0n]
- *Environment*: Bare Metal
- *Backups*: None
[nekto0n]:https://github.com/nekto0n
## Tencent Games
- *Application*: Meta data and configuration data for service discovery, Kubernetes, etc.
- *Launched*: Jan. 2015
- *Cluster Size*: 3 members each with 10s of clusters
- *Order of Data Size*: 10s of Megabytes
- *Operator*: Tencent Game Operations Department
- *Environment*: Baremetal
- *Backups*: Periodic sync to backup server
In Tencent games, we use Docker and Kubernetes to deploy and run our applications, and use etcd to save meta data for service discovery, Kubernetes, etc.
## Hyper.sh
- *Application*: Kubernetes, distributed locks, etc.
- *Launched*: April 2016
- *Cluster Size*: 1 cluster of 3 members
- *Order of Data Size*: 10s of MB
- *Operator*: Hyper.sh
- *Environment*: Baremetal
- *Backups*: None, all data can be recreated if necessary.
In [hyper.sh][hyper.sh], the container service is backed by [hypernetes][hypernetes], a multi-tenant kubernetes distro. Moreover, we use etcd to coordinate the multiple manage services and store global meta data.
[hypernetes]:https://github.com/hyperhq/hypernetes
[Hyper.sh]:https://www.hyper.sh
## Meitu
- *Application*: system configuration for services, service discovery, kubernetes in test environment
- *Launched*: October 2015
- *Cluster Size*: 1 cluster of 3 members
- *Order of Data Size*: megabytes
- *Operator*: Meitu, hxj@meitu.com, [shafreeck][shafreeck]
- *Environment*: Bare Metal
- *Backups*: None, all data can be recreated if necessary.
[shafreeck]:https://github.com/shafreeck
## Grab
- *Application*: system configuration for services, service discovery
- *Launched*: June 2016
- *Cluster Size*: 1 cluster of 7 members
- *Order of Data Size*: megabytes
- *Operator*: Grab, [taxitan][taxitan], [reterVision][reterVision]
- *Environment*: AWS
- *Backups*: None, all data can be recreated if necessary.
[taxitan]:https://github.com/taxitan
[reterVision]:https://github.com/reterVision
## DaoCloud.io
- *Application*: container management
- *Launched*: Sep. 2015
- *Cluster Size*: 1000+ deployments, each deployment contains a 3 node cluster.
- *Order of Data Size*: 100s of Megabytes
- *Operator*: daocloud.io
- *Environment*: Baremetal and virtual machines
- *Backups*: None, all data can be recreated if necessary.
In [DaoCloud][DaoCloud], we use Docker and Swarm to deploy and run our applications, and we use etcd to save metadata for service discovery.
[DaoCloud]:https://www.daocloud.io
## Branch.io
- *Application*: Kubernetes
- *Launched*: April 2016
- *Cluster Size*: Multiple clusters, multiple sizes
- *Order of Data Size*: 100s of Megabytes
- *Operator*: branch.io
- *Environment*: AWS, Kubernetes
- *Backups*: EBS volume backups
At [Branch][branch], we use kubernetes heavily as our core microservice platform for staging and production.
[branch]: https://branch.io
## Baidu Waimai
- *Application*: SkyDNS, Kubernetes, UDC, CMDB and other distributed systems
- *Launched*: April. 2016
- *Cluster Size*: 3 clusters of 5 members
- *Order of Data Size*: several gigabytes
- *Operator*: Baidu Waimai Operations Department
- *Environment*: CentOS 6.5
- *Backups*: backup scripts
## Salesforce.com
- *Application*: Kubernetes
- *Launched*: Jan 2017
- *Cluster Size*: Multiple clusters of 3 members
- *Order of Data Size*: 100s of Megabytes
- *Operator*: Salesforce.com (krmayankk@github)
- *Environment*: BareMetal
- *Backups*: None, all data can be recreated
## Hosted Graphite
- *Application*: Service discovery, locking, ephemeral application data
- *Launched*: January 2017
- *Cluster Size*: 2 clusters of 7 members
- *Order of Data Size*: Megabytes
- *Operator*: Hosted Graphite (sre@hostedgraphite.com)
- *Environment*: Bare Metal
- *Backups*: None, all data is considered ephemeral.

View File

@ -1,6 +1,6 @@
# Reporting bugs
If any part of the etcd project has bugs or documentation mistakes, please let us know by [opening an issue][issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
If any part of the etcd project has bugs or documentation mistakes, please let us know by [opening an issue][etcd-issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
To make the bug report accurate and easy to understand, please try to create bug reports that are:

View File

@ -6,27 +6,29 @@ In the general case, upgrading from etcd 2.3 to 3.0 can be a zero-downtime, roll
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade Checklists
### Upgrade checklists
#### Upgrade Requirements
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
To upgrade an existing etcd deployment to 3.0, the running cluster must be 2.3 or greater. If it's before 2.3, please upgrade to [2.3](https://github.com/coreos/etcd/releases/tag/v2.3.0) before upgrading to 3.0.
#### Upgrade requirements
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. You can check the health of the cluster by using the `etcdctl cluster-health` command.
To upgrade an existing etcd deployment to 3.0, the running cluster must be 2.3 or greater. If it's before 2.3, please upgrade to [2.3](https://github.com/coreos/etcd/releases/tag/v2.3.8) before upgrading to 3.0.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl cluster-health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version.
Before beginning, [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version.
#### Mixed Versions
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.0. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
It might take up to 2 minutes for the newly upgraded member to catch up with the existing cluster when the total data size is larger than 50MB. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
It might take up to 2 minutes for the newly upgraded member to catch up with the existing cluster when the total data size is larger than 50MB. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and well be happy to provide advice on the procedure.
@ -36,13 +38,13 @@ If all members have been upgraded to v3.0, the cluster will be upgraded to v3.0,
Please [backup the data directory](../v2/admin_guide.md#backing-up-the-datastore) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade Procedure
### Upgrade procedure
This example details the upgrade of a three-member v2.3 ectd cluster running on a local machine.
This example details the upgrade of a three-member v2.3 ectd cluster running on a local machine.
#### 1. Check upgrade requirements.
Is the the cluster healthy and running v.2.3.x?
Is the cluster healthy and running v.2.3.x?
```
$ etcdctl cluster-health
@ -52,7 +54,7 @@ member 8211f1d0f64f3269 is healthy: got healthy result from http://localhost:123
cluster is healthy
$ curl http://localhost:2379/version
{"etcdserver":"2.3.x","etcdcluster":"2.3.0"}
{"etcdserver":"2.3.x","etcdcluster":"2.3.8"}
```
#### 2. Stop the existing etcd process
@ -64,7 +66,7 @@ When each etcd process is stopped, expected errors will be logged by other clust
2016-06-27 15:21:48.624175 I | rafthttp: the connection with 8211f1d0f64f3269 became inactive
```
Its a good idea at this point to [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore) to provide a downgrade path should any problems occur:
Its a good idea at this point to [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore) to provide a downgrade path should any problems occur:
```
$ etcdctl backup \
@ -102,7 +104,7 @@ Upgraded members will log warnings like the following until the entire cluster i
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.0 successfully:
When all members are upgraded, the cluster will report upgrading to 3.0 successfully:
```
2016-06-27 15:22:19.873751 N | membership: updated the cluster version from 2.3 to 3.0
@ -116,4 +118,14 @@ $ ETCDCTL_API=3 etcdctl endpoint health
127.0.0.1:22379 is healthy: successfully committed proposal: took = 18.513301ms
```
## Further considerations
- etcdctl environment variables have been updated. If `ETCDCTL_API=2 etcdctl cluster-health` works properly but `ETCDCTL_API=3 etcdctl endpoints health` responds with `Error: grpc: timed out when dialing`, be sure to use the [new variable names](https://github.com/coreos/etcd/tree/master/etcdctl#etcdctl).
## Known Issues
- etcd &lt; v3.1 does not work properly if built with Go &gt; v1.7. See [Issue 6951](https://github.com/coreos/etcd/issues/6951) for additional information.
- If an error such as `transport: http2Client.notifyError got notified that the client transport was broken unexpected EOF.` shows up in the etcd server logs, be sure etcd is a pre-built release or built with (etcd v3.1+ &amp; go v1.7+) or (etcd &lt;v3.1 &amp; go v1.6.x).
- Adding a v3 node to v2.3 cluster during upgrades is not supported and could trigger panics. See [Issue 7249](https://github.com/coreos/etcd/issues/7429) for additional information. Mixed versions of etcd members are only allowed during v3 migration. Finish upgrades before making any membership changes.
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,134 @@
## Upgrade etcd from 3.0 to 3.1
In the general case, upgrading from etcd 3.0 to 3.1 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.0 processes and replace them with etcd v3.1 processes
- after running all v3.1 processes, new features in v3.1 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
#### Monitoring
Following metrics from v3.0.x have been deprecated in favor of [go-grpc-prometheus](https://github.com/grpc-ecosystem/go-grpc-prometheus):
- `etcd_grpc_requests_total`
- `etcd_grpc_requests_failed_total`
- `etcd_grpc_active_streams`
- `etcd_grpc_unary_requests_duration_seconds`
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.1, the running cluster must be 3.0 or greater. If it's before 3.0, please [upgrade to 3.0](upgrade_3_0.md) before upgrading to 3.1.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.1. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.1, the cluster will be upgraded to v3.1, and downgrade from this completed state is **not possible**. If any single member is still v3.0, however, the cluster and its operations remains "v3.0", and it is possible from this mixed cluster state to return to using a v3.0 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.0 ectd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.0.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.0.16","etcdcluster":"3.0.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
2017-01-17 09:34:18.352662 I | raft: raft.node: 1640829d9eea5cfb elected leader 1640829d9eea5cfb at term 5
2017-01-17 09:34:18.359630 W | etcdserver: failed to reach the peerURL(http://localhost:2380) of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
2017-01-17 09:34:18.359679 W | etcdserver: cannot get the version of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
2017-01-17 09:34:18.548116 W | rafthttp: lost the TCP streaming connection with peer fd32987dcd0511e0 (stream Message writer)
2017-01-17 09:34:19.147816 W | rafthttp: lost the TCP streaming connection with peer fd32987dcd0511e0 (stream MsgApp v2 writer)
2017-01-17 09:34:34.364907 W | etcdserver: failed to reach the peerURL(http://localhost:2380) of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.1 binary and start the new etcd process
The new v3.1 etcd will publish its information to the cluster:
```
2017-01-17 09:36:00.996590 I | etcdserver: published {Name:my-etcd-1 ClientURLs:[http://localhost:2379]} to cluster 46bc3ce73049e678
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.1 etcd binary:
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321671ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.1:
```
2017-01-17 09:36:38.406268 W | etcdserver: the local etcd version 3.0.16 is not up-to-date
2017-01-17 09:36:38.406295 W | etcdserver: member fd32987dcd0511e0 has a higher version 3.1.0
2017-01-17 09:36:42.407695 W | etcdserver: the local etcd version 3.0.16 is not up-to-date
2017-01-17 09:36:42.407730 W | etcdserver: member fd32987dcd0511e0 has a higher version 3.1.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.1 successfully:
```
2017-01-17 09:37:03.100015 I | etcdserver: updating the cluster version from 3.0 to 3.1
2017-01-17 09:37:03.104263 N | etcdserver/membership: updated the cluster version from 3.0 to 3.1
2017-01-17 09:37:03.104374 I | etcdserver/api: enabled capabilities for version 3.1
```
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.516902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,338 @@
## Upgrade etcd from 3.1 to 3.2
In the general case, upgrading from etcd 3.1 to 3.2 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.1 processes and replace them with etcd v3.2 processes
- after running all v3.2 processes, new features in v3.2 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
Highlighted breaking changes in 3.2.
#### Change in default `snapshot-count` value
The default value of `--snapshot-count` has [changed from from 10,000 to 100,000](https://github.com/coreos/etcd/pull/7160). Higher snapshot count means it holds Raft entries in memory for longer before discarding old entries. It is a trade-off between less frequent snapshotting and [higher memory usage](https://github.com/kubernetes/kubernetes/issues/60589#issuecomment-371977156). Higher `--snapshot-count` will be manifested with higher memory usage, while retaining more Raft entries helps with the availabilities of slow followers: leader is still able to replicate its logs to followers, rather than forcing followers to rebuild its stores from leader snapshots.
#### Change in gRPC dependency (>=3.2.10)
3.2.10 or later now requires [grpc/grpc-go](https://github.com/grpc/grpc-go/releases) `v1.7.5` (<=3.2.9 requires `v1.2.1`).
##### Deprecate `grpclog.Logger`
`grpclog.Logger` has been deprecated in favor of [`grpclog.LoggerV2`](https://github.com/grpc/grpc-go/blob/master/grpclog/loggerv2.go). `clientv3.Logger` is now `grpclog.LoggerV2`.
Before
```go
import "github.com/coreos/etcd/clientv3"
clientv3.SetLogger(log.New(os.Stderr, "grpc: ", 0))
```
After
```go
import "github.com/coreos/etcd/clientv3"
import "google.golang.org/grpc/grpclog"
clientv3.SetLogger(grpclog.NewLoggerV2(os.Stderr, os.Stderr, os.Stderr))
// log.New above cannot be used (not implement grpclog.LoggerV2 interface)
```
##### Deprecate `grpc.ErrClientConnTimeout`
Previously, `grpc.ErrClientConnTimeout` error is returned on client dial time-outs. 3.2 instead returns `context.DeadlineExceeded` (see [#8504](https://github.com/coreos/etcd/issues/8504)).
Before
```go
// expect dial time-out on ipv4 blackhole
_, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second
})
if err == grpc.ErrClientConnTimeout {
// handle errors
}
```
After
```go
_, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second
})
if err == context.DeadlineExceeded {
// handle errors
}
```
#### Change in maximum request size limits (>=3.2.10)
3.2.10 and 3.2.11 allow custom request size limits in server side. >=3.2.12 allows custom request size limits for both server and **client side**. In previous versions(v3.2.10, v3.2.11), client response size was limited to only 4 MiB.
Server-side request limits can be configured with `--max-request-bytes` flag:
```bash
# limits request size to 1.5 KiB
etcd --max-request-bytes 1536
# client writes exceeding 1.5 KiB will be rejected
etcdctl put foo [LARGE VALUE...]
# etcdserver: request is too large
```
Or configure `embed.Config.MaxRequestBytes` field:
```go
import "github.com/coreos/etcd/embed"
import "github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
// limit requests to 5 MiB
cfg := embed.NewConfig()
cfg.MaxRequestBytes = 5 * 1024 * 1024
// client writes exceeding 5 MiB will be rejected
_, err := cli.Put(ctx, "foo", [LARGE VALUE...])
err == rpctypes.ErrRequestTooLarge
```
**If not specified, server-side limit defaults to 1.5 MiB**.
Client-side request limits must be configured based on server-side limits.
```bash
# limits request size to 1 MiB
etcd --max-request-bytes 1048576
```
```go
import "github.com/coreos/etcd/clientv3"
cli, _ := clientv3.New(clientv3.Config{
Endpoints: []string{"127.0.0.1:2379"},
MaxCallSendMsgSize: 2 * 1024 * 1024,
MaxCallRecvMsgSize: 3 * 1024 * 1024,
})
// client writes exceeding "--max-request-bytes" will be rejected from etcd server
_, err := cli.Put(ctx, "foo", strings.Repeat("a", 1*1024*1024+5))
err == rpctypes.ErrRequestTooLarge
// client writes exceeding "MaxCallSendMsgSize" will be rejected from client-side
_, err = cli.Put(ctx, "foo", strings.Repeat("a", 5*1024*1024))
err.Error() == "rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (5242890 vs. 2097152)"
// some writes under limits
for i := range []int{0,1,2,3,4} {
_, err = cli.Put(ctx, fmt.Sprintf("foo%d", i), strings.Repeat("a", 1*1024*1024-500))
if err != nil {
panic(err)
}
}
// client reads exceeding "MaxCallRecvMsgSize" will be rejected from client-side
_, err = cli.Get(ctx, "foo", clientv3.WithPrefix())
err.Error() == "rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5240509 vs. 3145728)"
```
**If not specified, client-side send limit defaults to 2 MiB (1.5 MiB + gRPC overhead bytes) and receive limit to `math.MaxInt32`**. Please see [clientv3 godoc](https://godoc.org/github.com/coreos/etcd/clientv3#Config) for more detail.
#### Change in raw gRPC client wrappers
3.2.12 or later changes the function signatures of `clientv3` gRPC client wrapper. This change was needed to support [custom `grpc.CallOption` on message size limits](https://github.com/coreos/etcd/pull/9047).
Before and after
```diff
-func NewKVFromKVClient(remote pb.KVClient) KV {
+func NewKVFromKVClient(remote pb.KVClient, c *Client) KV {
-func NewClusterFromClusterClient(remote pb.ClusterClient) Cluster {
+func NewClusterFromClusterClient(remote pb.ClusterClient, c *Client) Cluster {
-func NewLeaseFromLeaseClient(remote pb.LeaseClient, keepAliveTimeout time.Duration) Lease {
+func NewLeaseFromLeaseClient(remote pb.LeaseClient, c *Client, keepAliveTimeout time.Duration) Lease {
-func NewMaintenanceFromMaintenanceClient(remote pb.MaintenanceClient) Maintenance {
+func NewMaintenanceFromMaintenanceClient(remote pb.MaintenanceClient, c *Client) Maintenance {
-func NewWatchFromWatchClient(wc pb.WatchClient) Watcher {
+func NewWatchFromWatchClient(wc pb.WatchClient, c *Client) Watcher {
```
#### Change in `clientv3.Lease.TimeToLive` API
Previously, `clientv3.Lease.TimeToLive` API returned `lease.ErrLeaseNotFound` on non-existent lease ID. 3.2 instead returns TTL=-1 in its response and no error (see [#7305](https://github.com/coreos/etcd/pull/7305)).
Before
```go
// when leaseID does not exist
resp, err := TimeToLive(ctx, leaseID)
resp == nil
err == lease.ErrLeaseNotFound
```
After
```go
// when leaseID does not exist
resp, err := TimeToLive(ctx, leaseID)
resp.TTL == -1
err == nil
```
#### Change in `clientv3.NewFromConfigFile`
`clientv3.NewFromConfigFile` is moved to `yaml.NewConfig`.
Before
```go
import "github.com/coreos/etcd/clientv3"
clientv3.NewFromConfigFile
```
After
```go
import clientv3yaml "github.com/coreos/etcd/clientv3/yaml"
clientv3yaml.NewConfig
```
#### Change in `--listen-peer-urls` and `--listen-client-urls`
3.2 now rejects domains names for `--listen-peer-urls` and `--listen-client-urls` (3.1 only prints out warnings), since domain name is invalid for network interface binding. Make sure that those URLs are properly formated as `scheme://IP:port`.
See [issue #6336](https://github.com/coreos/etcd/issues/6336) for more contexts.
### Server upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.2, the running cluster must be 3.1 or greater. If it's before 3.1, please [upgrade to 3.1](upgrade_3_1.md) before upgrading to 3.2.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.2. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.2, the cluster will be upgraded to v3.2, and downgrade from this completed state is **not possible**. If any single member is still v3.1, however, the cluster and its operations remains "v3.1", and it is possible from this mixed cluster state to return to using a v3.1 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.1 ectd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.1.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.1.7","etcdcluster":"3.1.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
2017-04-27 14:13:31.491746 I | raft: c89feb932daef420 [term 3] received MsgTimeoutNow from 6d4f535bae3ab960 and starts an election to get leadership.
2017-04-27 14:13:31.491769 I | raft: c89feb932daef420 became candidate at term 4
2017-04-27 14:13:31.491788 I | raft: c89feb932daef420 received MsgVoteResp from c89feb932daef420 at term 4
2017-04-27 14:13:31.491797 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.491805 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 9eda174c7df8a033 at term 4
2017-04-27 14:13:31.491815 I | raft: raft.node: c89feb932daef420 lost leader 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.524084 I | raft: c89feb932daef420 received MsgVoteResp from 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.524108 I | raft: c89feb932daef420 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
2017-04-27 14:13:31.524123 I | raft: c89feb932daef420 became leader at term 4
2017-04-27 14:13:31.524136 I | raft: raft.node: c89feb932daef420 elected leader c89feb932daef420 at term 4
2017-04-27 14:13:31.592650 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream MsgApp v2 reader)
2017-04-27 14:13:31.592825 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message reader)
2017-04-27 14:13:31.693275 E | rafthttp: failed to dial 6d4f535bae3ab960 on stream Message (dial tcp [::1]:2380: getsockopt: connection refused)
2017-04-27 14:13:31.693289 I | rafthttp: peer 6d4f535bae3ab960 became inactive
2017-04-27 14:13:31.936678 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message writer)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.2 binary and start the new etcd process
The new v3.2 etcd will publish its information to the cluster:
```
2017-04-27 14:14:25.363225 I | etcdserver: published {Name:s1 ClientURLs:[http://localhost:2379]} to cluster a9ededbffcb1b1f1
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.2 etcd binary:
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321771ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.2:
```
2017-04-27 14:15:17.071804 W | etcdserver: member c89feb932daef420 has a higher version 3.2.0
2017-04-27 14:15:21.073110 W | etcdserver: the local etcd version 3.1.7 is not up-to-date
2017-04-27 14:15:21.073142 W | etcdserver: member 6d4f535bae3ab960 has a higher version 3.2.0
2017-04-27 14:15:21.073157 W | etcdserver: the local etcd version 3.1.7 is not up-to-date
2017-04-27 14:15:21.073164 W | etcdserver: member c89feb932daef420 has a higher version 3.2.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.2 successfully:
```
2017-04-27 14:15:54.536901 N | etcdserver/membership: updated the cluster version from 3.1 to 3.2
2017-04-27 14:15:54.537035 I | etcdserver/api: enabled capabilities for version 3.2
```
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.517902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,476 @@
## Upgrade etcd from 3.2 to 3.3
In the general case, upgrading from etcd 3.2 to 3.3 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.2 processes and replace them with etcd v3.3 processes
- after running all v3.3 processes, new features in v3.3 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
Highlighted breaking changes in 3.3.
#### Change in `etcdserver.EtcdServer` struct
`etcdserver.EtcdServer` has changed the type of its member field `*etcdserver.ServerConfig` to `etcdserver.ServerConfig`. And `etcdserver.NewServer` now takes `etcdserver.ServerConfig`, instead of `*etcdserver.ServerConfig`.
Before and after (e.g. [k8s.io/kubernetes/test/e2e_node/services/etcd.go](https://github.com/kubernetes/kubernetes/blob/release-1.8/test/e2e_node/services/etcd.go#L50-L55))
```diff
import "github.com/coreos/etcd/etcdserver"
type EtcdServer struct {
*etcdserver.EtcdServer
- config *etcdserver.ServerConfig
+ config etcdserver.ServerConfig
}
func NewEtcd(dataDir string) *EtcdServer {
- config := &etcdserver.ServerConfig{
+ config := etcdserver.ServerConfig{
DataDir: dataDir,
...
}
return &EtcdServer{config: config}
}
func (e *EtcdServer) Start() error {
var err error
e.EtcdServer, err = etcdserver.NewServer(e.config)
...
```
#### Change in `embed.EtcdServer` struct
Field `LogOutput` is added to `embed.Config`:
```diff
package embed
type Config struct {
Debug bool `json:"debug"`
LogPkgLevels string `json:"log-package-levels"`
+ LogOutput string `json:"log-output"`
...
```
Before gRPC server warnings were logged in etcdserver.
```
WARNING: 2017/11/02 11:35:51 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp: operation was canceled"; Reconnecting to {localhost:2379 <nil>}
WARNING: 2017/11/02 11:35:51 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp: operation was canceled"; Reconnecting to {localhost:2379 <nil>}
```
From v3.3, gRPC server logs are disabled by default.
```go
import "github.com/coreos/etcd/embed"
cfg := &embed.Config{Debug: false}
cfg.SetupLogging()
```
Set `embed.Config.Debug` field to `true` to enable gRPC server logs.
#### Change in `/health` endpoint response
Previously, `[endpoint]:[client-port]/health` returned manually marshaled JSON value. 3.3 now defines [`etcdhttp.Health`](https://godoc.org/github.com/coreos/etcd/etcdserver/api/etcdhttp#Health) struct.
Note that in v3.3.0-rc.0, v3.3.0-rc.1, and v3.3.0-rc.2, `etcdhttp.Health` has boolean type `"health"` and `"errors"` fields. For backward compatibilities, we reverted `"health"` field to `string` type and removed `"errors"` field. Further health information will be provided in separate APIs.
```bash
$ curl http://localhost:2379/health
{"health":"true"}
```
#### Change in gRPC gateway HTTP endpoints (replaced `/v3alpha` with `/v3beta`)
Before
```bash
curl -L http://localhost:2379/v3alpha/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}'
```
After
```bash
curl -L http://localhost:2379/v3beta/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}'
```
Requests to `/v3alpha` endpoints will redirect to `/v3beta`, and `/v3alpha` will be removed in 3.4 release.
#### Change in maximum request size limits
3.3 now allows custom request size limits for both server and **client side**. In previous versions(v3.2.10, v3.2.11), client response size was limited to only 4 MiB.
Server-side request limits can be configured with `--max-request-bytes` flag:
```bash
# limits request size to 1.5 KiB
etcd --max-request-bytes 1536
# client writes exceeding 1.5 KiB will be rejected
etcdctl put foo [LARGE VALUE...]
# etcdserver: request is too large
```
Or configure `embed.Config.MaxRequestBytes` field:
```go
import "github.com/coreos/etcd/embed"
import "github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
// limit requests to 5 MiB
cfg := embed.NewConfig()
cfg.MaxRequestBytes = 5 * 1024 * 1024
// client writes exceeding 5 MiB will be rejected
_, err := cli.Put(ctx, "foo", [LARGE VALUE...])
err == rpctypes.ErrRequestTooLarge
```
**If not specified, server-side limit defaults to 1.5 MiB**.
Client-side request limits must be configured based on server-side limits.
```bash
# limits request size to 1 MiB
etcd --max-request-bytes 1048576
```
```go
import "github.com/coreos/etcd/clientv3"
cli, _ := clientv3.New(clientv3.Config{
Endpoints: []string{"127.0.0.1:2379"},
MaxCallSendMsgSize: 2 * 1024 * 1024,
MaxCallRecvMsgSize: 3 * 1024 * 1024,
})
// client writes exceeding "--max-request-bytes" will be rejected from etcd server
_, err := cli.Put(ctx, "foo", strings.Repeat("a", 1*1024*1024+5))
err == rpctypes.ErrRequestTooLarge
// client writes exceeding "MaxCallSendMsgSize" will be rejected from client-side
_, err = cli.Put(ctx, "foo", strings.Repeat("a", 5*1024*1024))
err.Error() == "rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (5242890 vs. 2097152)"
// some writes under limits
for i := range []int{0,1,2,3,4} {
_, err = cli.Put(ctx, fmt.Sprintf("foo%d", i), strings.Repeat("a", 1*1024*1024-500))
if err != nil {
panic(err)
}
}
// client reads exceeding "MaxCallRecvMsgSize" will be rejected from client-side
_, err = cli.Get(ctx, "foo", clientv3.WithPrefix())
err.Error() == "rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5240509 vs. 3145728)"
```
**If not specified, client-side send limit defaults to 2 MiB (1.5 MiB + gRPC overhead bytes) and receive limit to `math.MaxInt32`**. Please see [clientv3 godoc](https://godoc.org/github.com/coreos/etcd/clientv3#Config) for more detail.
#### Change in raw gRPC client wrappers
3.3 changes the function signatures of `clientv3` gRPC client wrapper. This change was needed to support [custom `grpc.CallOption` on message size limits](https://github.com/coreos/etcd/pull/9047).
Before and after
```diff
-func NewKVFromKVClient(remote pb.KVClient) KV {
+func NewKVFromKVClient(remote pb.KVClient, c *Client) KV {
-func NewClusterFromClusterClient(remote pb.ClusterClient) Cluster {
+func NewClusterFromClusterClient(remote pb.ClusterClient, c *Client) Cluster {
-func NewLeaseFromLeaseClient(remote pb.LeaseClient, keepAliveTimeout time.Duration) Lease {
+func NewLeaseFromLeaseClient(remote pb.LeaseClient, c *Client, keepAliveTimeout time.Duration) Lease {
-func NewMaintenanceFromMaintenanceClient(remote pb.MaintenanceClient) Maintenance {
+func NewMaintenanceFromMaintenanceClient(remote pb.MaintenanceClient, c *Client) Maintenance {
-func NewWatchFromWatchClient(wc pb.WatchClient) Watcher {
+func NewWatchFromWatchClient(wc pb.WatchClient, c *Client) Watcher {
```
#### Change in clientv3 `Snapshot` API error type
Previously, clientv3 `Snapshot` API returned raw [`grpc/*status.statusError`] type error. v3.3 now translates those errors to corresponding public error types, to be consistent with other APIs.
Before
```go
import "context"
// reading snapshot with canceled context should error out
ctx, cancel := context.WithCancel(context.Background())
rc, _ := cli.Snapshot(ctx)
cancel()
_, err := io.Copy(f, rc)
err.Error() == "rpc error: code = Canceled desc = context canceled"
// reading snapshot with deadline exceeded should error out
ctx, cancel = context.WithTimeout(context.Background(), time.Second)
defer cancel()
rc, _ = cli.Snapshot(ctx)
time.Sleep(2 * time.Second)
_, err = io.Copy(f, rc)
err.Error() == "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
```
After
```go
import "context"
// reading snapshot with canceled context should error out
ctx, cancel := context.WithCancel(context.Background())
rc, _ := cli.Snapshot(ctx)
cancel()
_, err := io.Copy(f, rc)
err == context.Canceled
// reading snapshot with deadline exceeded should error out
ctx, cancel = context.WithTimeout(context.Background(), time.Second)
defer cancel()
rc, _ = cli.Snapshot(ctx)
time.Sleep(2 * time.Second)
_, err = io.Copy(f, rc)
err == context.DeadlineExceeded
```
#### Change in `etcdctl lease timetolive` command output
Previously, `lease timetolive LEASE_ID` command on expired lease prints `-1s` for remaining seconds. 3.3 now outputs clearer messages.
Before
```bash
lease 2d8257079fa1bc0c granted with TTL(0s), remaining(-1s)
```
After
```bash
lease 2d8257079fa1bc0c already expired
```
#### Change in `golang.org/x/net/context` imports
`clientv3` has deprecated `golang.org/x/net/context`. If a project vendors `golang.org/x/net/context` in other code (e.g. etcd generated protocol buffer code) and imports `github.com/coreos/etcd/clientv3`, it requires Go 1.9+ to compile.
Before
```go
import "golang.org/x/net/context"
cli.Put(context.Background(), "f", "v")
```
After
```go
import "context"
cli.Put(context.Background(), "f", "v")
```
#### Change in gRPC dependency
3.3 now requires [grpc/grpc-go](https://github.com/grpc/grpc-go/releases) `v1.7.5`.
##### Deprecate `grpclog.Logger`
`grpclog.Logger` has been deprecated in favor of [`grpclog.LoggerV2`](https://github.com/grpc/grpc-go/blob/master/grpclog/loggerv2.go). `clientv3.Logger` is now `grpclog.LoggerV2`.
Before
```go
import "github.com/coreos/etcd/clientv3"
clientv3.SetLogger(log.New(os.Stderr, "grpc: ", 0))
```
After
```go
import "github.com/coreos/etcd/clientv3"
import "google.golang.org/grpc/grpclog"
clientv3.SetLogger(grpclog.NewLoggerV2(os.Stderr, os.Stderr, os.Stderr))
// log.New above cannot be used (not implement grpclog.LoggerV2 interface)
```
##### Deprecate `grpc.ErrClientConnTimeout`
Previously, `grpc.ErrClientConnTimeout` error is returned on client dial time-outs. 3.3 instead returns `context.DeadlineExceeded` (see [#8504](https://github.com/coreos/etcd/issues/8504)).
Before
```go
// expect dial time-out on ipv4 blackhole
_, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second
})
if err == grpc.ErrClientConnTimeout {
// handle errors
}
```
After
```go
_, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second
})
if err == context.DeadlineExceeded {
// handle errors
}
```
#### Change in official container registry
etcd now uses [`gcr.io/etcd-development/etcd`](https://gcr.io/etcd-development/etcd) as a primary container registry, and [`quay.io/coreos/etcd`](https://quay.io/coreos/etcd) as secondary.
Before
```bash
docker pull quay.io/coreos/etcd:v3.2.5
```
After
```bash
docker pull gcr.io/etcd-development/etcd:v3.3.0
```
### Server upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.3, the running cluster must be 3.2 or greater. If it's before 3.2, please [upgrade to 3.2](upgrade_3_2.md) before upgrading to 3.3.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.3. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.3, the cluster will be upgraded to v3.3, and downgrade from this completed state is **not possible**. If any single member is still v3.2, however, the cluster and its operations remains "v3.2", and it is possible from this mixed cluster state to return to using a v3.2 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.2 ectd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.2.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.2.7","etcdcluster":"3.2.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
14:13:31.491746 I | raft: c89feb932daef420 [term 3] received MsgTimeoutNow from 6d4f535bae3ab960 and starts an election to get leadership.
14:13:31.491769 I | raft: c89feb932daef420 became candidate at term 4
14:13:31.491788 I | raft: c89feb932daef420 received MsgVoteResp from c89feb932daef420 at term 4
14:13:31.491797 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 6d4f535bae3ab960 at term 4
14:13:31.491805 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 9eda174c7df8a033 at term 4
14:13:31.491815 I | raft: raft.node: c89feb932daef420 lost leader 6d4f535bae3ab960 at term 4
14:13:31.524084 I | raft: c89feb932daef420 received MsgVoteResp from 6d4f535bae3ab960 at term 4
14:13:31.524108 I | raft: c89feb932daef420 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
14:13:31.524123 I | raft: c89feb932daef420 became leader at term 4
14:13:31.524136 I | raft: raft.node: c89feb932daef420 elected leader c89feb932daef420 at term 4
14:13:31.592650 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream MsgApp v2 reader)
14:13:31.592825 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message reader)
14:13:31.693275 E | rafthttp: failed to dial 6d4f535bae3ab960 on stream Message (dial tcp [::1]:2380: getsockopt: connection refused)
14:13:31.693289 I | rafthttp: peer 6d4f535bae3ab960 became inactive
14:13:31.936678 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message writer)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.3 binary and start the new etcd process
The new v3.3 etcd will publish its information to the cluster:
```
14:14:25.363225 I | etcdserver: published {Name:s1 ClientURLs:[http://localhost:2379]} to cluster a9ededbffcb1b1f1
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.3 etcd binary:
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321771ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.3:
```
14:15:17.071804 W | etcdserver: member c89feb932daef420 has a higher version 3.3.0
14:15:21.073110 W | etcdserver: the local etcd version 3.2.7 is not up-to-date
14:15:21.073142 W | etcdserver: member 6d4f535bae3ab960 has a higher version 3.3.0
14:15:21.073157 W | etcdserver: the local etcd version 3.2.7 is not up-to-date
14:15:21.073164 W | etcdserver: member c89feb932daef420 has a higher version 3.3.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.3 successfully:
```
14:15:54.536901 N | etcdserver/membership: updated the cluster version from 3.2 to 3.3
14:15:54.537035 I | etcdserver/api: enabled capabilities for version 3.3
```
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.517902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,171 @@
## Upgrade etcd from 3.3 to 3.4
In the general case, upgrading from etcd 3.3 to 3.4 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.3 processes and replace them with etcd v3.4 processes
- after running all v3.4 processes, new features in v3.4 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
Highlighted breaking changes in 3.4.
#### Change in `etcd` flags
`--ca-file` and `--peer-ca-file` flags are deprecated; they have been deprecated since v2.1.
```diff
-etcd --ca-file ca-client.crt
+etcd --trusted-ca-file ca-client.crt
```
```diff
-etcd --peer-ca-file ca-peer.crt
+etcd --peer-trusted-ca-file ca-peer.crt
```
#### Change in ``pkg/transport`
Deprecated `pkg/transport.TLSInfo.CAFile` field.
```diff
import "github.com/coreos/etcd/pkg/transport"
tlsInfo := transport.TLSInfo{
CertFile: "/tmp/test-certs/test.pem",
KeyFile: "/tmp/test-certs/test-key.pem",
- CAFile: "/tmp/test-certs/trusted-ca.pem",
+ TrustedCAFile: "/tmp/test-certs/trusted-ca.pem",
}
tlsConfig, err := tlsInfo.ClientConfig()
if err != nil {
panic(err)
}
```
### Server upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.4, the running cluster must be 3.3 or greater. If it's before 3.3, please [upgrade to 3.3](upgrade_3_3.md) before upgrading to 3.4.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.4. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.4, the cluster will be upgraded to v3.4, and downgrade from this completed state is **not possible**. If any single member is still v3.3, however, the cluster and its operations remains "v3.3", and it is possible from this mixed cluster state to return to using a v3.3 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.3 ectd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.3.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.3.0","etcdcluster":"3.3.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
14:13:31.491746 I | raft: c89feb932daef420 [term 3] received MsgTimeoutNow from 6d4f535bae3ab960 and starts an election to get leadership.
14:13:31.491769 I | raft: c89feb932daef420 became candidate at term 4
14:13:31.491788 I | raft: c89feb932daef420 received MsgVoteResp from c89feb932daef420 at term 4
14:13:31.491797 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 6d4f535bae3ab960 at term 4
14:13:31.491805 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 9eda174c7df8a033 at term 4
14:13:31.491815 I | raft: raft.node: c89feb932daef420 lost leader 6d4f535bae3ab960 at term 4
14:13:31.524084 I | raft: c89feb932daef420 received MsgVoteResp from 6d4f535bae3ab960 at term 4
14:13:31.524108 I | raft: c89feb932daef420 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
14:13:31.524123 I | raft: c89feb932daef420 became leader at term 4
14:13:31.524136 I | raft: raft.node: c89feb932daef420 elected leader c89feb932daef420 at term 4
14:13:31.592650 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream MsgApp v2 reader)
14:13:31.592825 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message reader)
14:13:31.693275 E | rafthttp: failed to dial 6d4f535bae3ab960 on stream Message (dial tcp [::1]:2380: getsockopt: connection refused)
14:13:31.693289 I | rafthttp: peer 6d4f535bae3ab960 became inactive
14:13:31.936678 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message writer)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.4 binary and start the new etcd process
The new v3.4 etcd will publish its information to the cluster:
```
14:14:25.363225 I | etcdserver: published {Name:s1 ClientURLs:[http://localhost:2379]} to cluster a9ededbffcb1b1f1
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.4 etcd binary:
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321771ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.4:
```
14:15:17.071804 W | etcdserver: member c89feb932daef420 has a higher version 3.4.0
14:15:21.073110 W | etcdserver: the local etcd version 3.3.0 is not up-to-date
14:15:21.073142 W | etcdserver: member 6d4f535bae3ab960 has a higher version 3.4.0
14:15:21.073157 W | etcdserver: the local etcd version 3.3.0 is not up-to-date
14:15:21.073164 W | etcdserver: member c89feb932daef420 has a higher version 3.4.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.4 successfully:
```
14:15:54.536901 N | etcdserver/membership: updated the cluster version from 3.3 to 3.4
14:15:54.537035 I | etcdserver/api: enabled capabilities for version 3.4
```
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.517902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,19 @@
# Upgrading etcd clusters and applications
This section contains documents specific to upgrading etcd clusters and applications.
## Moving from etcd API v2 to API v3
* [Migrate applications from using API v2 to API v3][migrate-apps]
## Upgrading an etcd v3.x cluster
* [Upgrade etcd from 3.0 to 3.1][upgrade-3-1]
* [Upgrade etcd from 3.1 to 3.2][upgrade-3-2]
## Upgrading from etcd v2.3
* [Upgrade a v2.3 cluster to v3.0][upgrade-cluster]
[migrate-apps]: ../op-guide/v2-migration.md
[upgrade-cluster]: upgrade_3_0.md
[upgrade-3-1]: upgrade_3_1.md
[upgrade-3-2]: upgrade_3_2.md

View File

@ -67,13 +67,13 @@ You have successfully started an etcd and written a key to the store.
The [official etcd ports][iana-ports] are 2379 for client requests, and 2380 for peer communication. To maintain compatibility, some etcd configuration and documentation continues to refer to the legacy ports 4001 and 7001, but all new etcd use and discussion should adopt the IANA-assigned ports. The legacy ports 4001 and 7001 will be fully deprecated, and support for their use removed, in future etcd releases.
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
### Running local etcd cluster
First install [goreman](https://github.com/mattn/goreman), which manages Procfile-based applications.
Our [Procfile script](./Procfile) will set up a local example cluster. You can start it with:
Our [Procfile script](../../V2Procfile) will set up a local example cluster. You can start it with:
```sh
goreman start
@ -162,4 +162,4 @@ Currently only the amd64 architecture is officially supported by `etcd`.
### License
etcd is under the Apache 2.0 license. See the [LICENSE](LICENSE) file for details.
etcd is under the Apache 2.0 license. See the [LICENSE](../../LICENSE) file for details.

View File

@ -216,7 +216,7 @@ To recover from such scenarios, etcd provides functionality to backup and restor
#### Backing up the datastore
**NB:** Windows users must stop etcd before running the backup command.
**Note:** Windows users must stop etcd before running the backup command.
The first step of the recovery is to backup the data directory and wal directory, if stored separately, on a functioning etcd node. To do this, use the `etcdctl backup` command, passing in the original data (and wal) directory used by etcd. For example:
@ -262,7 +262,9 @@ Once you have verified that etcd has started successfully, shut it down and move
Now that the node is running successfully, [change its advertised peer URLs][update-a-member], as the `--force-new-cluster` option has set the peer URL to the default listening on localhost.
You can then add more nodes to the cluster and restore resiliency. See the [add a new member][add-a-member] guide for more details. **NB:** If you are trying to restore your cluster using old failed etcd nodes, please make sure you have stopped old etcd instances and removed their old data directories specified by the data-dir configuration parameter.
You can then add more nodes to the cluster and restore resiliency. See the [add a new member][add-a-member] guide for more details.
**Note:** If you are trying to restore your cluster using old failed etcd nodes, please make sure you have stopped old etcd instances and removed their old data directories specified by the data-dir configuration parameter.
### Client Request Timeout

View File

@ -18,7 +18,7 @@ A keys lifetime spans a generation. Each key may have one or multiple generat
### Physical View
etcd stores the physical data as key-value pairs in a persistent [b+tree][b+tree]. Each revision of the stores state only contains the delta from its previous revision to be efficient. A single revision may correspond to multiple keys in the tree.
etcd stores the physical data as key-value pairs in a persistent [b+tree][b+tree]. Each revision of the stores state only contains the delta from its previous revision to be efficient. A single revision may correspond to multiple keys in the tree.
The key of key-value pair is a 3-tuple (major, sub, type). Major is the store revision holding the key. Sub differentiates among keys within the same revision. Type is an optional suffix for special value (e.g., `t` if the value contains a tombstone). The value of the key-value pair contains the modification from previous revision, thus one delta from previous revision. The b+tree is ordered by key in lexical byte-order. Ranged lookups over revision deltas are fast; this enables quickly finding modifications from one specific revision to another. Compaction removes out-of-date keys-value pairs.
@ -73,7 +73,7 @@ Any completed operations are durable. All accessible data is also durable data.
#### Linearizability
Linearizability (also known as Atomic Consistency or External Consistency) is a consistency level between strict consistency and sequential consistency.
Linearizability (also known as Atomic Consistency or External Consistency) is a consistency level between strict consistency and sequential consistency.
For linearizability, suppose each operation receives a timestamp from a loosely synchronized global clock. Operations are linearized if and only if they always complete as though they were executed in a sequential order and each operation appears to complete in the order specified by the program. Likewise, if an operations timestamp precedes another, that operation must also precede the other operation in the sequence.
@ -83,10 +83,10 @@ etcd does not ensure linearizability for watch operations. Users are expected to
etcd ensures linearizability for all other operations by default. Linearizability comes with a cost, however, because linearized requests must go through the Raft consensus process. To obtain lower latencies and higher throughput for read requests, clients can configure a requests consistency mode to `serializable`, which may access stale data with respect to quorum, but removes the performance penalty of linearized accesses' reliance on live consensus.
[persistent-ds]: [https://en.wikipedia.org/wiki/Persistent_data_structure]
[btree]: [https://en.wikipedia.org/wiki/B-tree]
[b+tree]: [https://en.wikipedia.org/wiki/B%2B_tree]
[seq_consistency]: [https://en.wikipedia.org/wiki/Consistency_model#Sequential_consistency]
[strict_consistency]: [https://en.wikipedia.org/wiki/Consistency_model#Strict_consistency]
[serializable_isolation]: [https://en.wikipedia.org/wiki/Isolation_(database_systems)#Serializable]
[Linearizability]: [#Linearizability]
[persistent-ds]: https://en.wikipedia.org/wiki/Persistent_data_structure
[btree]: https://en.wikipedia.org/wiki/B-tree
[b+tree]: https://en.wikipedia.org/wiki/B%2B_tree
[seq_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Sequential_consistency
[strict_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Strict_consistency
[serializable_isolation]: https://en.wikipedia.org/wiki/Isolation_(database_systems)#Serializable
[Linearizability]: #linearizability

View File

@ -32,7 +32,7 @@ The consistent flag for read operations is removed in etcd 2.0.0. The normal rea
The read consistency guarantees are:
The consistent read guarantees the sequential consistency within one client that talks to one etcd server. Read/Write from one client to one etcd member should be observed in order. If one client write a value to an etcd server successfully, it should be able to get the value out of the server immediately.
The consistent read guarantees the sequential consistency within one client that talks to one etcd server. Read/Write from one client to one etcd member should be observed in order. If one client write a value to an etcd server successfully, it should be able to get the value out of the server immediately.
Each etcd member will proxy the request to leader and only return the result to user after the result is applied on the local member. Thus after the write succeed, the user is guaranteed to see the value on the member it sent the request to.
@ -56,6 +56,7 @@ Proxy mode in 2.0 will provide similar functionality, and with improved control
## Discovery Service
A size key needs to be provided inside a [discovery token][discoverytoken].
[discoverytoken]: clustering.md#custom-etcd-discovery-service
## HTTP Admin API

View File

@ -49,4 +49,4 @@ Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send r
| 256 | 256 | all servers | 3061 | 119.3 |
[boom]: https://github.com/rakyll/boom
[hack-benchmark]: /hack/benchmark/
[hack-benchmark]: ../../../hack/benchmark/

View File

@ -24,7 +24,7 @@ Go OS/Arch: linux/amd64
## Testing
Bootstrap another machine, outside of the etcd cluster, and run the [`boom` HTTP benchmark tool](https://github.com/rakyll/boom) with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions](../../hack/benchmark/) for the patch and the steps to reproduce our procedures.
Bootstrap another machine, outside of the etcd cluster, and run the [`boom` HTTP benchmark tool][boom] with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions][hack] for the patch and the steps to reproduce our procedures.
The performance is calulated through results of 100 benchmark rounds.
@ -66,4 +66,7 @@ The performance is calulated through results of 100 benchmark rounds.
- Write QPS to cluster leaders seems to be increased by a small margin. This is because the main loop and entry apply loops were decoupled in the etcd raft logic, eliminating several blocks between them.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.
[boom]: https://github.com/rakyll/boom
[hack]: ../../../hack/benchmark/

View File

@ -69,4 +69,4 @@ Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send r
[boom]: https://github.com/rakyll/boom
[c7146bd5]: https://github.com/coreos/etcd/commits/c7146bd5f2c73716091262edc638401bb8229144
[etcd-2.1-benchmark]: etcd-2-1-0-alpha-benchmarks.md
[hack-benchmark]: /hack/benchmark/
[hack-benchmark]: ../../../hack/benchmark/

View File

@ -39,4 +39,4 @@ The performance is nearly the same as the one with empty server handler.
The performance with empty server handler is not affected by one put. So the
performance downgrade should be caused by storage package.
[etcd-v3-benchmark]: /tools/benchmark/
[etcd-v3-benchmark]: ../../../tools/benchmark/

View File

@ -423,7 +423,7 @@ To make understanding this feature easier, we changed the naming of some flags,
|-peers |none |Deprecated. The --initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
|-peers-file |none |Deprecated. The --initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
[client]: /client
[client]: ../../client
[client-discoverer]: https://godoc.org/github.com/coreos/etcd/client#Discoverer
[conf-adv-client]: configuration.md#-advertise-client-urls
[conf-listen-client]: configuration.md#-listen-client-urls

View File

@ -234,7 +234,7 @@ The security flags help to [build a secure etcd cluster][security].
+ env variable: ETCD_DEBUG
### --log-package-levels
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ default: none (INFO for all packages)
+ env variable: ETCD_LOG_PACKAGE_LEVELS
@ -272,7 +272,7 @@ Follow the instructions when using these flags.
[build-cluster]: clustering.md#static
[reconfig]: runtime-configuration.md
[discovery]: clustering.md#discovery
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
[proxy]: proxy.md
[reconfig]: runtime-configuration.md
[restore]: admin_guide.md#restoring-a-backup

View File

@ -112,7 +112,6 @@
- [mattn/etcdenv](https://github.com/mattn/etcdenv) - "env" shebang with etcd integration
- [kelseyhightower/confd](https://github.com/kelseyhightower/confd) - Manage local app config files using templates and data from etcd
- [configdb](https://git.autistici.org/ai/configdb/tree/master) - A REST relational abstraction on top of arbitrary database backends, aimed at storing configs and inventories.
- [scrz](https://github.com/scrz/scrz) - Container manager, stores configuration in etcd.
- [fleet](https://github.com/coreos/fleet) - Distributed init system
- [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes) - Container cluster manager introduced by Google.
- [mailgun/vulcand](https://github.com/mailgun/vulcand) - HTTP proxy that uses etcd as a configuration backend.

View File

@ -1,6 +1,6 @@
# Reporting Bugs
If you find bugs or documentation mistakes in the etcd project, please let us know by [opening an issue][issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
If you find bugs or documentation mistakes in the etcd project, please let us know by [opening an issue][etcd-issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
To make your bug report accurate and easy to understand, please try to create bug reports that are:

View File

@ -7,25 +7,25 @@ To prove out the design of the v3 API the team has also built [a number of examp
# Design
1. Flatten binary key-value space
2. Keep the event history until compaction
- access to old version of keys
- user controlled history compaction
3. Support range query
- Pagination support with limit argument
- Support consistency guarantee across multiple range queries
4. Replace TTL key with Lease
- more efficient/ low cost keep alive
- a logical group of TTL keys
5. Replace CAS/CAD with multi-object Txn
- MUCH MORE powerful and flexible
6. Support efficient watching with multiple ranges
7. RPC API supports the completed set of APIs.
7. RPC API supports the completed set of APIs.
- more efficient than JSON/HTTP
- additional txn/lease support
@ -56,7 +56,7 @@ the size in the future a little bit or make it configurable.
// A put is always successful
Put( PutRequest { key = foo, value = bar } )
PutResponse {
PutResponse {
cluster_id = 0x1000,
member_id = 0x1,
revision = 1,
@ -119,7 +119,7 @@ RangeResponse {
Txn(TxnRequest {
// mod_revision of foo0 is equal to 1, mod_revision of foo1 is greater than 1
compare = {
{compareType = equal, key = foo0, mod_revision = 1},
{compareType = equal, key = foo0, mod_revision = 1},
{compareType = greater, key = foo1, mod_revision = 1}}
},
// if the comparison succeeds, put foo2 = bar2
@ -156,7 +156,7 @@ Watch( WatchRequest{
end_revision = 10000,
// server decided notification frequency
progress_notification = true,
}
}
… // this can be a watch request stream
)
@ -176,7 +176,7 @@ WatchResponse {
},
}
// a notification at 2000
WatchResponse {
cluster_id = 0x1000,
@ -185,9 +185,9 @@ WatchResponse {
raft_term = 0x1,
// nil event as notification
}
// put (foo0=bar3000) event at 3000
WatchResponse {
cluster_id = 0x1000,
@ -204,8 +204,8 @@ WatchResponse {
},
}
```
[api-protobuf]: https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto
[kv-protobuf]: https://github.com/coreos/etcd/blob/master/storage/storagepb/kv.proto
[api-protobuf]: https://github.com/coreos/etcd/blob/release-2.3/etcdserver/etcdserverpb/rpc.proto
[kv-protobuf]: https://github.com/coreos/etcd/blob/release-2.3/storage/storagepb/kv.proto

View File

@ -188,6 +188,6 @@ Make sure that you sign your certificates with a Subject Name your member's publ
If you need your certificate to be signed for your member's FQDN in its Subject Name then you could use Subject Alternative Names (short IP SANs) to add your IP address. The `etcd-ca` tool provides `--domain=` option for its `new-cert` command, and openssl can make [it][alt-name] too.
[cfssl]: https://github.com/cloudflare/cfssl
[tls-setup]: /hack/tls-setup
[tls-setup]: ../../hack/tls-setup
[tls-guide]: https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md
[alt-name]: http://wiki.cacert.org/FAQ/subjectAltName

91
NEWS
View File

@ -1,54 +1,81 @@
etcd v3.0.11 (2016-10-07)
- server returns previous key-value (optional)
- clientv3 WithPrevKV option
- v3 etcdctl prev-kv flag
etcd v3.1.0 (2017-01-20)
- faster linearizable reads (implements Raft read-index)
- automatic leadership transfer when leader steps down
- etcd uses default route IP if advertise URL is not given
- cluster rejects removing members if quorum will be lost
- SRV records (e.g., infra1.example.com) must match the discovery domain
(i.e., example.com) if no custom certificate authority is given
- TLSConfig ServerName is ignored with user-provided certificates
for backwards compatibility; to be deprecated in 3.2
- discovery now has upper limit for waiting on retries
- etcd flags
- --strict-reconfig-check flag is set by default
- add --log-output flag
- add --metrics flag
- v3 authentication API is now stable
- v3 client
- add SetEndpoints method; update endpoints at runtime
- add Sync method; auto-update endpoints at runtime
- add Lease TimeToLive API; fetch lease information
- replace Config.Logger field with global logger
- Get API responses are sorted in ascending order by default
- v3 etcdctl
- add lease timetolive command
- add --print-value-only flag to get command
- add --dest-prefix flag to make-mirror command
- command get responses are sorted in ascending order by default
- recipes now conform to sessions defined in clientv3/concurrency
- ACI has symlinks to /usr/local/bin/etcd*
- warn on binding listeners through domain names; to be deprecated in 3.2
- experimental gRPC proxy feature
etcd v3.0.16 (2017-01-13)
etcd v3.0.15 (2016-11-11)
- fix cancel watch request with wrong range end
etcd v3.0.14 (2016-11-04)
- v3 etcdctl migrate command now supports --no-ttl flag to discard keys on transform
etcd v3.0.13 (2016-10-24)
etcd v3.0.12 (2016-10-07)
etcd v3.0.11 (2016-10-07)
- server returns previous key-value (optional)
- clientv3 WithPrevKV option
- v3 etcdctl put,watch,del --prev-kv flag
etcd v3.0.10 (2016-09-23)
etcd v3.0.9 (2016-09-15)
- warn on domain names on listen URLs (v3.2 will reject domain names)
- warn on domain names on listen URLs (v3.2 will reject domain names)
etcd v3.0.8 (2016-09-09)
- allow only IP addresses in listen URLs (domain names are rejected)
- allow only IP addresses in listen URLs (domain names are rejected)
etcd v3.0.7 (2016-08-31)
- SRV records only allow A records (RFC 2052)
- SRV records only allow A records (RFC 2052)
etcd v3.0.6 (2016-08-19)
etcd v3.0.5 (2016-08-19)
- SRV records (e.g., infra1.example.com) must match the discovery domain
(i.e., example.com) when using the default certificate authority.
- SRV records (e.g., infra1.example.com) must match the discovery domain
(i.e., example.com) if no custom certificate authority is given
etcd v3.0.4 (2016-07-27)
- v2 auth can now use common name from TLS certificate when --client-cert-auth is enabled
- v2 etcdctl ls command now supports --output=json
- Add /var/lib/etcd directory to etcd official Docker image
- v2 auth can now use common name from TLS certificate when --client-cert-auth is enabled
- v2 etcdctl ls command now supports --output=json
- Add /var/lib/etcd directory to etcd official Docker image
etcd v3.0.3 (2016-07-15)
- Revert Dockerfile to use CMD, instead of ENTRYPOINT, to support etcdctl run
- Docker commands for v3.0.2 won't work without specifying executable binary paths
- v3 etcdctl default endpoints are now 127.0.0.1:2379
- Revert Dockerfile to use CMD, instead of ENTRYPOINT, to support etcdctl run
- Docker commands for v3.0.2 won't work without specifying executable binary paths
- v3 etcdctl default endpoints are now 127.0.0.1:2379
etcd v3.0.2 (2016-07-08)
- Dockerfile uses ENTRYPOINT, instead of CMD, to run etcd without binary path specified
- Dockerfile uses ENTRYPOINT, instead of CMD, to run etcd without binary path specified
etcd v3.0.1 (2016-07-01)
etcd v3.0.0 (2016-06-30)

View File

@ -37,13 +37,14 @@ See [etcdctl][etcdctl] for a simple command line client.
### Getting etcd
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, AppC (ACI), and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, [rkt][rkt], and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
For those wanting to try the very latest version, you can [build the latest version of etcd][dl-build] from the `master` branch.
You will first need [*Go*](https://golang.org/) installed on your machine (version 1.6+ is required).
You will first need [*Go*](https://golang.org/) installed on your machine (version 1.7+ is required).
All development occurs on `master`, including new features and bug fixes.
Bug fixes are first targeted at `master` and subsequently ported to release branches, as described in the [branch management][branch-management] guide.
[rkt]: https://github.com/coreos/rkt/releases/
[github-release]: https://github.com/coreos/etcd/releases/
[branch-management]: ./Documentation/branch_management.md
[dl-build]: ./Documentation/dl_build.md#build-the-latest-version
@ -77,7 +78,7 @@ That's it! etcd is now running and serving client requests. For more
The [official etcd ports][iana-ports] are 2379 for client requests, and 2380 for peer communication.
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
### Running a local etcd cluster
@ -93,6 +94,10 @@ This will bring up 3 etcd members `infra1`, `infra2` and `infra3` and etcd proxy
Every cluster member and proxy accepts key value reads and key value writes.
### Running etcd on Kubernetes
If you want to run etcd cluster on Kubernetes, try [etcd operator](https://github.com/coreos/etcd-operator).
### Next steps
Now it's time to dig into the full etcd API and other guides.
@ -131,4 +136,3 @@ See [reporting bugs](Documentation/reporting_bugs.md) for details about reportin
### License
etcd is under the Apache 2.0 license. See the [LICENSE](LICENSE) file for details.

View File

@ -6,19 +6,18 @@ This document defines a high level roadmap for etcd development.
The dates below should not be considered authoritative, but rather indicative of the projected timeline of the project. The [milestones defined in GitHub](https://github.com/coreos/etcd/milestones) represent the most up-to-date and issue-for-issue plans.
etcd 3.0 is our current stable branch. The roadmap below outlines new features that will be added to etcd, and while subject to change, define what future stable will look like.
etcd 3.1 is our current stable branch. The roadmap below outlines new features that will be added to etcd, and while subject to change, define what future stable will look like.
### etcd 3.1 (2016-Oct)
- Stable L4 gateway
- Experimental support for scalable proxy
- Automatic leadership transfer for the rolling upgrade
- V3 API improvements
- Get previous key-value pair
- Get only keys (ignore values)
- Get only key count
### etcd 3.2 (2017-Feb)
### etcd 3.2 (2017-May)
- Stable scalable proxy
- JWT token based auth
- Proxy-as-client interface passthrough
- Lock service
- Namespacing proxy
- TLS Command Name and JWT token based authentication
- Read-modify-write V3 Put
- Improved watch performance
- ...
- Support non-blocking concurrent read
### etcd 3.3 (?)
- TBD

View File

@ -32,7 +32,9 @@ var _ = math.Inf
// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
const _ = proto.ProtoPackageIsVersion1
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion2 // please upgrade the proto package
type Permission_Type int32
@ -99,113 +101,113 @@ func init() {
proto.RegisterType((*Role)(nil), "authpb.Role")
proto.RegisterEnum("authpb.Permission_Type", Permission_Type_name, Permission_Type_value)
}
func (m *User) Marshal() (data []byte, err error) {
func (m *User) Marshal() (dAtA []byte, err error) {
size := m.Size()
data = make([]byte, size)
n, err := m.MarshalTo(data)
dAtA = make([]byte, size)
n, err := m.MarshalTo(dAtA)
if err != nil {
return nil, err
}
return data[:n], nil
return dAtA[:n], nil
}
func (m *User) MarshalTo(data []byte) (int, error) {
func (m *User) MarshalTo(dAtA []byte) (int, error) {
var i int
_ = i
var l int
_ = l
if len(m.Name) > 0 {
data[i] = 0xa
dAtA[i] = 0xa
i++
i = encodeVarintAuth(data, i, uint64(len(m.Name)))
i += copy(data[i:], m.Name)
i = encodeVarintAuth(dAtA, i, uint64(len(m.Name)))
i += copy(dAtA[i:], m.Name)
}
if len(m.Password) > 0 {
data[i] = 0x12
dAtA[i] = 0x12
i++
i = encodeVarintAuth(data, i, uint64(len(m.Password)))
i += copy(data[i:], m.Password)
i = encodeVarintAuth(dAtA, i, uint64(len(m.Password)))
i += copy(dAtA[i:], m.Password)
}
if len(m.Roles) > 0 {
for _, s := range m.Roles {
data[i] = 0x1a
dAtA[i] = 0x1a
i++
l = len(s)
for l >= 1<<7 {
data[i] = uint8(uint64(l)&0x7f | 0x80)
dAtA[i] = uint8(uint64(l)&0x7f | 0x80)
l >>= 7
i++
}
data[i] = uint8(l)
dAtA[i] = uint8(l)
i++
i += copy(data[i:], s)
i += copy(dAtA[i:], s)
}
}
return i, nil
}
func (m *Permission) Marshal() (data []byte, err error) {
func (m *Permission) Marshal() (dAtA []byte, err error) {
size := m.Size()
data = make([]byte, size)
n, err := m.MarshalTo(data)
dAtA = make([]byte, size)
n, err := m.MarshalTo(dAtA)
if err != nil {
return nil, err
}
return data[:n], nil
return dAtA[:n], nil
}
func (m *Permission) MarshalTo(data []byte) (int, error) {
func (m *Permission) MarshalTo(dAtA []byte) (int, error) {
var i int
_ = i
var l int
_ = l
if m.PermType != 0 {
data[i] = 0x8
dAtA[i] = 0x8
i++
i = encodeVarintAuth(data, i, uint64(m.PermType))
i = encodeVarintAuth(dAtA, i, uint64(m.PermType))
}
if len(m.Key) > 0 {
data[i] = 0x12
dAtA[i] = 0x12
i++
i = encodeVarintAuth(data, i, uint64(len(m.Key)))
i += copy(data[i:], m.Key)
i = encodeVarintAuth(dAtA, i, uint64(len(m.Key)))
i += copy(dAtA[i:], m.Key)
}
if len(m.RangeEnd) > 0 {
data[i] = 0x1a
dAtA[i] = 0x1a
i++
i = encodeVarintAuth(data, i, uint64(len(m.RangeEnd)))
i += copy(data[i:], m.RangeEnd)
i = encodeVarintAuth(dAtA, i, uint64(len(m.RangeEnd)))
i += copy(dAtA[i:], m.RangeEnd)
}
return i, nil
}
func (m *Role) Marshal() (data []byte, err error) {
func (m *Role) Marshal() (dAtA []byte, err error) {
size := m.Size()
data = make([]byte, size)
n, err := m.MarshalTo(data)
dAtA = make([]byte, size)
n, err := m.MarshalTo(dAtA)
if err != nil {
return nil, err
}
return data[:n], nil
return dAtA[:n], nil
}
func (m *Role) MarshalTo(data []byte) (int, error) {
func (m *Role) MarshalTo(dAtA []byte) (int, error) {
var i int
_ = i
var l int
_ = l
if len(m.Name) > 0 {
data[i] = 0xa
dAtA[i] = 0xa
i++
i = encodeVarintAuth(data, i, uint64(len(m.Name)))
i += copy(data[i:], m.Name)
i = encodeVarintAuth(dAtA, i, uint64(len(m.Name)))
i += copy(dAtA[i:], m.Name)
}
if len(m.KeyPermission) > 0 {
for _, msg := range m.KeyPermission {
data[i] = 0x12
dAtA[i] = 0x12
i++
i = encodeVarintAuth(data, i, uint64(msg.Size()))
n, err := msg.MarshalTo(data[i:])
i = encodeVarintAuth(dAtA, i, uint64(msg.Size()))
n, err := msg.MarshalTo(dAtA[i:])
if err != nil {
return 0, err
}
@ -215,31 +217,31 @@ func (m *Role) MarshalTo(data []byte) (int, error) {
return i, nil
}
func encodeFixed64Auth(data []byte, offset int, v uint64) int {
data[offset] = uint8(v)
data[offset+1] = uint8(v >> 8)
data[offset+2] = uint8(v >> 16)
data[offset+3] = uint8(v >> 24)
data[offset+4] = uint8(v >> 32)
data[offset+5] = uint8(v >> 40)
data[offset+6] = uint8(v >> 48)
data[offset+7] = uint8(v >> 56)
func encodeFixed64Auth(dAtA []byte, offset int, v uint64) int {
dAtA[offset] = uint8(v)
dAtA[offset+1] = uint8(v >> 8)
dAtA[offset+2] = uint8(v >> 16)
dAtA[offset+3] = uint8(v >> 24)
dAtA[offset+4] = uint8(v >> 32)
dAtA[offset+5] = uint8(v >> 40)
dAtA[offset+6] = uint8(v >> 48)
dAtA[offset+7] = uint8(v >> 56)
return offset + 8
}
func encodeFixed32Auth(data []byte, offset int, v uint32) int {
data[offset] = uint8(v)
data[offset+1] = uint8(v >> 8)
data[offset+2] = uint8(v >> 16)
data[offset+3] = uint8(v >> 24)
func encodeFixed32Auth(dAtA []byte, offset int, v uint32) int {
dAtA[offset] = uint8(v)
dAtA[offset+1] = uint8(v >> 8)
dAtA[offset+2] = uint8(v >> 16)
dAtA[offset+3] = uint8(v >> 24)
return offset + 4
}
func encodeVarintAuth(data []byte, offset int, v uint64) int {
func encodeVarintAuth(dAtA []byte, offset int, v uint64) int {
for v >= 1<<7 {
data[offset] = uint8(v&0x7f | 0x80)
dAtA[offset] = uint8(v&0x7f | 0x80)
v >>= 7
offset++
}
data[offset] = uint8(v)
dAtA[offset] = uint8(v)
return offset + 1
}
func (m *User) Size() (n int) {
@ -308,8 +310,8 @@ func sovAuth(x uint64) (n int) {
func sozAuth(x uint64) (n int) {
return sovAuth(uint64((x << 1) ^ uint64((int64(x) >> 63))))
}
func (m *User) Unmarshal(data []byte) error {
l := len(data)
func (m *User) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
preIndex := iNdEx
@ -321,7 +323,7 @@ func (m *User) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
wire |= (uint64(b) & 0x7F) << shift
if b < 0x80 {
@ -349,7 +351,7 @@ func (m *User) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
byteLen |= (int(b) & 0x7F) << shift
if b < 0x80 {
@ -363,7 +365,7 @@ func (m *User) Unmarshal(data []byte) error {
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Name = append(m.Name[:0], data[iNdEx:postIndex]...)
m.Name = append(m.Name[:0], dAtA[iNdEx:postIndex]...)
if m.Name == nil {
m.Name = []byte{}
}
@ -380,7 +382,7 @@ func (m *User) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
byteLen |= (int(b) & 0x7F) << shift
if b < 0x80 {
@ -394,7 +396,7 @@ func (m *User) Unmarshal(data []byte) error {
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Password = append(m.Password[:0], data[iNdEx:postIndex]...)
m.Password = append(m.Password[:0], dAtA[iNdEx:postIndex]...)
if m.Password == nil {
m.Password = []byte{}
}
@ -411,7 +413,7 @@ func (m *User) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
stringLen |= (uint64(b) & 0x7F) << shift
if b < 0x80 {
@ -426,11 +428,11 @@ func (m *User) Unmarshal(data []byte) error {
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Roles = append(m.Roles, string(data[iNdEx:postIndex]))
m.Roles = append(m.Roles, string(dAtA[iNdEx:postIndex]))
iNdEx = postIndex
default:
iNdEx = preIndex
skippy, err := skipAuth(data[iNdEx:])
skippy, err := skipAuth(dAtA[iNdEx:])
if err != nil {
return err
}
@ -449,8 +451,8 @@ func (m *User) Unmarshal(data []byte) error {
}
return nil
}
func (m *Permission) Unmarshal(data []byte) error {
l := len(data)
func (m *Permission) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
preIndex := iNdEx
@ -462,7 +464,7 @@ func (m *Permission) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
wire |= (uint64(b) & 0x7F) << shift
if b < 0x80 {
@ -490,7 +492,7 @@ func (m *Permission) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
m.PermType |= (Permission_Type(b) & 0x7F) << shift
if b < 0x80 {
@ -509,7 +511,7 @@ func (m *Permission) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
byteLen |= (int(b) & 0x7F) << shift
if b < 0x80 {
@ -523,7 +525,7 @@ func (m *Permission) Unmarshal(data []byte) error {
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Key = append(m.Key[:0], data[iNdEx:postIndex]...)
m.Key = append(m.Key[:0], dAtA[iNdEx:postIndex]...)
if m.Key == nil {
m.Key = []byte{}
}
@ -540,7 +542,7 @@ func (m *Permission) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
byteLen |= (int(b) & 0x7F) << shift
if b < 0x80 {
@ -554,14 +556,14 @@ func (m *Permission) Unmarshal(data []byte) error {
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.RangeEnd = append(m.RangeEnd[:0], data[iNdEx:postIndex]...)
m.RangeEnd = append(m.RangeEnd[:0], dAtA[iNdEx:postIndex]...)
if m.RangeEnd == nil {
m.RangeEnd = []byte{}
}
iNdEx = postIndex
default:
iNdEx = preIndex
skippy, err := skipAuth(data[iNdEx:])
skippy, err := skipAuth(dAtA[iNdEx:])
if err != nil {
return err
}
@ -580,8 +582,8 @@ func (m *Permission) Unmarshal(data []byte) error {
}
return nil
}
func (m *Role) Unmarshal(data []byte) error {
l := len(data)
func (m *Role) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
preIndex := iNdEx
@ -593,7 +595,7 @@ func (m *Role) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
wire |= (uint64(b) & 0x7F) << shift
if b < 0x80 {
@ -621,7 +623,7 @@ func (m *Role) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
byteLen |= (int(b) & 0x7F) << shift
if b < 0x80 {
@ -635,7 +637,7 @@ func (m *Role) Unmarshal(data []byte) error {
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Name = append(m.Name[:0], data[iNdEx:postIndex]...)
m.Name = append(m.Name[:0], dAtA[iNdEx:postIndex]...)
if m.Name == nil {
m.Name = []byte{}
}
@ -652,7 +654,7 @@ func (m *Role) Unmarshal(data []byte) error {
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
msglen |= (int(b) & 0x7F) << shift
if b < 0x80 {
@ -667,13 +669,13 @@ func (m *Role) Unmarshal(data []byte) error {
return io.ErrUnexpectedEOF
}
m.KeyPermission = append(m.KeyPermission, &Permission{})
if err := m.KeyPermission[len(m.KeyPermission)-1].Unmarshal(data[iNdEx:postIndex]); err != nil {
if err := m.KeyPermission[len(m.KeyPermission)-1].Unmarshal(dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
default:
iNdEx = preIndex
skippy, err := skipAuth(data[iNdEx:])
skippy, err := skipAuth(dAtA[iNdEx:])
if err != nil {
return err
}
@ -692,8 +694,8 @@ func (m *Role) Unmarshal(data []byte) error {
}
return nil
}
func skipAuth(data []byte) (n int, err error) {
l := len(data)
func skipAuth(dAtA []byte) (n int, err error) {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
var wire uint64
@ -704,7 +706,7 @@ func skipAuth(data []byte) (n int, err error) {
if iNdEx >= l {
return 0, io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
wire |= (uint64(b) & 0x7F) << shift
if b < 0x80 {
@ -722,7 +724,7 @@ func skipAuth(data []byte) (n int, err error) {
return 0, io.ErrUnexpectedEOF
}
iNdEx++
if data[iNdEx-1] < 0x80 {
if dAtA[iNdEx-1] < 0x80 {
break
}
}
@ -739,7 +741,7 @@ func skipAuth(data []byte) (n int, err error) {
if iNdEx >= l {
return 0, io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
length |= (int(b) & 0x7F) << shift
if b < 0x80 {
@ -762,7 +764,7 @@ func skipAuth(data []byte) (n int, err error) {
if iNdEx >= l {
return 0, io.ErrUnexpectedEOF
}
b := data[iNdEx]
b := dAtA[iNdEx]
iNdEx++
innerWire |= (uint64(b) & 0x7F) << shift
if b < 0x80 {
@ -773,7 +775,7 @@ func skipAuth(data []byte) (n int, err error) {
if innerWireType == 4 {
break
}
next, err := skipAuth(data[start:])
next, err := skipAuth(dAtA[start:])
if err != nil {
return 0, err
}
@ -797,6 +799,8 @@ var (
ErrIntOverflowAuth = fmt.Errorf("proto: integer overflow")
)
func init() { proto.RegisterFile("auth.proto", fileDescriptorAuth) }
var fileDescriptorAuth = []byte{
// 288 bytes of a gzipped FileDescriptorProto
0x1f, 0x8b, 0x08, 0x00, 0x00, 0x09, 0x6e, 0x88, 0x02, 0xff, 0x6c, 0x90, 0xc1, 0x4a, 0xc3, 0x30,

View File

@ -49,38 +49,30 @@ func isRangeEqual(a, b *rangePerm) bool {
// removeSubsetRangePerms removes any rangePerms that are subsets of other rangePerms.
// If there are equal ranges, removeSubsetRangePerms only keeps one of them.
func removeSubsetRangePerms(perms []*rangePerm) []*rangePerm {
// TODO(mitake): currently it is O(n^2), we need a better algorithm
var newp []*rangePerm
// It returns a sorted rangePerm slice.
func removeSubsetRangePerms(perms []*rangePerm) (newp []*rangePerm) {
sort.Sort(RangePermSliceByBegin(perms))
var prev *rangePerm
for i := range perms {
skip := false
for j := range perms {
if i == j {
continue
}
if isRangeEqual(perms[i], perms[j]) {
// if ranges are equal, we only keep the first range.
if i > j {
skip = true
break
}
} else if isSubset(perms[i], perms[j]) {
// if a range is a strict subset of the other one, we skip the subset.
skip = true
break
}
}
if skip {
if i == 0 {
prev = perms[i]
newp = append(newp, perms[i])
continue
}
if isRangeEqual(perms[i], prev) {
continue
}
if isSubset(perms[i], prev) {
continue
}
if isSubset(prev, perms[i]) {
prev = perms[i]
newp[len(newp)-1] = perms[i]
continue
}
prev = perms[i]
newp = append(newp, perms[i])
}
return newp
}
@ -88,7 +80,6 @@ func removeSubsetRangePerms(perms []*rangePerm) []*rangePerm {
func mergeRangePerms(perms []*rangePerm) []*rangePerm {
var merged []*rangePerm
perms = removeSubsetRangePerms(perms)
sort.Sort(RangePermSliceByBegin(perms))
i := 0
for i < len(perms) {

View File

@ -16,6 +16,8 @@ package auth
import (
"bytes"
"fmt"
"reflect"
"testing"
)
@ -131,3 +133,47 @@ func TestGetMergedPerms(t *testing.T) {
}
}
}
func TestRemoveSubsetRangePerms(t *testing.T) {
tests := []struct {
perms []*rangePerm
expect []*rangePerm
}{
{ // subsets converge
[]*rangePerm{{[]byte{2}, []byte{3}}, {[]byte{2}, []byte{5}}, {[]byte{1}, []byte{4}}},
[]*rangePerm{{[]byte{1}, []byte{4}}, {[]byte{2}, []byte{5}}},
},
{ // subsets converge
[]*rangePerm{{[]byte{0}, []byte{3}}, {[]byte{0}, []byte{1}}, {[]byte{2}, []byte{4}}, {[]byte{0}, []byte{2}}},
[]*rangePerm{{[]byte{0}, []byte{3}}, {[]byte{2}, []byte{4}}},
},
{ // biggest range at the end
[]*rangePerm{{[]byte{2}, []byte{3}}, {[]byte{0}, []byte{2}}, {[]byte{1}, []byte{4}}, {[]byte{0}, []byte{5}}},
[]*rangePerm{{[]byte{0}, []byte{5}}},
},
{ // biggest range at the beginning
[]*rangePerm{{[]byte{0}, []byte{5}}, {[]byte{2}, []byte{3}}, {[]byte{0}, []byte{2}}, {[]byte{1}, []byte{4}}},
[]*rangePerm{{[]byte{0}, []byte{5}}},
},
{ // no overlapping ranges
[]*rangePerm{{[]byte{2}, []byte{3}}, {[]byte{0}, []byte{1}}, {[]byte{4}, []byte{7}}, {[]byte{8}, []byte{15}}},
[]*rangePerm{{[]byte{0}, []byte{1}}, {[]byte{2}, []byte{3}}, {[]byte{4}, []byte{7}}, {[]byte{8}, []byte{15}}},
},
}
for i, tt := range tests {
rs := removeSubsetRangePerms(tt.perms)
if !reflect.DeepEqual(rs, tt.expect) {
t.Fatalf("#%d: unexpected rangePerms %q, got %q", i, printPerms(rs), printPerms(tt.expect))
}
}
}
func printPerms(rs []*rangePerm) (txt string) {
for i, p := range rs {
if i != 0 {
txt += ","
}
txt += fmt.Sprintf("%+v", *p)
}
return
}

View File

@ -21,6 +21,8 @@ import (
"crypto/rand"
"math/big"
"strings"
"sync"
"time"
)
const (
@ -28,6 +30,83 @@ const (
defaultSimpleTokenLength = 16
)
// var for testing purposes
var (
simpleTokenTTL = 5 * time.Minute
simpleTokenTTLResolution = 1 * time.Second
)
type simpleTokenTTLKeeper struct {
tokens map[string]time.Time
donec chan struct{}
stopc chan struct{}
deleteTokenFunc func(string)
mu *sync.Mutex
}
func (tm *simpleTokenTTLKeeper) stop() {
select {
case tm.stopc <- struct{}{}:
case <-tm.donec:
}
<-tm.donec
}
func (tm *simpleTokenTTLKeeper) addSimpleToken(token string) {
tm.tokens[token] = time.Now().Add(simpleTokenTTL)
}
func (tm *simpleTokenTTLKeeper) resetSimpleToken(token string) {
if _, ok := tm.tokens[token]; ok {
tm.tokens[token] = time.Now().Add(simpleTokenTTL)
}
}
func (tm *simpleTokenTTLKeeper) deleteSimpleToken(token string) {
delete(tm.tokens, token)
}
func (tm *simpleTokenTTLKeeper) run() {
tokenTicker := time.NewTicker(simpleTokenTTLResolution)
defer func() {
tokenTicker.Stop()
close(tm.donec)
}()
for {
select {
case <-tokenTicker.C:
nowtime := time.Now()
tm.mu.Lock()
for t, tokenendtime := range tm.tokens {
if nowtime.After(tokenendtime) {
tm.deleteTokenFunc(t)
delete(tm.tokens, t)
}
}
tm.mu.Unlock()
case <-tm.stopc:
return
}
}
}
func (as *authStore) enable() {
delf := func(tk string) {
if username, ok := as.simpleTokens[tk]; ok {
plog.Infof("deleting token %s for user %s", tk, username)
delete(as.simpleTokens, tk)
}
}
as.simpleTokenKeeper = &simpleTokenTTLKeeper{
tokens: make(map[string]time.Time),
donec: make(chan struct{}),
stopc: make(chan struct{}),
deleteTokenFunc: delf,
mu: &as.simpleTokensMu,
}
go as.simpleTokenKeeper.run()
}
func (as *authStore) GenSimpleToken() (string, error) {
ret := make([]byte, defaultSimpleTokenLength)
@ -45,23 +124,26 @@ func (as *authStore) GenSimpleToken() (string, error) {
func (as *authStore) assignSimpleTokenToUser(username, token string) {
as.simpleTokensMu.Lock()
_, ok := as.simpleTokens[token]
if ok {
plog.Panicf("token %s is alredy used", token)
}
as.simpleTokens[token] = username
as.simpleTokenKeeper.addSimpleToken(token)
as.simpleTokensMu.Unlock()
}
func (as *authStore) invalidateUser(username string) {
if as.simpleTokenKeeper == nil {
return
}
as.simpleTokensMu.Lock()
defer as.simpleTokensMu.Unlock()
for token, name := range as.simpleTokens {
if strings.Compare(name, username) == 0 {
delete(as.simpleTokens, token)
as.simpleTokenKeeper.deleteSimpleToken(token)
}
}
as.simpleTokensMu.Unlock()
}

View File

@ -20,6 +20,7 @@ import (
"errors"
"fmt"
"sort"
"strconv"
"strings"
"sync"
@ -29,6 +30,7 @@ import (
"github.com/coreos/pkg/capnslog"
"golang.org/x/crypto/bcrypt"
"golang.org/x/net/context"
"google.golang.org/grpc/metadata"
)
var (
@ -47,6 +49,7 @@ var (
ErrRootUserNotExist = errors.New("auth: root user does not exist")
ErrRootRoleNotExist = errors.New("auth: root user does not have root role")
ErrUserAlreadyExist = errors.New("auth: user already exists")
ErrUserEmpty = errors.New("auth: user name is empty")
ErrUserNotFound = errors.New("auth: user not found")
ErrRoleAlreadyExist = errors.New("auth: role already exists")
ErrRoleNotFound = errors.New("auth: role not found")
@ -56,6 +59,7 @@ var (
ErrPermissionNotGranted = errors.New("auth: permission is not granted to the role")
ErrAuthNotEnabled = errors.New("auth: authentication is not enabled")
ErrAuthOldRevision = errors.New("auth: revision in header is old")
ErrInvalidAuthToken = errors.New("auth: invalid auth token")
// BcryptCost is the algorithm cost / strength for hashing auth passwords
BcryptCost = bcrypt.DefaultCost
@ -146,6 +150,15 @@ type AuthStore interface {
// Revision gets current revision of authStore
Revision() uint64
// CheckPassword checks a given pair of username and password is correct
CheckPassword(username, password string) (uint64, error)
// Close does cleanup of AuthStore
Close() error
// AuthInfoFromCtx gets AuthInfo from gRPC's context
AuthInfoFromCtx(ctx context.Context) (*AuthInfo, error)
}
type authStore struct {
@ -155,13 +168,33 @@ type authStore struct {
rangePermCache map[string]*unifiedRangePermissions // username -> unifiedRangePermissions
simpleTokensMu sync.RWMutex
simpleTokens map[string]string // token -> username
revision uint64
// tokenSimple in v3.2+
indexWaiter func(uint64) <-chan struct{}
simpleTokenKeeper *simpleTokenTTLKeeper
simpleTokensMu sync.Mutex
simpleTokens map[string]string // token -> username
}
func newDeleterFunc(as *authStore) func(string) {
return func(t string) {
as.simpleTokensMu.Lock()
defer as.simpleTokensMu.Unlock()
if username, ok := as.simpleTokens[t]; ok {
plog.Infof("deleting token %s for user %s", t, username)
delete(as.simpleTokens, t)
}
}
}
func (as *authStore) AuthEnable() error {
as.enabledMu.Lock()
defer as.enabledMu.Unlock()
if as.enabled {
plog.Noticef("Authentication already enabled")
return nil
}
b := as.be
tx := b.BatchTx()
tx.Lock()
@ -181,9 +214,8 @@ func (as *authStore) AuthEnable() error {
tx.UnsafePut(authBucketName, enableFlagKey, authEnabled)
as.enabledMu.Lock()
as.enabled = true
as.enabledMu.Unlock()
as.enable()
as.rangePermCache = make(map[string]*unifiedRangePermissions)
@ -195,6 +227,11 @@ func (as *authStore) AuthEnable() error {
}
func (as *authStore) AuthDisable() {
as.enabledMu.Lock()
defer as.enabledMu.Unlock()
if !as.enabled {
return
}
b := as.be
tx := b.BatchTx()
tx.Lock()
@ -203,17 +240,33 @@ func (as *authStore) AuthDisable() {
tx.Unlock()
b.ForceCommit()
as.enabledMu.Lock()
as.enabled = false
as.enabledMu.Unlock()
as.simpleTokensMu.Lock()
tk := as.simpleTokenKeeper
as.simpleTokenKeeper = nil
as.simpleTokens = make(map[string]string) // invalidate all tokens
as.simpleTokensMu.Unlock()
if tk != nil {
tk.stop()
}
plog.Noticef("Authentication disabled")
}
func (as *authStore) Close() error {
as.enabledMu.Lock()
defer as.enabledMu.Unlock()
if !as.enabled {
return nil
}
if as.simpleTokenKeeper != nil {
as.simpleTokenKeeper.stop()
as.simpleTokenKeeper = nil
}
return nil
}
func (as *authStore) Authenticate(ctx context.Context, username, password string) (*pb.AuthenticateResponse, error) {
if !as.isAuthEnabled() {
return nil, ErrAuthNotEnabled
@ -232,11 +285,6 @@ func (as *authStore) Authenticate(ctx context.Context, username, password string
return nil, ErrAuthFailed
}
if bcrypt.CompareHashAndPassword(user.Password, []byte(password)) != nil {
plog.Noticef("authentication failed, invalid password for user %s", username)
return &pb.AuthenticateResponse{}, ErrAuthFailed
}
token := fmt.Sprintf("%s.%d", simpleToken, index)
as.assignSimpleTokenToUser(username, token)
@ -244,6 +292,24 @@ func (as *authStore) Authenticate(ctx context.Context, username, password string
return &pb.AuthenticateResponse{Token: token}, nil
}
func (as *authStore) CheckPassword(username, password string) (uint64, error) {
tx := as.be.BatchTx()
tx.Lock()
defer tx.Unlock()
user := getUser(tx, username)
if user == nil {
return 0, ErrAuthFailed
}
if bcrypt.CompareHashAndPassword(user.Password, []byte(password)) != nil {
plog.Noticef("authentication failed, invalid password for user %s", username)
return 0, ErrAuthFailed
}
return getRevision(tx), nil
}
func (as *authStore) Recover(be backend.Backend) {
enabled := false
as.be = be
@ -266,6 +332,10 @@ func (as *authStore) Recover(be backend.Backend) {
}
func (as *authStore) UserAdd(r *pb.AuthUserAddRequest) (*pb.AuthUserAddResponse, error) {
if len(r.Name) == 0 {
return nil, ErrUserEmpty
}
hashed, err := bcrypt.GenerateFromPassword([]byte(r.Password), BcryptCost)
if err != nil {
plog.Errorf("failed to hash password: %s", err)
@ -400,11 +470,7 @@ func (as *authStore) UserGet(r *pb.AuthUserGetRequest) (*pb.AuthUserGetResponse,
if user == nil {
return nil, ErrUserNotFound
}
for _, role := range user.Roles {
resp.Roles = append(resp.Roles, role)
}
resp.Roles = append(resp.Roles, user.Roles...)
return &resp, nil
}
@ -470,11 +536,7 @@ func (as *authStore) RoleGet(r *pb.AuthRoleGetRequest) (*pb.AuthRoleGetResponse,
if role == nil {
return nil, ErrRoleNotFound
}
for _, perm := range role.KeyPermission {
resp.Perm = append(resp.Perm, perm)
}
resp.Perm = append(resp.Perm, role.KeyPermission...)
return &resp, nil
}
@ -584,10 +646,14 @@ func (as *authStore) RoleAdd(r *pb.AuthRoleAddRequest) (*pb.AuthRoleAddResponse,
}
func (as *authStore) AuthInfoFromToken(token string) (*AuthInfo, bool) {
as.simpleTokensMu.RLock()
defer as.simpleTokensMu.RUnlock()
t, ok := as.simpleTokens[token]
return &AuthInfo{Username: t, Revision: as.revision}, ok
// same as '(t *tokenSimple) info' in v3.2+
as.simpleTokensMu.Lock()
username, ok := as.simpleTokens[token]
if ok && as.simpleTokenKeeper != nil {
as.simpleTokenKeeper.resetSimpleToken(token)
}
as.simpleTokensMu.Unlock()
return &AuthInfo{Username: username, Revision: as.revision}, ok
}
type permSlice []*authpb.Permission
@ -652,6 +718,11 @@ func (as *authStore) isOpPermitted(userName string, revision uint64, key, rangeE
return nil
}
// only gets rev == 0 when passed AuthInfo{}; no user given
if revision == 0 {
return ErrUserEmpty
}
if revision < as.revision {
return ErrAuthOldRevision
}
@ -694,6 +765,9 @@ func (as *authStore) IsAdminPermitted(authInfo *AuthInfo) error {
if !as.isAuthEnabled() {
return nil
}
if authInfo == nil {
return ErrUserEmpty
}
tx := as.be.BatchTx()
tx.Lock()
@ -812,7 +886,7 @@ func (as *authStore) isAuthEnabled() bool {
return as.enabled
}
func NewAuthStore(be backend.Backend) *authStore {
func NewAuthStore(be backend.Backend, indexWaiter func(uint64) <-chan struct{}) *authStore {
tx := be.BatchTx()
tx.Lock()
@ -820,13 +894,30 @@ func NewAuthStore(be backend.Backend) *authStore {
tx.UnsafeCreateBucket(authUsersBucketName)
tx.UnsafeCreateBucket(authRolesBucketName)
as := &authStore{
be: be,
simpleTokens: make(map[string]string),
revision: 0,
enabled := false
_, vs := tx.UnsafeRange(authBucketName, enableFlagKey, nil, 0)
if len(vs) == 1 {
if bytes.Equal(vs[0], authEnabled) {
enabled = true
}
}
as.commitRevision(tx)
as := &authStore{
be: be,
simpleTokens: make(map[string]string),
revision: getRevision(tx),
indexWaiter: indexWaiter,
enabled: enabled,
rangePermCache: make(map[string]*unifiedRangePermissions),
}
if enabled {
as.enable()
}
if as.revision == 0 {
as.commitRevision(tx)
}
tx.Unlock()
be.ForceCommit()
@ -853,7 +944,8 @@ func (as *authStore) commitRevision(tx backend.BatchTx) {
func getRevision(tx backend.BatchTx) uint64 {
_, vs := tx.UnsafeRange(authBucketName, []byte(revisionKey), nil, 0)
if len(vs) != 1 {
plog.Panicf("failed to get the key of auth store revision")
// this can happen in the initialization phase
return 0
}
return binary.BigEndian.Uint64(vs[0])
@ -862,3 +954,46 @@ func getRevision(tx backend.BatchTx) uint64 {
func (as *authStore) Revision() uint64 {
return as.revision
}
func (as *authStore) isValidSimpleToken(token string, ctx context.Context) bool {
splitted := strings.Split(token, ".")
if len(splitted) != 2 {
return false
}
index, err := strconv.Atoi(splitted[1])
if err != nil {
return false
}
select {
case <-as.indexWaiter(uint64(index)):
return true
case <-ctx.Done():
}
return false
}
func (as *authStore) AuthInfoFromCtx(ctx context.Context) (*AuthInfo, error) {
md, ok := metadata.FromContext(ctx)
if !ok {
return nil, nil
}
ts, tok := md["token"]
if !tok {
return nil, nil
}
token := ts[0]
if !as.isValidSimpleToken(token, ctx) {
return nil, ErrInvalidAuthToken
}
authInfo, uok := as.AuthInfoFromToken(token)
if !uok {
plog.Warningf("invalid auth token: %s", token)
return nil, ErrInvalidAuthToken
}
return authInfo, nil
}

View File

@ -26,25 +26,38 @@ import (
func init() { BcryptCost = bcrypt.MinCost }
func TestUserAdd(t *testing.T) {
b, tPath := backend.NewDefaultTmpBackend()
defer func() {
b.Close()
os.Remove(tPath)
func dummyIndexWaiter(index uint64) <-chan struct{} {
ch := make(chan struct{})
go func() {
ch <- struct{}{}
}()
return ch
}
as := NewAuthStore(b)
ua := &pb.AuthUserAddRequest{Name: "foo"}
_, err := as.UserAdd(ua) // add a non-existing user
// TestNewAuthStoreRevision ensures newly auth store
// keeps the old revision when there are no changes.
func TestNewAuthStoreRevision(t *testing.T) {
b, tPath := backend.NewDefaultTmpBackend()
defer os.Remove(tPath)
as := NewAuthStore(b, dummyIndexWaiter)
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
}
_, err = as.UserAdd(ua) // add an existing user
if err == nil {
t.Fatalf("expected %v, got %v", ErrUserAlreadyExist, err)
}
if err != ErrUserAlreadyExist {
t.Fatalf("expected %v, got %v", ErrUserAlreadyExist, err)
old := as.Revision()
b.Close()
as.Close()
// no changes to commit
b2 := backend.NewDefaultBackend(tPath)
as = NewAuthStore(b2, dummyIndexWaiter)
new := as.Revision()
b2.Close()
as.Close()
if old != new {
t.Fatalf("expected revision %d, got %d", old, new)
}
}
@ -67,14 +80,15 @@ func enableAuthAndCreateRoot(as *authStore) error {
return as.AuthEnable()
}
func TestAuthenticate(t *testing.T) {
func TestCheckPassword(t *testing.T) {
b, tPath := backend.NewDefaultTmpBackend()
defer func() {
b.Close()
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
@ -87,8 +101,7 @@ func TestAuthenticate(t *testing.T) {
}
// auth a non-existing user
ctx1 := context.WithValue(context.WithValue(context.TODO(), "index", uint64(1)), "simpleToken", "dummy")
_, err = as.Authenticate(ctx1, "foo-test", "bar")
_, err = as.CheckPassword("foo-test", "bar")
if err == nil {
t.Fatalf("expected %v, got %v", ErrAuthFailed, err)
}
@ -97,15 +110,13 @@ func TestAuthenticate(t *testing.T) {
}
// auth an existing user with correct password
ctx2 := context.WithValue(context.WithValue(context.TODO(), "index", uint64(2)), "simpleToken", "dummy")
_, err = as.Authenticate(ctx2, "foo", "bar")
_, err = as.CheckPassword("foo", "bar")
if err != nil {
t.Fatal(err)
}
// auth an existing user but with wrong password
ctx3 := context.WithValue(context.WithValue(context.TODO(), "index", uint64(3)), "simpleToken", "dummy")
_, err = as.Authenticate(ctx3, "foo", "")
_, err = as.CheckPassword("foo", "")
if err == nil {
t.Fatalf("expected %v, got %v", ErrAuthFailed, err)
}
@ -121,7 +132,8 @@ func TestUserDelete(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
@ -157,7 +169,8 @@ func TestUserChangePassword(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
@ -202,7 +215,8 @@ func TestRoleAdd(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
@ -222,7 +236,8 @@ func TestUserGrant(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
@ -253,4 +268,93 @@ func TestUserGrant(t *testing.T) {
if err != ErrUserNotFound {
t.Fatalf("expected %v, got %v", ErrUserNotFound, err)
}
// non-admin user
err = as.IsAdminPermitted(&AuthInfo{Username: "foo", Revision: 1})
if err != ErrPermissionDenied {
t.Errorf("expected %v, got %v", ErrPermissionDenied, err)
}
// disabled auth should return nil
as.AuthDisable()
err = as.IsAdminPermitted(&AuthInfo{Username: "root", Revision: 1})
if err != nil {
t.Errorf("expected nil, got %v", err)
}
}
func TestRecoverFromSnapshot(t *testing.T) {
as, _ := setupAuthStore(t)
ua := &pb.AuthUserAddRequest{Name: "foo"}
_, err := as.UserAdd(ua) // add an existing user
if err == nil {
t.Fatalf("expected %v, got %v", ErrUserAlreadyExist, err)
}
if err != ErrUserAlreadyExist {
t.Fatalf("expected %v, got %v", ErrUserAlreadyExist, err)
}
ua = &pb.AuthUserAddRequest{Name: ""}
_, err = as.UserAdd(ua) // add a user with empty name
if err != ErrUserEmpty {
t.Fatal(err)
}
as.Close()
as2 := NewAuthStore(as.be, dummyIndexWaiter)
defer func(a *authStore) {
a.Close()
}(as2)
if !as2.isAuthEnabled() {
t.Fatal("recovering authStore from existing backend failed")
}
ul, err := as.UserList(&pb.AuthUserListRequest{})
if err != nil {
t.Fatal(err)
}
if !contains(ul.Users, "root") {
t.Errorf("expected %v in %v", "root", ul.Users)
}
}
func contains(array []string, str string) bool {
for _, s := range array {
if s == str {
return true
}
}
return false
}
func setupAuthStore(t *testing.T) (store *authStore, teardownfunc func(t *testing.T)) {
b, tPath := backend.NewDefaultTmpBackend()
as := NewAuthStore(b, dummyIndexWaiter)
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
}
// adds a new role
_, err = as.RoleAdd(&pb.AuthRoleAddRequest{Name: "role-test"})
if err != nil {
t.Fatal(err)
}
ua := &pb.AuthUserAddRequest{Name: "foo", Password: "bar"}
_, err = as.UserAdd(ua) // add a non-existing user
if err != nil {
t.Fatal(err)
}
tearDown := func(t *testing.T) {
b.Close()
os.Remove(tPath)
as.Close()
}
return as, tearDown
}

5
build
View File

@ -11,8 +11,7 @@ if [ ! -z "$FAILPOINTS" ]; then
GIT_SHA="$GIT_SHA"-FAILPOINTS
fi
# Set GO_LDFLAGS="" for building with all symbols for debugging.
if [ -z "${GO_LDFLAGS+x}" ]; then GO_LDFLAGS="-s"; fi
# Set GO_LDFLAGS="-s" for building without symbols for debugging.
GO_LDFLAGS="$GO_LDFLAGS -X ${REPO_PATH}/cmd/vendor/${REPO_PATH}/version.GitSHA=${GIT_SHA}"
# enable/disable failpoints
@ -49,7 +48,7 @@ etcd_setup_gopath() {
GOPATH=":$GOPATH"
fi
export GOPATH=${etcdGOPATH}$GOPATH
rm -f ${etcdGOPATH}/src
rm -rf ${etcdGOPATH}/src
mkdir -p ${etcdGOPATH}
ln -s ${CDIR}/cmd/vendor ${etcdGOPATH}/src
}

View File

@ -1,16 +1,18 @@
$ORG_PATH="github.com/coreos"
$REPO_PATH="$ORG_PATH/etcd"
$PWD = $((Get-Item -Path ".\" -Verbose).FullName)
$GO_LDFLAGS="-s"
$FSROOT = $((Get-Location).Drive.Name+":")
$FSYS = $((Get-WMIObject win32_logicaldisk -filter "DeviceID = '$FSROOT'").filesystem)
# Set $Env:GO_LDFLAGS=" "(space) for building with all symbols for debugging.
if ($Env:GO_LDFLAGS.length -gt 0) {
$GO_LDFLAGS=$Env:GO_LDFLAGS
if ($FSYS.StartsWith("FAT","CurrentCultureIgnoreCase")) {
echo "Error: Cannot build etcd using the $FSYS filesystem (use NTFS instead)"
exit 1
}
$GO_LDFLAGS="$GO_LDFLAGS -X $REPO_PATH/cmd/vendor/$REPO_PATH/version.GitSHA=$GIT_SHA"
# Set $Env:GO_LDFLAGS="-s" for building without symbols.
$GO_LDFLAGS="$Env:GO_LDFLAGS -X $REPO_PATH/cmd/vendor/$REPO_PATH/version.GitSHA=$GIT_SHA"
# rebuild symlinks
echo "Rebuilding symlinks"
git ls-files -s cmd | select-string -pattern 120000 | ForEach {
$l = $_.ToString()
$lnkname = $l.Split(' ')[1]
@ -20,27 +22,54 @@ git ls-files -s cmd | select-string -pattern 120000 | ForEach {
$terms = $lnkname.Split("\")
$dirname = $terms[0..($terms.length-2)] -join "\"
$lnkname = "$PWD\$lnkname"
$targetAbs = "$((Get-Item -Path "$dirname\$target").FullName)"
$targetAbs = $targetAbs.Replace("/", "\")
if (test-path -pathtype container "$targetAbs") {
# rd so deleting junction doesn't take files with it
cmd /c rd "$lnkname"
cmd /c del /A /F "$lnkname"
cmd /c mklink /J "$lnkname" "$targetAbs"
if (Test-Path "$lnkname") {
if ((Get-Item "$lnkname") -is [System.IO.DirectoryInfo]) {
# rd so deleting junction doesn't take files with it
cmd /c rd "$lnkname"
}
}
if (Test-Path "$lnkname") {
if (!((Get-Item "$lnkname") -is [System.IO.DirectoryInfo])) {
cmd /c del /A /F "$lnkname"
}
}
cmd /c mklink /J "$lnkname" "$targetAbs" ">NUL"
} else {
cmd /c del /A /F "$lnkname"
cmd /c mklink /H "$lnkname" "$targetAbs"
# Remove file with symlink data (first run)
if (Test-Path "$lnkname") {
cmd /c del /A /F "$lnkname"
}
cmd /c mklink /H "$lnkname" "$targetAbs" ">NUL"
}
}
if (-not $env:GOPATH) {
$orgpath="$PWD\gopath\src\" + $ORG_PATH.Replace("/", "\")
cmd /c rd "$orgpath\etcd"
cmd /c del "$orgpath"
cmd /c mkdir "$orgpath"
cmd /c mklink /J "$orgpath\etcd" "$PWD"
if (Test-Path "$orgpath\etcd") {
if ((Get-Item "$orgpath\etcd") -is [System.IO.DirectoryInfo]) {
# rd so deleting junction doesn't take files with it
cmd /c rd "$orgpath\etcd"
}
}
if (Test-Path "$orgpath") {
if ((Get-Item "$orgpath") -is [System.IO.DirectoryInfo]) {
# rd so deleting junction doesn't take files with it
cmd /c rd "$orgpath"
}
}
if (Test-Path "$orgpath") {
if (!((Get-Item "$orgpath") -is [System.IO.DirectoryInfo])) {
# Remove file with symlink data (first run)
cmd /c del /A /F "$orgpath"
}
}
cmd /c mkdir "$orgpath"
cmd /c mklink /J "$orgpath\etcd" "$PWD" ">NUL"
$env:GOPATH = "$PWD\gopath"
}

View File

@ -114,4 +114,4 @@ if err != nil {
3. Default etcd/client cannot handle the case that the remote server is SIGSTOPed now. TCP keepalive mechanism doesn't help in this scenario because operating system may still send TCP keep-alive packets. Over time we'd like to improve this functionality, but solving this issue isn't high priority because a real-life case in which a server is stopped, but the connection is kept alive, hasn't been brought to our attention.
4. etcd/client cannot detect whether the member in use is healthy when doing read requests. If the member is isolated from the cluster, etcd/client may retrieve outdated data. As a workaround, users could monitor experimental /health endpoint for member healthy information. We are improving it at [#3265](https://github.com/coreos/etcd/issues/3265).
4. etcd/client cannot detect whether a member is healthy with watches and non-quorum read requests. If the member is isolated from the cluster, etcd/client may retrieve outdated data. Instead, users can either issue quorum read requests or monitor the /health endpoint for member health information.

View File

@ -22,7 +22,6 @@ import (
"net"
"net/http"
"net/url"
"reflect"
"sort"
"strconv"
"sync"
@ -261,53 +260,67 @@ type httpClusterClient struct {
selectionMode EndpointSelectionMode
}
func (c *httpClusterClient) getLeaderEndpoint() (string, error) {
mAPI := NewMembersAPI(c)
leader, err := mAPI.Leader(context.Background())
func (c *httpClusterClient) getLeaderEndpoint(ctx context.Context, eps []url.URL) (string, error) {
ceps := make([]url.URL, len(eps))
copy(ceps, eps)
// To perform a lookup on the new endpoint list without using the current
// client, we'll copy it
clientCopy := &httpClusterClient{
clientFactory: c.clientFactory,
credentials: c.credentials,
rand: c.rand,
pinned: 0,
endpoints: ceps,
}
mAPI := NewMembersAPI(clientCopy)
leader, err := mAPI.Leader(ctx)
if err != nil {
return "", err
}
if len(leader.ClientURLs) == 0 {
return "", ErrNoLeaderEndpoint
}
return leader.ClientURLs[0], nil // TODO: how to handle multiple client URLs?
}
func (c *httpClusterClient) SetEndpoints(eps []string) error {
func (c *httpClusterClient) parseEndpoints(eps []string) ([]url.URL, error) {
if len(eps) == 0 {
return ErrNoEndpoints
return []url.URL{}, ErrNoEndpoints
}
neps := make([]url.URL, len(eps))
for i, ep := range eps {
u, err := url.Parse(ep)
if err != nil {
return err
return []url.URL{}, err
}
neps[i] = *u
}
return neps, nil
}
switch c.selectionMode {
case EndpointSelectionRandom:
c.endpoints = shuffleEndpoints(c.rand, neps)
c.pinned = 0
case EndpointSelectionPrioritizeLeader:
c.endpoints = neps
lep, err := c.getLeaderEndpoint()
if err != nil {
return ErrNoLeaderEndpoint
}
for i := range c.endpoints {
if c.endpoints[i].String() == lep {
c.pinned = i
break
}
}
// If endpoints doesn't have the lu, just keep c.pinned = 0.
// Forwarding between follower and leader would be required but it works.
default:
return fmt.Errorf("invalid endpoint selection mode: %d", c.selectionMode)
func (c *httpClusterClient) SetEndpoints(eps []string) error {
neps, err := c.parseEndpoints(eps)
if err != nil {
return err
}
c.Lock()
defer c.Unlock()
c.endpoints = shuffleEndpoints(c.rand, neps)
// We're not doing anything for PrioritizeLeader here. This is
// due to not having a context meaning we can't call getLeaderEndpoint
// However, if you're using PrioritizeLeader, you've already been told
// to regularly call sync, where we do have a ctx, and can figure the
// leader. PrioritizeLeader is also quite a loose guarantee, so deal
// with it
c.pinned = 0
return nil
}
@ -401,27 +414,51 @@ func (c *httpClusterClient) Sync(ctx context.Context) error {
return err
}
c.Lock()
defer c.Unlock()
var eps []string
for _, m := range ms {
eps = append(eps, m.ClientURLs...)
}
sort.Sort(sort.StringSlice(eps))
ceps := make([]string, len(c.endpoints))
for i, cep := range c.endpoints {
ceps[i] = cep.String()
}
sort.Sort(sort.StringSlice(ceps))
// fast path if no change happens
// this helps client to pin the endpoint when no cluster change
if reflect.DeepEqual(eps, ceps) {
return nil
neps, err := c.parseEndpoints(eps)
if err != nil {
return err
}
return c.SetEndpoints(eps)
npin := 0
switch c.selectionMode {
case EndpointSelectionRandom:
c.RLock()
eq := endpointsEqual(c.endpoints, neps)
c.RUnlock()
if eq {
return nil
}
// When items in the endpoint list changes, we choose a new pin
neps = shuffleEndpoints(c.rand, neps)
case EndpointSelectionPrioritizeLeader:
nle, err := c.getLeaderEndpoint(ctx, neps)
if err != nil {
return ErrNoLeaderEndpoint
}
for i, n := range neps {
if n.String() == nle {
npin = i
break
}
}
default:
return fmt.Errorf("invalid endpoint selection mode: %d", c.selectionMode)
}
c.Lock()
defer c.Unlock()
c.endpoints = neps
c.pinned = npin
return nil
}
func (c *httpClusterClient) AutoSync(ctx context.Context, interval time.Duration) error {
@ -607,3 +644,27 @@ func shuffleEndpoints(r *rand.Rand, eps []url.URL) []url.URL {
}
return neps
}
func endpointsEqual(left, right []url.URL) bool {
if len(left) != len(right) {
return false
}
sLeft := make([]string, len(left))
sRight := make([]string, len(right))
for i, l := range left {
sLeft[i] = l.String()
}
for i, r := range right {
sRight[i] = r.String()
}
sort.Strings(sLeft)
sort.Strings(sRight)
for i := range sLeft {
if sLeft[i] != sRight[i] {
return false
}
}
return true
}

View File

@ -900,6 +900,90 @@ func TestHTTPClusterClientSyncPinEndpoint(t *testing.T) {
}
}
// TestHTTPClusterClientSyncUnpinEndpoint tests that Sync() unpins the endpoint when
// it gets a different member list than before.
func TestHTTPClusterClientSyncUnpinEndpoint(t *testing.T) {
cf := newStaticHTTPClientFactory([]staticHTTPResponse{
{
resp: http.Response{StatusCode: http.StatusOK, Header: http.Header{"Content-Type": []string{"application/json"}}},
body: []byte(`{"members":[{"id":"2745e2525fce8fe","peerURLs":["http://127.0.0.1:7003"],"name":"node3","clientURLs":["http://127.0.0.1:4003"]},{"id":"42134f434382925","peerURLs":["http://127.0.0.1:2380","http://127.0.0.1:7001"],"name":"node1","clientURLs":["http://127.0.0.1:2379","http://127.0.0.1:4001"]},{"id":"94088180e21eb87b","peerURLs":["http://127.0.0.1:7002"],"name":"node2","clientURLs":["http://127.0.0.1:4002"]}]}`),
},
{
resp: http.Response{StatusCode: http.StatusOK, Header: http.Header{"Content-Type": []string{"application/json"}}},
body: []byte(`{"members":[{"id":"42134f434382925","peerURLs":["http://127.0.0.1:2380","http://127.0.0.1:7001"],"name":"node1","clientURLs":["http://127.0.0.1:2379","http://127.0.0.1:4001"]},{"id":"94088180e21eb87b","peerURLs":["http://127.0.0.1:7002"],"name":"node2","clientURLs":["http://127.0.0.1:4002"]}]}`),
},
{
resp: http.Response{StatusCode: http.StatusOK, Header: http.Header{"Content-Type": []string{"application/json"}}},
body: []byte(`{"members":[{"id":"2745e2525fce8fe","peerURLs":["http://127.0.0.1:7003"],"name":"node3","clientURLs":["http://127.0.0.1:4003"]},{"id":"42134f434382925","peerURLs":["http://127.0.0.1:2380","http://127.0.0.1:7001"],"name":"node1","clientURLs":["http://127.0.0.1:2379","http://127.0.0.1:4001"]},{"id":"94088180e21eb87b","peerURLs":["http://127.0.0.1:7002"],"name":"node2","clientURLs":["http://127.0.0.1:4002"]}]}`),
},
})
hc := &httpClusterClient{
clientFactory: cf,
rand: rand.New(rand.NewSource(0)),
}
err := hc.SetEndpoints([]string{"http://127.0.0.1:4003", "http://127.0.0.1:2379", "http://127.0.0.1:4001", "http://127.0.0.1:4002"})
if err != nil {
t.Fatalf("unexpected error during setup: %#v", err)
}
wants := []string{"http://127.0.0.1:2379", "http://127.0.0.1:4001", "http://127.0.0.1:4002"}
for i := 0; i < 3; i++ {
err = hc.Sync(context.Background())
if err != nil {
t.Fatalf("#%d: unexpected error during Sync: %#v", i, err)
}
if g := hc.endpoints[hc.pinned]; g.String() != wants[i] {
t.Errorf("#%d: pinned endpoint = %v, want %v", i, g, wants[i])
}
}
}
// TestHTTPClusterClientSyncPinLeaderEndpoint tests that Sync() pins the leader
// when the selection mode is EndpointSelectionPrioritizeLeader
func TestHTTPClusterClientSyncPinLeaderEndpoint(t *testing.T) {
cf := newStaticHTTPClientFactory([]staticHTTPResponse{
{
resp: http.Response{StatusCode: http.StatusOK, Header: http.Header{"Content-Type": []string{"application/json"}}},
body: []byte(`{"members":[{"id":"2745e2525fce8fe","peerURLs":["http://127.0.0.1:7003"],"name":"node3","clientURLs":["http://127.0.0.1:4003"]},{"id":"42134f434382925","peerURLs":["http://127.0.0.1:2380","http://127.0.0.1:7001"],"name":"node1","clientURLs":["http://127.0.0.1:2379","http://127.0.0.1:4001"]},{"id":"94088180e21eb87b","peerURLs":["http://127.0.0.1:7002"],"name":"node2","clientURLs":["http://127.0.0.1:4002"]}]}`),
},
{
resp: http.Response{StatusCode: http.StatusOK, Header: http.Header{"Content-Type": []string{"application/json"}}},
body: []byte(`{"id":"2745e2525fce8fe","peerURLs":["http://127.0.0.1:7003"],"name":"node3","clientURLs":["http://127.0.0.1:4003"]}`),
},
{
resp: http.Response{StatusCode: http.StatusOK, Header: http.Header{"Content-Type": []string{"application/json"}}},
body: []byte(`{"members":[{"id":"2745e2525fce8fe","peerURLs":["http://127.0.0.1:7003"],"name":"node3","clientURLs":["http://127.0.0.1:4003"]},{"id":"42134f434382925","peerURLs":["http://127.0.0.1:2380","http://127.0.0.1:7001"],"name":"node1","clientURLs":["http://127.0.0.1:2379","http://127.0.0.1:4001"]},{"id":"94088180e21eb87b","peerURLs":["http://127.0.0.1:7002"],"name":"node2","clientURLs":["http://127.0.0.1:4002"]}]}`),
},
{
resp: http.Response{StatusCode: http.StatusOK, Header: http.Header{"Content-Type": []string{"application/json"}}},
body: []byte(`{"id":"94088180e21eb87b","peerURLs":["http://127.0.0.1:7002"],"name":"node2","clientURLs":["http://127.0.0.1:4002"]}`),
},
})
hc := &httpClusterClient{
clientFactory: cf,
rand: rand.New(rand.NewSource(0)),
selectionMode: EndpointSelectionPrioritizeLeader,
endpoints: []url.URL{{}}, // Need somewhere to pretend to send to initially
}
wants := []string{"http://127.0.0.1:4003", "http://127.0.0.1:4002"}
for i, want := range wants {
err := hc.Sync(context.Background())
if err != nil {
t.Fatalf("#%d: unexpected error during Sync: %#v", i, err)
}
pinned := hc.endpoints[hc.pinned].String()
if pinned != want {
t.Errorf("#%d: pinned endpoint = %v, want %v", i, pinned, want)
}
}
}
func TestHTTPClusterClientResetFail(t *testing.T) {
tests := [][]string{
// need at least one endpoint

View File

@ -34,7 +34,7 @@ import (
func TestV2NoRetryEOF(t *testing.T) {
defer testutil.AfterTest(t)
// generate an EOF response; specify address so appears first in sorted ep list
lEOF := integration.NewListenerWithAddr(t, fmt.Sprintf("eof:123.%d.sock", os.Getpid()))
lEOF := integration.NewListenerWithAddr(t, fmt.Sprintf("127.0.0.1:%05d", os.Getpid()))
defer lEOF.Close()
tries := uint32(0)
go func() {
@ -65,8 +65,7 @@ func TestV2NoRetryEOF(t *testing.T) {
// TestV2NoRetryNoLeader tests destructive api calls won't retry if given an error code.
func TestV2NoRetryNoLeader(t *testing.T) {
defer testutil.AfterTest(t)
lHttp := integration.NewListenerWithAddr(t, fmt.Sprintf("errHttp:123.%d.sock", os.Getpid()))
lHttp := integration.NewListenerWithAddr(t, fmt.Sprintf("127.0.0.1:%05d", os.Getpid()))
eh := &errHandler{errCode: http.StatusServiceUnavailable}
srv := httptest.NewUnstartedServer(eh)
defer lHttp.Close()

File diff suppressed because it is too large Load Diff

View File

@ -272,6 +272,10 @@ type Response struct {
// Index holds the cluster-level index at the time the Response was generated.
// This index is not tied to the Node(s) contained in this Response.
Index uint64 `json:"-"`
// ClusterID holds the cluster-level ID reported by the server. This
// should be different for different etcd clusters.
ClusterID string `json:"-"`
}
type Node struct {
@ -665,6 +669,7 @@ func unmarshalSuccessfulKeysResponse(header http.Header, body []byte) (*Response
return nil, err
}
}
res.ClusterID = header.Get("X-Etcd-Cluster-ID")
return &res, nil
}

View File

@ -673,23 +673,24 @@ func TestUnmarshalSuccessfulResponse(t *testing.T) {
expiration.UnmarshalText([]byte("2015-04-07T04:40:23.044979686Z"))
tests := []struct {
hdr string
body string
wantRes *Response
wantErr bool
indexHdr string
clusterIDHdr string
body string
wantRes *Response
wantErr bool
}{
// Neither PrevNode or Node
{
hdr: "1",
body: `{"action":"delete"}`,
wantRes: &Response{Action: "delete", Index: 1},
wantErr: false,
indexHdr: "1",
body: `{"action":"delete"}`,
wantRes: &Response{Action: "delete", Index: 1},
wantErr: false,
},
// PrevNode
{
hdr: "15",
body: `{"action":"delete", "prevNode": {"key": "/foo", "value": "bar", "modifiedIndex": 12, "createdIndex": 10}}`,
indexHdr: "15",
body: `{"action":"delete", "prevNode": {"key": "/foo", "value": "bar", "modifiedIndex": 12, "createdIndex": 10}}`,
wantRes: &Response{
Action: "delete",
Index: 15,
@ -706,8 +707,8 @@ func TestUnmarshalSuccessfulResponse(t *testing.T) {
// Node
{
hdr: "15",
body: `{"action":"get", "node": {"key": "/foo", "value": "bar", "modifiedIndex": 12, "createdIndex": 10, "ttl": 10, "expiration": "2015-04-07T04:40:23.044979686Z"}}`,
indexHdr: "15",
body: `{"action":"get", "node": {"key": "/foo", "value": "bar", "modifiedIndex": 12, "createdIndex": 10, "ttl": 10, "expiration": "2015-04-07T04:40:23.044979686Z"}}`,
wantRes: &Response{
Action: "get",
Index: 15,
@ -726,8 +727,9 @@ func TestUnmarshalSuccessfulResponse(t *testing.T) {
// Node Dir
{
hdr: "15",
body: `{"action":"get", "node": {"key": "/foo", "dir": true, "modifiedIndex": 12, "createdIndex": 10}}`,
indexHdr: "15",
clusterIDHdr: "abcdef",
body: `{"action":"get", "node": {"key": "/foo", "dir": true, "modifiedIndex": 12, "createdIndex": 10}}`,
wantRes: &Response{
Action: "get",
Index: 15,
@ -737,15 +739,16 @@ func TestUnmarshalSuccessfulResponse(t *testing.T) {
ModifiedIndex: 12,
CreatedIndex: 10,
},
PrevNode: nil,
PrevNode: nil,
ClusterID: "abcdef",
},
wantErr: false,
},
// PrevNode and Node
{
hdr: "15",
body: `{"action":"update", "prevNode": {"key": "/foo", "value": "baz", "modifiedIndex": 10, "createdIndex": 10}, "node": {"key": "/foo", "value": "bar", "modifiedIndex": 12, "createdIndex": 10}}`,
indexHdr: "15",
body: `{"action":"update", "prevNode": {"key": "/foo", "value": "baz", "modifiedIndex": 10, "createdIndex": 10}, "node": {"key": "/foo", "value": "bar", "modifiedIndex": 12, "createdIndex": 10}}`,
wantRes: &Response{
Action: "update",
Index: 15,
@ -767,24 +770,24 @@ func TestUnmarshalSuccessfulResponse(t *testing.T) {
// Garbage in body
{
hdr: "",
body: `garbage`,
wantRes: nil,
wantErr: true,
indexHdr: "",
body: `garbage`,
wantRes: nil,
wantErr: true,
},
// non-integer index
{
hdr: "poo",
body: `{}`,
wantRes: nil,
wantErr: true,
indexHdr: "poo",
body: `{}`,
wantRes: nil,
wantErr: true,
},
}
for i, tt := range tests {
h := make(http.Header)
h.Add("X-Etcd-Index", tt.hdr)
h.Add("X-Etcd-Index", tt.indexHdr)
res, err := unmarshalSuccessfulKeysResponse(h, []byte(tt.body))
if tt.wantErr != (err != nil) {
t.Errorf("#%d: wantErr=%t, err=%v", i, tt.wantErr, err)

View File

@ -72,6 +72,10 @@ if err != nil {
}
```
## Metrics
The etcd client optionally exposes RPC metrics through [go-grpc-prometheus](https://github.com/grpc-ecosystem/go-grpc-prometheus). See the [examples](https://github.com/coreos/etcd/blob/master/clientv3/example_metrics_test.go).
## Examples
More code examples can be found at [GoDoc](https://godoc.org/github.com/coreos/etcd/clientv3).

View File

@ -116,12 +116,12 @@ func NewAuth(c *Client) Auth {
}
func (auth *auth) AuthEnable(ctx context.Context) (*AuthEnableResponse, error) {
resp, err := auth.remote.AuthEnable(ctx, &pb.AuthEnableRequest{})
resp, err := auth.remote.AuthEnable(ctx, &pb.AuthEnableRequest{}, grpc.FailFast(false))
return (*AuthEnableResponse)(resp), toErr(ctx, err)
}
func (auth *auth) AuthDisable(ctx context.Context) (*AuthDisableResponse, error) {
resp, err := auth.remote.AuthDisable(ctx, &pb.AuthDisableRequest{})
resp, err := auth.remote.AuthDisable(ctx, &pb.AuthDisableRequest{}, grpc.FailFast(false))
return (*AuthDisableResponse)(resp), toErr(ctx, err)
}

View File

@ -21,8 +21,14 @@ import (
"golang.org/x/net/context"
"google.golang.org/grpc"
"google.golang.org/grpc/codes"
)
// ErrNoAddrAvilable is returned by Get() when the balancer does not have
// any active connection to endpoints at the time.
// This error is returned only when opts.BlockingWait is true.
var ErrNoAddrAvilable = grpc.Errorf(codes.Unavailable, "there is no address available")
// simpleBalancer does the bare minimum to expose multiple eps
// to the grpc reconnection code path
type simpleBalancer struct {
@ -162,6 +168,25 @@ func (b *simpleBalancer) Up(addr grpc.Address) func(error) {
func (b *simpleBalancer) Get(ctx context.Context, opts grpc.BalancerGetOptions) (grpc.Address, func(), error) {
var addr string
// If opts.BlockingWait is false (for fail-fast RPCs), it should return
// an address it has notified via Notify immediately instead of blocking.
if !opts.BlockingWait {
b.mu.RLock()
closed := b.closed
addr = b.pinAddr
upEps := len(b.upEps)
b.mu.RUnlock()
if closed {
return grpc.Address{Addr: ""}, nil, grpc.ErrClientConnClosing
}
if upEps == 0 {
return grpc.Address{Addr: ""}, nil, ErrNoAddrAvilable
}
return grpc.Address{Addr: addr}, func() {}, nil
}
for {
b.mu.RLock()
ch := b.upc

106
clientv3/balancer_test.go Normal file
View File

@ -0,0 +1,106 @@
// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package clientv3
import (
"errors"
"testing"
"time"
"golang.org/x/net/context"
"google.golang.org/grpc"
)
var (
endpoints = []string{"localhost:2379", "localhost:22379", "localhost:32379"}
)
func TestBalancerGetUnblocking(t *testing.T) {
sb := newSimpleBalancer(endpoints)
unblockingOpts := grpc.BalancerGetOptions{BlockingWait: false}
_, _, err := sb.Get(context.Background(), unblockingOpts)
if err != ErrNoAddrAvilable {
t.Errorf("Get() with no up endpoints should return ErrNoAddrAvailable, got: %v", err)
}
down1 := sb.Up(grpc.Address{Addr: endpoints[1]})
down2 := sb.Up(grpc.Address{Addr: endpoints[2]})
addrFirst, putFun, err := sb.Get(context.Background(), unblockingOpts)
if err != nil {
t.Errorf("Get() with up endpoints should success, got %v", err)
}
if addrFirst.Addr != endpoints[1] {
t.Errorf("Get() didn't return expected address, got %v", addrFirst)
}
if putFun == nil {
t.Errorf("Get() returned unexpected nil put function")
}
addrSecond, _, _ := sb.Get(context.Background(), unblockingOpts)
if addrFirst.Addr != addrSecond.Addr {
t.Errorf("Get() didn't return the same address as previous call, got %v and %v", addrFirst, addrSecond)
}
down1(errors.New("error"))
down2(errors.New("error"))
_, _, err = sb.Get(context.Background(), unblockingOpts)
if err != ErrNoAddrAvilable {
t.Errorf("Get() with no up endpoints should return ErrNoAddrAvailable, got: %v", err)
}
}
func TestBalancerGetBlocking(t *testing.T) {
sb := newSimpleBalancer(endpoints)
blockingOpts := grpc.BalancerGetOptions{BlockingWait: true}
ctx, _ := context.WithTimeout(context.Background(), time.Millisecond*100)
_, _, err := sb.Get(ctx, blockingOpts)
if err != context.DeadlineExceeded {
t.Errorf("Get() with no up endpoints should timeout, got %v", err)
}
downC := make(chan func(error), 1)
go func() {
// ensure sb.Up() will be called after sb.Get() to see if Up() releases blocking Get()
time.Sleep(time.Millisecond * 100)
downC <- sb.Up(grpc.Address{Addr: endpoints[1]})
}()
addrFirst, putFun, err := sb.Get(context.Background(), blockingOpts)
if err != nil {
t.Errorf("Get() with up endpoints should success, got %v", err)
}
if addrFirst.Addr != endpoints[1] {
t.Errorf("Get() didn't return expected address, got %v", addrFirst)
}
if putFun == nil {
t.Errorf("Get() returned unexpected nil put function")
}
down1 := <-downC
down2 := sb.Up(grpc.Address{Addr: endpoints[2]})
addrSecond, _, _ := sb.Get(context.Background(), blockingOpts)
if addrFirst.Addr != addrSecond.Addr {
t.Errorf("Get() didn't return the same address as previous call, got %v and %v", addrFirst, addrSecond)
}
down1(errors.New("error"))
down2(errors.New("error"))
ctx, _ = context.WithTimeout(context.Background(), time.Millisecond*100)
_, _, err = sb.Get(ctx, blockingOpts)
if err != context.DeadlineExceeded {
t.Errorf("Get() with no up endpoints should timeout, got %v", err)
}
}

View File

@ -21,9 +21,11 @@ import (
"net"
"net/url"
"strings"
"sync"
"time"
"github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
prometheus "github.com/grpc-ecosystem/go-grpc-prometheus"
"golang.org/x/net/context"
"google.golang.org/grpc"
@ -45,11 +47,12 @@ type Client struct {
Auth
Maintenance
conn *grpc.ClientConn
cfg Config
creds *credentials.TransportCredentials
balancer *simpleBalancer
retryWrapper retryRpcFunc
conn *grpc.ClientConn
cfg Config
creds *credentials.TransportCredentials
balancer *simpleBalancer
retryWrapper retryRpcFunc
retryAuthWrapper retryRpcFunc
ctx context.Context
cancel context.CancelFunc
@ -58,6 +61,8 @@ type Client struct {
Username string
// Password is a password for authentication
Password string
// tokenCred is an instance of WithPerRPCCredentials()'s argument
tokenCred *authTokenCredential
}
// New creates a new etcdv3 client from a given configuration.
@ -86,6 +91,8 @@ func NewFromConfigFile(path string) (*Client, error) {
// Close shuts down the client's etcd connections.
func (c *Client) Close() error {
c.cancel()
c.Watcher.Close()
c.Lease.Close()
return toErr(c.ctx, c.conn.Close())
}
@ -95,7 +102,12 @@ func (c *Client) Close() error {
func (c *Client) Ctx() context.Context { return c.ctx }
// Endpoints lists the registered endpoints for the client.
func (c *Client) Endpoints() []string { return c.cfg.Endpoints }
func (c *Client) Endpoints() (eps []string) {
// copy the slice; protect original endpoints from being changed
eps = make([]string, len(c.cfg.Endpoints))
copy(eps, c.cfg.Endpoints)
return
}
// SetEndpoints updates client's endpoints.
func (c *Client) SetEndpoints(eps ...string) {
@ -136,7 +148,8 @@ func (c *Client) autoSync() {
}
type authTokenCredential struct {
token string
token string
tokenMu *sync.RWMutex
}
func (cred authTokenCredential) RequireTransportSecurity() bool {
@ -144,6 +157,8 @@ func (cred authTokenCredential) RequireTransportSecurity() bool {
}
func (cred authTokenCredential) GetRequestMetadata(ctx context.Context, s ...string) (map[string]string, error) {
cred.tokenMu.RLock()
defer cred.tokenMu.RUnlock()
return map[string]string{
"token": cred.token,
}, nil
@ -206,7 +221,8 @@ func (c *Client) dialSetupOpts(endpoint string, dopts ...grpc.DialOption) (opts
return nil, c.ctx.Err()
default:
}
return net.DialTimeout(proto, host, t)
dialer := &net.Dialer{Timeout: t}
return dialer.DialContext(c.ctx, proto, host)
}
opts = append(opts, grpc.WithDialer(f))
@ -228,24 +244,64 @@ func (c *Client) Dial(endpoint string) (*grpc.ClientConn, error) {
return c.dial(endpoint)
}
func (c *Client) getToken(ctx context.Context) error {
var err error // return last error in a case of fail
var auth *authenticator
for i := 0; i < len(c.cfg.Endpoints); i++ {
endpoint := c.cfg.Endpoints[i]
host := getHost(endpoint)
// use dial options without dopts to avoid reusing the client balancer
auth, err = newAuthenticator(host, c.dialSetupOpts(endpoint))
if err != nil {
continue
}
defer auth.close()
var resp *AuthenticateResponse
resp, err = auth.authenticate(ctx, c.Username, c.Password)
if err != nil {
continue
}
c.tokenCred.tokenMu.Lock()
c.tokenCred.token = resp.Token
c.tokenCred.tokenMu.Unlock()
return nil
}
return err
}
func (c *Client) dial(endpoint string, dopts ...grpc.DialOption) (*grpc.ClientConn, error) {
opts := c.dialSetupOpts(endpoint, dopts...)
host := getHost(endpoint)
if c.Username != "" && c.Password != "" {
// use dial options without dopts to avoid reusing the client balancer
auth, err := newAuthenticator(host, c.dialSetupOpts(endpoint))
if err != nil {
return nil, err
c.tokenCred = &authTokenCredential{
tokenMu: &sync.RWMutex{},
}
defer auth.close()
resp, err := auth.authenticate(c.ctx, c.Username, c.Password)
if err != nil {
ctx := c.ctx
if c.cfg.DialTimeout > 0 {
cctx, cancel := context.WithTimeout(ctx, c.cfg.DialTimeout)
defer cancel()
ctx = cctx
}
if err := c.getToken(ctx); err != nil {
if err == ctx.Err() && ctx.Err() != c.ctx.Err() {
err = grpc.ErrClientConnTimeout
}
return nil, err
}
opts = append(opts, grpc.WithPerRPCCredentials(authTokenCredential{token: resp.Token}))
opts = append(opts, grpc.WithPerRPCCredentials(c.tokenCred))
}
// add metrics options
opts = append(opts, grpc.WithUnaryInterceptor(prometheus.UnaryClientInterceptor))
opts = append(opts, grpc.WithStreamInterceptor(prometheus.StreamClientInterceptor))
conn, err := grpc.Dial(host, opts...)
if err != nil {
return nil, err
@ -287,10 +343,13 @@ func newClient(cfg *Config) (*Client, error) {
client.balancer = newSimpleBalancer(cfg.Endpoints)
conn, err := client.dial(cfg.Endpoints[0], grpc.WithBalancer(client.balancer))
if err != nil {
client.cancel()
client.balancer.Close()
return nil, err
}
client.conn = conn
client.retryWrapper = client.newRetryWrapper()
client.retryAuthWrapper = client.newAuthRetryWrapper()
// wait for a connection
if cfg.DialTimeout > 0 {
@ -304,6 +363,7 @@ func newClient(cfg *Config) (*Client, error) {
}
if !hasConn {
client.cancel()
client.balancer.Close()
conn.Close()
return nil, grpc.ErrClientConnTimeout
}

View File

@ -16,6 +16,7 @@ package clientv3
import (
"fmt"
"net"
"testing"
"time"
@ -25,36 +26,89 @@ import (
"google.golang.org/grpc"
)
func TestDialTimeout(t *testing.T) {
func TestDialCancel(t *testing.T) {
defer testutil.AfterTest(t)
donec := make(chan error)
go func() {
// without timeout, grpc keeps redialing if connection refused
cfg := Config{
Endpoints: []string{"localhost:12345"},
DialTimeout: 2 * time.Second}
c, err := New(cfg)
if c != nil || err == nil {
t.Errorf("new client should fail")
}
donec <- err
}()
time.Sleep(10 * time.Millisecond)
select {
case err := <-donec:
t.Errorf("dial didn't wait (%v)", err)
default:
// accept first connection so client is created with dial timeout
ln, err := net.Listen("unix", "dialcancel:12345")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
ep := "unix://dialcancel:12345"
cfg := Config{
Endpoints: []string{ep},
DialTimeout: 30 * time.Second}
c, err := New(cfg)
if err != nil {
t.Fatal(err)
}
// connect to ipv4 blackhole so dial blocks
c.SetEndpoints("http://254.0.0.1:12345")
// issue Get to force redial attempts
go c.Get(context.TODO(), "abc")
// wait a little bit so client close is after dial starts
time.Sleep(100 * time.Millisecond)
donec := make(chan struct{})
go func() {
defer close(donec)
c.Close()
}()
select {
case <-time.After(5 * time.Second):
t.Errorf("failed to timeout dial on time")
case err := <-donec:
if err != grpc.ErrClientConnTimeout {
t.Errorf("unexpected error %v, want %v", err, grpc.ErrClientConnTimeout)
t.Fatalf("failed to close")
case <-donec:
}
}
func TestDialTimeout(t *testing.T) {
defer testutil.AfterTest(t)
testCfgs := []Config{
{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second,
},
{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: time.Second,
Username: "abc",
Password: "def",
},
}
for i, cfg := range testCfgs {
donec := make(chan error)
go func() {
// without timeout, dial continues forever on ipv4 blackhole
c, err := New(cfg)
if c != nil || err == nil {
t.Errorf("#%d: new client should fail", i)
}
donec <- err
}()
time.Sleep(10 * time.Millisecond)
select {
case err := <-donec:
t.Errorf("#%d: dial didn't wait (%v)", i, err)
default:
}
select {
case <-time.After(5 * time.Second):
t.Errorf("#%d: failed to timeout dial on time", i)
case err := <-donec:
if err != grpc.ErrClientConnTimeout {
t.Errorf("#%d: unexpected error %v, want %v", i, err, grpc.ErrClientConnTimeout)
}
}
}
}

View File

@ -78,7 +78,7 @@ func (c *cluster) MemberUpdate(ctx context.Context, id uint64, peerAddrs []strin
// it is safe to retry on update.
for {
r := &pb.MemberUpdateRequest{ID: id, PeerURLs: peerAddrs}
resp, err := c.remote.MemberUpdate(ctx, r)
resp, err := c.remote.MemberUpdate(ctx, r, grpc.FailFast(false))
if err == nil {
return (*MemberUpdateResponse)(resp), nil
}

View File

@ -36,6 +36,8 @@ func Compare(cmp Cmp, result string, v interface{}) Cmp {
switch result {
case "=":
r = pb.Compare_EQUAL
case "!=":
r = pb.Compare_NOT_EQUAL
case ">":
r = pb.Compare_GREATER
case "<":

View File

@ -15,6 +15,8 @@
package concurrency
import (
"time"
v3 "github.com/coreos/etcd/clientv3"
"golang.org/x/net/context"
)
@ -25,6 +27,7 @@ const defaultSessionTTL = 60
// Fault-tolerant applications may use sessions to reason about liveness.
type Session struct {
client *v3.Client
opts *sessionOptions
id v3.LeaseID
cancel context.CancelFunc
@ -33,25 +36,25 @@ type Session struct {
// NewSession gets the leased session for a client.
func NewSession(client *v3.Client, opts ...SessionOption) (*Session, error) {
ops := &sessionOptions{ttl: defaultSessionTTL}
ops := &sessionOptions{ttl: defaultSessionTTL, ctx: client.Ctx()}
for _, opt := range opts {
opt(ops)
}
resp, err := client.Grant(client.Ctx(), int64(ops.ttl))
resp, err := client.Grant(ops.ctx, int64(ops.ttl))
if err != nil {
return nil, err
}
id := v3.LeaseID(resp.ID)
ctx, cancel := context.WithCancel(client.Ctx())
ctx, cancel := context.WithCancel(ops.ctx)
keepAlive, err := client.KeepAlive(ctx, id)
if err != nil || keepAlive == nil {
return nil, err
}
donec := make(chan struct{})
s := &Session{client: client, id: id, cancel: cancel, donec: donec}
s := &Session{client: client, opts: ops, id: id, cancel: cancel, donec: donec}
// keep the lease alive until client error or cancelled context
go func() {
@ -87,12 +90,16 @@ func (s *Session) Orphan() {
// Close orphans the session and revokes the session lease.
func (s *Session) Close() error {
s.Orphan()
_, err := s.client.Revoke(s.client.Ctx(), s.id)
// if revoke takes longer than the ttl, lease is expired anyway
ctx, cancel := context.WithTimeout(s.opts.ctx, time.Duration(s.opts.ttl)*time.Second)
_, err := s.client.Revoke(ctx, s.id)
cancel()
return err
}
type sessionOptions struct {
ttl int
ctx context.Context
}
// SessionOption configures Session.
@ -107,3 +114,14 @@ func WithTTL(ttl int) SessionOption {
}
}
}
// WithContext assigns a context to the session instead of defaulting to
// using the client context. This is useful for canceling NewSession and
// Close operations immediately without having to close the client. If the
// context is canceled before Close() completes, the session's lease will be
// abandoned and left to expire instead of being revoked.
func WithContext(ctx context.Context) SessionOption {
return func(so *sessionOptions) {
so.ctx = ctx
}
}

View File

@ -249,11 +249,10 @@ func (s *stmReadCommitted) commit() *v3.TxnResponse {
}
func isKeyCurrent(k string, r *v3.GetResponse) v3.Cmp {
rev := r.Header.Revision + 1
if len(r.Kvs) != 0 {
rev = r.Kvs[0].ModRevision + 1
return v3.Compare(v3.ModRevision(k), "=", r.Kvs[0].ModRevision)
}
return v3.Compare(v3.ModRevision(k), "<", rev)
return v3.Compare(v3.ModRevision(k), "=", 0)
}
func respToValue(resp *v3.GetResponse) string {

View File

@ -44,7 +44,7 @@
// etcd client returns 2 types of errors:
//
// 1. context error: canceled or deadline exceeded.
// 2. gRPC error: see https://github.com/coreos/etcd/blob/master/etcdserver/api/v3rpc/error.go.
// 2. gRPC error: see https://github.com/coreos/etcd/blob/master/etcdserver/api/v3rpc/rpctypes/error.go
//
// Here is the example code to handle client errors:
//

View File

@ -0,0 +1,46 @@
// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package clientv3_test
import (
"fmt"
"io/ioutil"
"log"
"net/http"
"github.com/prometheus/client_golang/prometheus"
)
func ExampleMetrics_All() {
// listen for all prometheus metrics
go func() {
http.Handle("/metrics", prometheus.Handler())
log.Fatal(http.ListenAndServe(":47989", nil))
}()
url := "http://localhost:47989/metrics"
// make an http request to fetch all prometheus metrics
resp, err := http.Get(url)
if err != nil {
log.Fatalf("fetch error: %v", err)
}
b, err := ioutil.ReadAll(resp.Body)
resp.Body.Close()
if err != nil {
log.Fatalf("fetch error: reading %s: %v", url, err)
}
fmt.Printf("%s", b)
}

View File

@ -226,6 +226,21 @@ func TestKVRange(t *testing.T) {
{Key: []byte("fop"), Value: nil, CreateRevision: 9, ModRevision: 9, Version: 1},
},
},
// range all with SortByKey, missing sorting order (ASCEND by default)
{
"a", "x",
0,
[]clientv3.OpOption{clientv3.WithSort(clientv3.SortByKey, clientv3.SortNone)},
[]*mvccpb.KeyValue{
{Key: []byte("a"), Value: nil, CreateRevision: 2, ModRevision: 2, Version: 1},
{Key: []byte("b"), Value: nil, CreateRevision: 3, ModRevision: 3, Version: 1},
{Key: []byte("c"), Value: nil, CreateRevision: 4, ModRevision: 6, Version: 3},
{Key: []byte("foo"), Value: nil, CreateRevision: 7, ModRevision: 7, Version: 1},
{Key: []byte("foo/abc"), Value: nil, CreateRevision: 8, ModRevision: 8, Version: 1},
{Key: []byte("fop"), Value: nil, CreateRevision: 9, ModRevision: 9, Version: 1},
},
},
// range all with SortByCreateRevision, SortDescend
{
"a", "x",
@ -241,6 +256,21 @@ func TestKVRange(t *testing.T) {
{Key: []byte("a"), Value: nil, CreateRevision: 2, ModRevision: 2, Version: 1},
},
},
// range all with SortByCreateRevision, missing sorting order (ASCEND by default)
{
"a", "x",
0,
[]clientv3.OpOption{clientv3.WithSort(clientv3.SortByCreateRevision, clientv3.SortNone)},
[]*mvccpb.KeyValue{
{Key: []byte("a"), Value: nil, CreateRevision: 2, ModRevision: 2, Version: 1},
{Key: []byte("b"), Value: nil, CreateRevision: 3, ModRevision: 3, Version: 1},
{Key: []byte("c"), Value: nil, CreateRevision: 4, ModRevision: 6, Version: 3},
{Key: []byte("foo"), Value: nil, CreateRevision: 7, ModRevision: 7, Version: 1},
{Key: []byte("foo/abc"), Value: nil, CreateRevision: 8, ModRevision: 8, Version: 1},
{Key: []byte("fop"), Value: nil, CreateRevision: 9, ModRevision: 9, Version: 1},
},
},
// range all with SortByModRevision, SortDescend
{
"a", "x",

View File

@ -17,10 +17,12 @@ package integration
import (
"reflect"
"sort"
"sync"
"testing"
"time"
"github.com/coreos/etcd/clientv3"
"github.com/coreos/etcd/clientv3/concurrency"
"github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
"github.com/coreos/etcd/integration"
"github.com/coreos/etcd/pkg/testutil"
@ -154,6 +156,30 @@ func TestLeaseKeepAlive(t *testing.T) {
}
}
func TestLeaseKeepAliveOneSecond(t *testing.T) {
defer testutil.AfterTest(t)
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer clus.Terminate(t)
cli := clus.Client(0)
resp, err := cli.Grant(context.Background(), 1)
if err != nil {
t.Errorf("failed to create lease %v", err)
}
rc, kerr := cli.KeepAlive(context.Background(), resp.ID)
if kerr != nil {
t.Errorf("failed to keepalive lease %v", kerr)
}
for i := 0; i < 3; i++ {
if _, ok := <-rc; !ok {
t.Errorf("chan is closed, want not closed")
}
}
}
// TODO: add a client that can connect to all the members of cluster via unix sock.
// TODO: test handle more complicated failures.
func TestLeaseKeepAliveHandleFailure(t *testing.T) {
@ -510,3 +536,121 @@ func TestLeaseTimeToLive(t *testing.T) {
t.Fatalf("unexpected keys %+v", lresp.Keys)
}
}
// TestLeaseRenewLostQuorum ensures keepalives work after losing quorum
// for a while.
func TestLeaseRenewLostQuorum(t *testing.T) {
defer testutil.AfterTest(t)
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 3})
defer clus.Terminate(t)
cli := clus.Client(0)
r, err := cli.Grant(context.TODO(), 4)
if err != nil {
t.Fatal(err)
}
kctx, kcancel := context.WithCancel(context.Background())
defer kcancel()
ka, err := cli.KeepAlive(kctx, r.ID)
if err != nil {
t.Fatal(err)
}
// consume first keepalive so next message sends when cluster is down
<-ka
// force keepalive stream message to timeout
clus.Members[1].Stop(t)
clus.Members[2].Stop(t)
// Use TTL-1 since the client closes the keepalive channel if no
// keepalive arrives before the lease deadline.
// The cluster has 1 second to recover and reply to the keepalive.
time.Sleep(time.Duration(r.TTL-1) * time.Second)
clus.Members[1].Restart(t)
clus.Members[2].Restart(t)
select {
case _, ok := <-ka:
if !ok {
t.Fatalf("keepalive closed")
}
case <-time.After(time.Duration(r.TTL) * time.Second):
t.Fatalf("timed out waiting for keepalive")
}
}
func TestLeaseKeepAliveLoopExit(t *testing.T) {
defer testutil.AfterTest(t)
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer clus.Terminate(t)
ctx := context.Background()
cli := clus.Client(0)
resp, err := cli.Grant(ctx, 5)
if err != nil {
t.Fatal(err)
}
cli.Lease.Close()
_, err = cli.KeepAlive(ctx, resp.ID)
if _, ok := err.(clientv3.ErrKeepAliveHalted); !ok {
t.Fatalf("expected %T, got %v(%T)", clientv3.ErrKeepAliveHalted{}, err, err)
}
}
// TestV3LeaseFailureOverlap issues Grant and Keepalive requests to a cluster
// before, during, and after quorum loss to confirm Grant/Keepalive tolerates
// transient cluster failure.
func TestV3LeaseFailureOverlap(t *testing.T) {
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 2})
defer clus.Terminate(t)
numReqs := 5
cli := clus.Client(0)
// bring up a session, tear it down
updown := func(i int) error {
sess, err := concurrency.NewSession(cli)
if err != nil {
return err
}
ch := make(chan struct{})
go func() {
defer close(ch)
sess.Close()
}()
select {
case <-ch:
case <-time.After(time.Minute / 4):
t.Fatalf("timeout %d", i)
}
return nil
}
var wg sync.WaitGroup
mkReqs := func(n int) {
wg.Add(numReqs)
for i := 0; i < numReqs; i++ {
go func() {
defer wg.Done()
err := updown(n)
if err == nil || err == rpctypes.ErrTimeoutDueToConnectionLost {
return
}
t.Fatal(err)
}()
}
}
mkReqs(1)
clus.Members[1].Stop(t)
mkReqs(2)
time.Sleep(time.Second)
mkReqs(3)
clus.Members[1].Restart(t)
mkReqs(4)
wg.Wait()
}

View File

@ -0,0 +1,169 @@
// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package integration
import (
"bufio"
"io"
"net"
"net/http"
"strconv"
"strings"
"testing"
"time"
"github.com/coreos/etcd/clientv3"
"github.com/coreos/etcd/integration"
"github.com/coreos/etcd/pkg/testutil"
"github.com/coreos/etcd/pkg/transport"
"github.com/prometheus/client_golang/prometheus"
"golang.org/x/net/context"
)
func TestV3ClientMetrics(t *testing.T) {
defer testutil.AfterTest(t)
var (
addr string = "localhost:27989"
ln net.Listener
err error
)
// listen for all prometheus metrics
donec := make(chan struct{})
go func() {
defer close(donec)
srv := &http.Server{Handler: prometheus.Handler()}
srv.SetKeepAlivesEnabled(false)
ln, err = transport.NewUnixListener(addr)
if err != nil {
t.Fatalf("Error: %v occurred while listening on addr: %v", err, addr)
}
err = srv.Serve(ln)
if err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
t.Fatalf("Err serving http requests: %v", err)
}
}()
url := "unix://" + addr + "/metrics"
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer clus.Terminate(t)
client := clus.Client(0)
w := clientv3.NewWatcher(client)
defer w.Close()
kv := clientv3.NewKV(client)
wc := w.Watch(context.Background(), "foo")
wBefore := sumCountersForMetricAndLabels(t, url, "grpc_client_msg_received_total", "Watch", "bidi_stream")
pBefore := sumCountersForMetricAndLabels(t, url, "grpc_client_started_total", "Put", "unary")
_, err = kv.Put(context.Background(), "foo", "bar")
if err != nil {
t.Errorf("Error putting value in key store")
}
pAfter := sumCountersForMetricAndLabels(t, url, "grpc_client_started_total", "Put", "unary")
if pBefore+1 != pAfter {
t.Errorf("grpc_client_started_total expected %d, got %d", 1, pAfter-pBefore)
}
// consume watch response
select {
case <-wc:
case <-time.After(10 * time.Second):
t.Error("Timeout occurred for getting watch response")
}
wAfter := sumCountersForMetricAndLabels(t, url, "grpc_client_msg_received_total", "Watch", "bidi_stream")
if wBefore+1 != wAfter {
t.Errorf("grpc_client_msg_received_total expected %d, got %d", 1, wAfter-wBefore)
}
ln.Close()
<-donec
}
func sumCountersForMetricAndLabels(t *testing.T, url string, metricName string, matchingLabelValues ...string) int {
count := 0
for _, line := range getHTTPBodyAsLines(t, url) {
ok := true
if !strings.HasPrefix(line, metricName) {
continue
}
for _, labelValue := range matchingLabelValues {
if !strings.Contains(line, `"`+labelValue+`"`) {
ok = false
break
}
}
if !ok {
continue
}
valueString := line[strings.LastIndex(line, " ")+1 : len(line)-1]
valueFloat, err := strconv.ParseFloat(valueString, 32)
if err != nil {
t.Fatalf("failed parsing value for line: %v and matchingLabelValues: %v", line, matchingLabelValues)
}
count += int(valueFloat)
}
return count
}
func getHTTPBodyAsLines(t *testing.T, url string) []string {
cfgtls := transport.TLSInfo{}
tr, err := transport.NewTransport(cfgtls, time.Second)
if err != nil {
t.Fatalf("Error getting transport: %v", err)
}
tr.MaxIdleConns = -1
tr.DisableKeepAlives = true
cli := &http.Client{Transport: tr}
resp, err := cli.Get(url)
if err != nil {
t.Fatalf("Error fetching: %v", err)
}
reader := bufio.NewReader(resp.Body)
lines := []string{}
for {
line, err := reader.ReadString('\n')
if err != nil {
if err == io.EOF {
break
} else {
t.Fatalf("error reading: %v", err)
}
}
lines = append(lines, line)
}
resp.Body.Close()
return lines
}

View File

@ -347,8 +347,61 @@ func putAndWatch(t *testing.T, wctx *watchctx, key, val string) {
}
}
// TestWatchResumeComapcted checks that the watcher gracefully closes in case
func TestWatchResumeInitRev(t *testing.T) {
defer testutil.AfterTest(t)
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer clus.Terminate(t)
cli := clus.Client(0)
if _, err := cli.Put(context.TODO(), "b", "2"); err != nil {
t.Fatal(err)
}
if _, err := cli.Put(context.TODO(), "a", "3"); err != nil {
t.Fatal(err)
}
// if resume is broken, it'll pick up this key first instead of a=3
if _, err := cli.Put(context.TODO(), "a", "4"); err != nil {
t.Fatal(err)
}
wch := clus.Client(0).Watch(context.Background(), "a", clientv3.WithRev(1), clientv3.WithCreatedNotify())
if resp, ok := <-wch; !ok || resp.Header.Revision != 4 {
t.Fatalf("got (%v, %v), expected create notification rev=4", resp, ok)
}
// pause wch
clus.Members[0].DropConnections()
clus.Members[0].PauseConnections()
select {
case resp, ok := <-wch:
t.Skipf("wch should block, got (%+v, %v); drop not fast enough", resp, ok)
case <-time.After(100 * time.Millisecond):
}
// resume wch
clus.Members[0].UnpauseConnections()
select {
case resp, ok := <-wch:
if !ok {
t.Fatal("unexpected watch close")
}
if len(resp.Events) == 0 {
t.Fatal("expected event on watch")
}
if string(resp.Events[0].Kv.Value) != "3" {
t.Fatalf("expected value=3, got event %+v", resp.Events[0])
}
case <-time.After(5 * time.Second):
t.Fatal("watch timed out")
}
}
// TestWatchResumeCompacted checks that the watcher gracefully closes in case
// that it tries to resume to a revision that's been compacted out of the store.
// Since the watcher's server restarts with stale data, the watcher will receive
// either a compaction error or all keys by staying in sync before the compaction
// is finally applied.
func TestWatchResumeCompacted(t *testing.T) {
defer testutil.AfterTest(t)
@ -377,8 +430,9 @@ func TestWatchResumeCompacted(t *testing.T) {
}
// put some data and compact away
numPuts := 5
kv := clientv3.NewKV(clus.Client(1))
for i := 0; i < 5; i++ {
for i := 0; i < numPuts; i++ {
if _, err := kv.Put(context.TODO(), "foo", "bar"); err != nil {
t.Fatal(err)
}
@ -389,17 +443,48 @@ func TestWatchResumeCompacted(t *testing.T) {
clus.Members[0].Restart(t)
// get compacted error message
wresp, ok := <-wch
if !ok {
t.Fatalf("expected wresp, but got closed channel")
// since watch's server isn't guaranteed to be synced with the cluster when
// the watch resumes, there is a window where the watch can stay synced and
// read off all events; if the watcher misses the window, it will go out of
// sync and get a compaction error.
wRev := int64(2)
for int(wRev) <= numPuts+1 {
var wresp clientv3.WatchResponse
var ok bool
select {
case wresp, ok = <-wch:
if !ok {
t.Fatalf("expected wresp, but got closed channel")
}
case <-time.After(5 * time.Second):
t.Fatalf("compacted watch timed out")
}
for _, ev := range wresp.Events {
if ev.Kv.ModRevision != wRev {
t.Fatalf("expected modRev %v, got %+v", wRev, ev)
}
wRev++
}
if wresp.Err() == nil {
continue
}
if wresp.Err() != rpctypes.ErrCompacted {
t.Fatalf("wresp.Err() expected %v, but got %v %+v", rpctypes.ErrCompacted, wresp.Err())
}
break
}
if wresp.Err() != rpctypes.ErrCompacted {
t.Fatalf("wresp.Err() expected %v, but got %v", rpctypes.ErrCompacted, wresp.Err())
if int(wRev) > numPuts+1 {
// got data faster than the compaction
return
}
// ensure the channel is closed
if wresp, ok = <-wch; ok {
t.Fatalf("expected closed channel, but got %v", wresp)
// received compaction error; ensure the channel closes
select {
case wresp, ok := <-wch:
if ok {
t.Fatalf("expected closed channel, but got %v", wresp)
}
case <-time.After(5 * time.Second):
t.Fatalf("timed out waiting for channel close")
}
}
@ -579,18 +664,19 @@ func TestWatchErrConnClosed(t *testing.T) {
defer clus.Terminate(t)
cli := clus.Client(0)
defer cli.Close()
wc := clientv3.NewWatcher(cli)
donec := make(chan struct{})
go func() {
defer close(donec)
wc.Watch(context.TODO(), "foo")
if err := wc.Close(); err != nil && err != grpc.ErrClientConnClosing {
t.Fatalf("expected %v, got %v", grpc.ErrClientConnClosing, err)
ch := wc.Watch(context.TODO(), "foo")
if wr := <-ch; grpc.ErrorDesc(wr.Err()) != grpc.ErrClientConnClosing.Error() {
t.Fatalf("expected %v, got %v", grpc.ErrClientConnClosing, grpc.ErrorDesc(wr.Err()))
}
}()
if err := cli.Close(); err != nil {
if err := cli.ActiveConnection().Close(); err != nil {
t.Fatal(err)
}
clus.TakeClient(0)
@ -637,8 +723,12 @@ func TestWatchWithRequireLeader(t *testing.T) {
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 3})
defer clus.Terminate(t)
// something for the non-require leader watch to read as an event
if _, err := clus.Client(1).Put(context.TODO(), "foo", "bar"); err != nil {
// Put a key for the non-require leader watch to read as an event.
// The watchers will be on member[0]; put key through member[0] to
// ensure that it receives the update so watching after killing quorum
// is guaranteed to have the key.
liveClient := clus.Client(0)
if _, err := liveClient.Put(context.TODO(), "foo", "bar"); err != nil {
t.Fatal(err)
}
@ -653,8 +743,8 @@ func TestWatchWithRequireLeader(t *testing.T) {
tickDuration := 10 * time.Millisecond
time.Sleep(time.Duration(3*clus.Members[0].ElectionTicks) * tickDuration)
chLeader := clus.Client(0).Watch(clientv3.WithRequireLeader(context.TODO()), "foo", clientv3.WithRev(1))
chNoLeader := clus.Client(0).Watch(context.TODO(), "foo", clientv3.WithRev(1))
chLeader := liveClient.Watch(clientv3.WithRequireLeader(context.TODO()), "foo", clientv3.WithRev(1))
chNoLeader := liveClient.Watch(context.TODO(), "foo", clientv3.WithRev(1))
select {
case resp, ok := <-chLeader:
@ -736,6 +826,36 @@ func TestWatchWithCreatedNotification(t *testing.T) {
}
}
// TestWatchWithCreatedNotificationDropConn ensures that
// a watcher with created notify does not post duplicate
// created events from disconnect.
func TestWatchWithCreatedNotificationDropConn(t *testing.T) {
cluster := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer cluster.Terminate(t)
client := cluster.RandClient()
wch := client.Watch(context.Background(), "a", clientv3.WithCreatedNotify())
resp := <-wch
if !resp.Created {
t.Fatalf("expected created event, got %v", resp)
}
cluster.Members[0].DropConnections()
// try to receive from watch channel again
// ensure it doesn't post another createNotify
select {
case wresp := <-wch:
t.Fatalf("got unexpected watch response: %+v\n", wresp)
case <-time.After(time.Second):
// watcher may not reconnect by the time it hits the select,
// so it wouldn't have a chance to filter out the second create event
}
}
// TestWatchCancelOnServer ensures client watcher cancels propagate back to the server.
func TestWatchCancelOnServer(t *testing.T) {
cluster := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
@ -868,3 +988,45 @@ func TestWatchCancelAndCloseClient(t *testing.T) {
<-donec
clus.TakeClient(0)
}
// TestWatchStressResumeClose establishes a bunch of watchers, disconnects
// to put them in resuming mode, cancels them so some resumes by cancel fail,
// then closes the watcher interface to ensure correct clean up.
func TestWatchStressResumeClose(t *testing.T) {
defer testutil.AfterTest(t)
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer clus.Terminate(t)
cli := clus.Client(0)
ctx, cancel := context.WithCancel(context.Background())
// add more watches than can be resumed before the cancel
wchs := make([]clientv3.WatchChan, 2000)
for i := range wchs {
wchs[i] = cli.Watch(ctx, "abc")
}
clus.Members[0].DropConnections()
cancel()
if err := cli.Close(); err != nil {
t.Fatal(err)
}
clus.TakeClient(0)
}
// TestWatchCancelDisconnected ensures canceling a watcher works when
// its grpc stream is disconnected / reconnecting.
func TestWatchCancelDisconnected(t *testing.T) {
defer testutil.AfterTest(t)
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer clus.Terminate(t)
cli := clus.Client(0)
ctx, cancel := context.WithCancel(context.Background())
// add more watches than can be resumed before the cancel
wch := cli.Watch(ctx, "abc")
clus.Members[0].Stop(t)
cancel()
select {
case <-wch:
case <-time.After(time.Second):
t.Fatal("took too long to cancel disconnected watcher")
}
}

View File

@ -105,7 +105,7 @@ func (kv *kv) Delete(ctx context.Context, key string, opts ...OpOption) (*Delete
}
func (kv *kv) Compact(ctx context.Context, rev int64, opts ...CompactOption) (*CompactResponse, error) {
resp, err := kv.remote.Compact(ctx, OpCompact(rev, opts...).toRequest(), grpc.FailFast(false))
resp, err := kv.remote.Compact(ctx, OpCompact(rev, opts...).toRequest())
if err != nil {
return nil, toErr(ctx, err)
}
@ -125,6 +125,7 @@ func (kv *kv) Do(ctx context.Context, op Op) (OpResponse, error) {
if err == nil {
return resp, nil
}
if isHaltErr(ctx, err) {
return resp, toErr(ctx, err)
}

View File

@ -69,6 +69,21 @@ const (
NoLease LeaseID = 0
)
// ErrKeepAliveHalted is returned if client keep alive loop halts with an unexpected error.
//
// This usually means that automatic lease renewal via KeepAlive is broken, but KeepAliveOnce will still work as expected.
type ErrKeepAliveHalted struct {
Reason error
}
func (e ErrKeepAliveHalted) Error() string {
s := "etcdclient: leases keep alive halted"
if e.Reason != nil {
s += ": " + e.Reason.Error()
}
return s
}
type Lease interface {
// Grant creates a new lease.
Grant(ctx context.Context, ttl int64) (*LeaseGrantResponse, error)
@ -94,8 +109,9 @@ type Lease interface {
type lessor struct {
mu sync.Mutex // guards all fields
// donec is closed when recvKeepAliveLoop stops
donec chan struct{}
// donec is closed and loopErr is set when recvKeepAliveLoop stops
donec chan struct{}
loopErr error
remote pb.LeaseClient
@ -161,9 +177,6 @@ func (l *lessor) Grant(ctx context.Context, ttl int64) (*LeaseGrantResponse, err
if isHaltErr(cctx, err) {
return nil, toErr(cctx, err)
}
if nerr := l.newStream(); nerr != nil {
return nil, nerr
}
}
}
@ -182,9 +195,6 @@ func (l *lessor) Revoke(ctx context.Context, id LeaseID) (*LeaseRevokeResponse,
if isHaltErr(ctx, err) {
return nil, toErr(ctx, err)
}
if nerr := l.newStream(); nerr != nil {
return nil, nerr
}
}
}
@ -195,7 +205,7 @@ func (l *lessor) TimeToLive(ctx context.Context, id LeaseID, opts ...LeaseOption
for {
r := toLeaseTimeToLiveRequest(id, opts...)
resp, err := l.remote.LeaseTimeToLive(cctx, r)
resp, err := l.remote.LeaseTimeToLive(cctx, r, grpc.FailFast(false))
if err == nil {
gresp := &LeaseTimeToLiveResponse{
ResponseHeader: resp.GetHeader(),
@ -216,6 +226,15 @@ func (l *lessor) KeepAlive(ctx context.Context, id LeaseID) (<-chan *LeaseKeepAl
ch := make(chan *LeaseKeepAliveResponse, leaseResponseChSize)
l.mu.Lock()
// ensure that recvKeepAliveLoop is still running
select {
case <-l.donec:
err := l.loopErr
l.mu.Unlock()
close(ch)
return ch, ErrKeepAliveHalted{Reason: err}
default:
}
ka, ok := l.keepAlives[id]
if !ok {
// create fresh keep alive
@ -255,10 +274,6 @@ func (l *lessor) KeepAliveOnce(ctx context.Context, id LeaseID) (*LeaseKeepAlive
if isHaltErr(ctx, err) {
return nil, toErr(ctx, err)
}
if nerr := l.newStream(); nerr != nil {
return nil, nerr
}
}
}
@ -327,10 +342,11 @@ func (l *lessor) keepAliveOnce(ctx context.Context, id LeaseID) (*LeaseKeepAlive
return karesp, nil
}
func (l *lessor) recvKeepAliveLoop() {
func (l *lessor) recvKeepAliveLoop() (gerr error) {
defer func() {
l.mu.Lock()
close(l.donec)
l.loopErr = gerr
for _, ka := range l.keepAlives {
ka.Close()
}
@ -343,21 +359,35 @@ func (l *lessor) recvKeepAliveLoop() {
resp, err := stream.Recv()
if err != nil {
if isHaltErr(l.stopCtx, err) {
return
return err
}
stream, serr = l.resetRecv()
continue
}
l.recvKeepAlive(resp)
}
return serr
}
// resetRecv opens a new lease stream and starts sending LeaseKeepAliveRequests
func (l *lessor) resetRecv() (pb.Lease_LeaseKeepAliveClient, error) {
if err := l.newStream(); err != nil {
sctx, cancel := context.WithCancel(l.stopCtx)
stream, err := l.remote.LeaseKeepAlive(sctx, grpc.FailFast(false))
if err = toErr(sctx, err); err != nil {
cancel()
return nil, err
}
stream := l.getKeepAliveStream()
l.mu.Lock()
defer l.mu.Unlock()
if l.stream != nil && l.streamCancel != nil {
l.stream.CloseSend()
l.streamCancel()
}
l.streamCancel = cancel
l.stream = stream
go l.sendKeepAliveLoop(stream)
return stream, nil
}
@ -386,7 +416,7 @@ func (l *lessor) recvKeepAlive(resp *pb.LeaseKeepAliveResponse) {
}
// send update to all channels
nextKeepAlive := time.Now().Add(1 + time.Duration(karesp.TTL/3)*time.Second)
nextKeepAlive := time.Now().Add((time.Duration(karesp.TTL) * time.Second) / 3.0)
ka.deadline = time.Now().Add(time.Duration(karesp.TTL) * time.Second)
for _, ch := range ka.chs {
select {
@ -453,32 +483,6 @@ func (l *lessor) sendKeepAliveLoop(stream pb.Lease_LeaseKeepAliveClient) {
}
}
func (l *lessor) getKeepAliveStream() pb.Lease_LeaseKeepAliveClient {
l.mu.Lock()
defer l.mu.Unlock()
return l.stream
}
func (l *lessor) newStream() error {
sctx, cancel := context.WithCancel(l.stopCtx)
stream, err := l.remote.LeaseKeepAlive(sctx, grpc.FailFast(false))
if err != nil {
cancel()
return toErr(sctx, err)
}
l.mu.Lock()
defer l.mu.Unlock()
if l.stream != nil && l.streamCancel != nil {
l.stream.CloseSend()
l.streamCancel()
}
l.streamCancel = cancel
l.stream = stream
return nil
}
func (ka *keepAlive) Close() {
close(ka.donec)
for _, ch := range ka.chs {

View File

@ -31,16 +31,16 @@ type GRPCResolver struct {
Client *etcd.Client
}
func (gr *GRPCResolver) Update(ctx context.Context, target string, nm naming.Update) (err error) {
func (gr *GRPCResolver) Update(ctx context.Context, target string, nm naming.Update, opts ...etcd.OpOption) (err error) {
switch nm.Op {
case naming.Add:
var v []byte
if v, err = json.Marshal(nm); err != nil {
return grpc.Errorf(codes.InvalidArgument, err.Error())
}
_, err = gr.Client.KV.Put(ctx, target+"/"+nm.Addr, string(v))
_, err = gr.Client.KV.Put(ctx, target+"/"+nm.Addr, string(v), opts...)
case naming.Delete:
_, err = gr.Client.Delete(ctx, target+"/"+nm.Addr)
_, err = gr.Client.Delete(ctx, target+"/"+nm.Addr, opts...)
default:
return grpc.Errorf(codes.InvalidArgument, "naming: bad naming op")
}

View File

@ -223,7 +223,7 @@ func WithSort(target SortTarget, order SortOrder) OpOption {
// If order != SortNone, server fetches the entire key-space,
// and then applies the sort and limit, if provided.
// Since current mvcc.Range implementation returns results
// sorted by keys in lexiographically ascending order,
// sorted by keys in lexicographically ascending order,
// client should ignore SortOrder if the target is SortByKey.
order = SortNone
}
@ -261,14 +261,15 @@ func WithPrefix() OpOption {
}
}
// WithRange specifies the range of 'Get' or 'Delete' requests.
// WithRange specifies the range of 'Get', 'Delete', 'Watch' requests.
// For example, 'Get' requests with 'WithRange(end)' returns
// the keys in the range [key, end).
// endKey must be lexicographically greater than start key.
func WithRange(endKey string) OpOption {
return func(op *Op) { op.end = []byte(endKey) }
}
// WithFromKey specifies the range of 'Get' or 'Delete' requests
// WithFromKey specifies the range of 'Get', 'Delete', 'Watch' requests
// to be equal or greater than the key in the argument.
func WithFromKey() OpOption { return WithRange("\x00") }

View File

@ -23,70 +23,109 @@ import (
)
type rpcFunc func(ctx context.Context) error
type retryRpcFunc func(context.Context, rpcFunc)
type retryRpcFunc func(context.Context, rpcFunc) error
func (c *Client) newRetryWrapper() retryRpcFunc {
return func(rpcCtx context.Context, f rpcFunc) {
return func(rpcCtx context.Context, f rpcFunc) error {
for {
err := f(rpcCtx)
if err == nil {
return
return nil
}
eErr := rpctypes.Error(err)
// always stop retry on etcd errors
if _, ok := eErr.(rpctypes.EtcdError); ok {
return err
}
// only retry if unavailable
if grpc.Code(err) != codes.Unavailable {
return
}
// always stop retry on etcd errors
eErr := rpctypes.Error(err)
if _, ok := eErr.(rpctypes.EtcdError); ok {
return
return err
}
select {
case <-c.balancer.ConnectNotify():
case <-rpcCtx.Done():
return rpcCtx.Err()
case <-c.ctx.Done():
return
return c.ctx.Err()
}
}
}
}
type retryKVClient struct {
pb.KVClient
retryf retryRpcFunc
func (c *Client) newAuthRetryWrapper() retryRpcFunc {
return func(rpcCtx context.Context, f rpcFunc) error {
for {
err := f(rpcCtx)
if err == nil {
return nil
}
// always stop retry on etcd errors other than invalid auth token
if rpctypes.Error(err) == rpctypes.ErrInvalidAuthToken {
gterr := c.getToken(rpcCtx)
if gterr != nil {
return err // return the original error for simplicity
}
continue
}
return err
}
}
}
// RetryKVClient implements a KVClient that uses the client's FailFast retry policy.
func RetryKVClient(c *Client) pb.KVClient {
return &retryKVClient{pb.NewKVClient(c.conn), c.retryWrapper}
retryWrite := &retryWriteKVClient{pb.NewKVClient(c.conn), c.retryWrapper}
return &retryKVClient{&retryWriteKVClient{retryWrite, c.retryAuthWrapper}}
}
func (rkv *retryKVClient) Put(ctx context.Context, in *pb.PutRequest, opts ...grpc.CallOption) (resp *pb.PutResponse, err error) {
rkv.retryf(ctx, func(rctx context.Context) error {
type retryKVClient struct {
*retryWriteKVClient
}
func (rkv *retryKVClient) Range(ctx context.Context, in *pb.RangeRequest, opts ...grpc.CallOption) (resp *pb.RangeResponse, err error) {
err = rkv.retryf(ctx, func(rctx context.Context) error {
resp, err = rkv.retryWriteKVClient.Range(rctx, in, opts...)
return err
})
return resp, err
}
type retryWriteKVClient struct {
pb.KVClient
retryf retryRpcFunc
}
func (rkv *retryWriteKVClient) Put(ctx context.Context, in *pb.PutRequest, opts ...grpc.CallOption) (resp *pb.PutResponse, err error) {
err = rkv.retryf(ctx, func(rctx context.Context) error {
resp, err = rkv.KVClient.Put(rctx, in, opts...)
return err
})
return resp, err
}
func (rkv *retryKVClient) DeleteRange(ctx context.Context, in *pb.DeleteRangeRequest, opts ...grpc.CallOption) (resp *pb.DeleteRangeResponse, err error) {
rkv.retryf(ctx, func(rctx context.Context) error {
func (rkv *retryWriteKVClient) DeleteRange(ctx context.Context, in *pb.DeleteRangeRequest, opts ...grpc.CallOption) (resp *pb.DeleteRangeResponse, err error) {
err = rkv.retryf(ctx, func(rctx context.Context) error {
resp, err = rkv.KVClient.DeleteRange(rctx, in, opts...)
return err
})
return resp, err
}
func (rkv *retryKVClient) Txn(ctx context.Context, in *pb.TxnRequest, opts ...grpc.CallOption) (resp *pb.TxnResponse, err error) {
rkv.retryf(ctx, func(rctx context.Context) error {
func (rkv *retryWriteKVClient) Txn(ctx context.Context, in *pb.TxnRequest, opts ...grpc.CallOption) (resp *pb.TxnResponse, err error) {
err = rkv.retryf(ctx, func(rctx context.Context) error {
resp, err = rkv.KVClient.Txn(rctx, in, opts...)
return err
})
return resp, err
}
func (rkv *retryKVClient) Compact(ctx context.Context, in *pb.CompactionRequest, opts ...grpc.CallOption) (resp *pb.CompactionResponse, err error) {
rkv.retryf(ctx, func(rctx context.Context) error {
func (rkv *retryWriteKVClient) Compact(ctx context.Context, in *pb.CompactionRequest, opts ...grpc.CallOption) (resp *pb.CompactionResponse, err error) {
err = rkv.retryf(ctx, func(rctx context.Context) error {
resp, err = rkv.KVClient.Compact(rctx, in, opts...)
return err
})
@ -100,11 +139,12 @@ type retryLeaseClient struct {
// RetryLeaseClient implements a LeaseClient that uses the client's FailFast retry policy.
func RetryLeaseClient(c *Client) pb.LeaseClient {
return &retryLeaseClient{pb.NewLeaseClient(c.conn), c.retryWrapper}
retry := &retryLeaseClient{pb.NewLeaseClient(c.conn), c.retryWrapper}
return &retryLeaseClient{retry, c.retryAuthWrapper}
}
func (rlc *retryLeaseClient) LeaseGrant(ctx context.Context, in *pb.LeaseGrantRequest, opts ...grpc.CallOption) (resp *pb.LeaseGrantResponse, err error) {
rlc.retryf(ctx, func(rctx context.Context) error {
err = rlc.retryf(ctx, func(rctx context.Context) error {
resp, err = rlc.LeaseClient.LeaseGrant(rctx, in, opts...)
return err
})
@ -113,7 +153,7 @@ func (rlc *retryLeaseClient) LeaseGrant(ctx context.Context, in *pb.LeaseGrantRe
}
func (rlc *retryLeaseClient) LeaseRevoke(ctx context.Context, in *pb.LeaseRevokeRequest, opts ...grpc.CallOption) (resp *pb.LeaseRevokeResponse, err error) {
rlc.retryf(ctx, func(rctx context.Context) error {
err = rlc.retryf(ctx, func(rctx context.Context) error {
resp, err = rlc.LeaseClient.LeaseRevoke(rctx, in, opts...)
return err
})
@ -131,7 +171,7 @@ func RetryClusterClient(c *Client) pb.ClusterClient {
}
func (rcc *retryClusterClient) MemberAdd(ctx context.Context, in *pb.MemberAddRequest, opts ...grpc.CallOption) (resp *pb.MemberAddResponse, err error) {
rcc.retryf(ctx, func(rctx context.Context) error {
err = rcc.retryf(ctx, func(rctx context.Context) error {
resp, err = rcc.ClusterClient.MemberAdd(rctx, in, opts...)
return err
})
@ -139,7 +179,7 @@ func (rcc *retryClusterClient) MemberAdd(ctx context.Context, in *pb.MemberAddRe
}
func (rcc *retryClusterClient) MemberRemove(ctx context.Context, in *pb.MemberRemoveRequest, opts ...grpc.CallOption) (resp *pb.MemberRemoveResponse, err error) {
rcc.retryf(ctx, func(rctx context.Context) error {
err = rcc.retryf(ctx, func(rctx context.Context) error {
resp, err = rcc.ClusterClient.MemberRemove(rctx, in, opts...)
return err
})
@ -147,7 +187,7 @@ func (rcc *retryClusterClient) MemberRemove(ctx context.Context, in *pb.MemberRe
}
func (rcc *retryClusterClient) MemberUpdate(ctx context.Context, in *pb.MemberUpdateRequest, opts ...grpc.CallOption) (resp *pb.MemberUpdateResponse, err error) {
rcc.retryf(ctx, func(rctx context.Context) error {
err = rcc.retryf(ctx, func(rctx context.Context) error {
resp, err = rcc.ClusterClient.MemberUpdate(rctx, in, opts...)
return err
})
@ -165,7 +205,7 @@ func RetryAuthClient(c *Client) pb.AuthClient {
}
func (rac *retryAuthClient) AuthEnable(ctx context.Context, in *pb.AuthEnableRequest, opts ...grpc.CallOption) (resp *pb.AuthEnableResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.AuthEnable(rctx, in, opts...)
return err
})
@ -173,7 +213,7 @@ func (rac *retryAuthClient) AuthEnable(ctx context.Context, in *pb.AuthEnableReq
}
func (rac *retryAuthClient) AuthDisable(ctx context.Context, in *pb.AuthDisableRequest, opts ...grpc.CallOption) (resp *pb.AuthDisableResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.AuthDisable(rctx, in, opts...)
return err
})
@ -181,7 +221,7 @@ func (rac *retryAuthClient) AuthDisable(ctx context.Context, in *pb.AuthDisableR
}
func (rac *retryAuthClient) UserAdd(ctx context.Context, in *pb.AuthUserAddRequest, opts ...grpc.CallOption) (resp *pb.AuthUserAddResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.UserAdd(rctx, in, opts...)
return err
})
@ -189,7 +229,7 @@ func (rac *retryAuthClient) UserAdd(ctx context.Context, in *pb.AuthUserAddReque
}
func (rac *retryAuthClient) UserDelete(ctx context.Context, in *pb.AuthUserDeleteRequest, opts ...grpc.CallOption) (resp *pb.AuthUserDeleteResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.UserDelete(rctx, in, opts...)
return err
})
@ -197,7 +237,7 @@ func (rac *retryAuthClient) UserDelete(ctx context.Context, in *pb.AuthUserDelet
}
func (rac *retryAuthClient) UserChangePassword(ctx context.Context, in *pb.AuthUserChangePasswordRequest, opts ...grpc.CallOption) (resp *pb.AuthUserChangePasswordResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.UserChangePassword(rctx, in, opts...)
return err
})
@ -205,7 +245,7 @@ func (rac *retryAuthClient) UserChangePassword(ctx context.Context, in *pb.AuthU
}
func (rac *retryAuthClient) UserGrantRole(ctx context.Context, in *pb.AuthUserGrantRoleRequest, opts ...grpc.CallOption) (resp *pb.AuthUserGrantRoleResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.UserGrantRole(rctx, in, opts...)
return err
})
@ -213,7 +253,7 @@ func (rac *retryAuthClient) UserGrantRole(ctx context.Context, in *pb.AuthUserGr
}
func (rac *retryAuthClient) UserRevokeRole(ctx context.Context, in *pb.AuthUserRevokeRoleRequest, opts ...grpc.CallOption) (resp *pb.AuthUserRevokeRoleResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.UserRevokeRole(rctx, in, opts...)
return err
})
@ -221,7 +261,7 @@ func (rac *retryAuthClient) UserRevokeRole(ctx context.Context, in *pb.AuthUserR
}
func (rac *retryAuthClient) RoleAdd(ctx context.Context, in *pb.AuthRoleAddRequest, opts ...grpc.CallOption) (resp *pb.AuthRoleAddResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.RoleAdd(rctx, in, opts...)
return err
})
@ -229,7 +269,7 @@ func (rac *retryAuthClient) RoleAdd(ctx context.Context, in *pb.AuthRoleAddReque
}
func (rac *retryAuthClient) RoleDelete(ctx context.Context, in *pb.AuthRoleDeleteRequest, opts ...grpc.CallOption) (resp *pb.AuthRoleDeleteResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.RoleDelete(rctx, in, opts...)
return err
})
@ -237,7 +277,7 @@ func (rac *retryAuthClient) RoleDelete(ctx context.Context, in *pb.AuthRoleDelet
}
func (rac *retryAuthClient) RoleGrantPermission(ctx context.Context, in *pb.AuthRoleGrantPermissionRequest, opts ...grpc.CallOption) (resp *pb.AuthRoleGrantPermissionResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.RoleGrantPermission(rctx, in, opts...)
return err
})
@ -245,7 +285,7 @@ func (rac *retryAuthClient) RoleGrantPermission(ctx context.Context, in *pb.Auth
}
func (rac *retryAuthClient) RoleRevokePermission(ctx context.Context, in *pb.AuthRoleRevokePermissionRequest, opts ...grpc.CallOption) (resp *pb.AuthRoleRevokePermissionResponse, err error) {
rac.retryf(ctx, func(rctx context.Context) error {
err = rac.retryf(ctx, func(rctx context.Context) error {
resp, err = rac.AuthClient.RoleRevokePermission(rctx, in, opts...)
return err
})

View File

@ -126,14 +126,14 @@ type watchGrpcStream struct {
reqc chan *watchRequest
// respc receives data from the watch client
respc chan *pb.WatchResponse
// stopc is sent to the main goroutine to stop all processing
stopc chan struct{}
// donec closes to broadcast shutdown
donec chan struct{}
// errc transmits errors from grpc Recv to the watch stream reconn logic
errc chan error
// closingc gets the watcherStream of closing watchers
closingc chan *watcherStream
// wg is Done when all substream goroutines have exited
wg sync.WaitGroup
// resumec closes to signal that all substreams should begin resuming
resumec chan struct{}
@ -213,7 +213,6 @@ func (w *watcher) newWatcherGrpcStream(inctx context.Context) *watchGrpcStream {
respc: make(chan *pb.WatchResponse),
reqc: make(chan *watchRequest),
stopc: make(chan struct{}),
donec: make(chan struct{}),
errc: make(chan error, 1),
closingc: make(chan *watcherStream),
@ -319,7 +318,7 @@ func (w *watcher) Close() (err error) {
}
func (w *watchGrpcStream) Close() (err error) {
close(w.stopc)
w.cancel()
<-w.donec
select {
case err = <-w.errc:
@ -366,7 +365,7 @@ func (w *watchGrpcStream) closeSubstream(ws *watcherStream) {
// close subscriber's channel
if closeErr := w.closeErr; closeErr != nil && ws.initReq.ctx.Err() == nil {
go w.sendCloseSubstream(ws, &WatchResponse{closeErr: w.closeErr})
} else {
} else if ws.outc != nil {
close(ws.outc)
}
if ws.id != -1 {
@ -396,18 +395,20 @@ func (w *watchGrpcStream) run() {
for _, ws := range w.substreams {
if _, ok := closing[ws]; !ok {
close(ws.recvc)
closing[ws] = struct{}{}
}
}
for _, ws := range w.resuming {
if _, ok := closing[ws]; ws != nil && !ok {
close(ws.recvc)
closing[ws] = struct{}{}
}
}
w.joinSubstreams()
for toClose := len(w.substreams) + len(w.resuming); toClose > 0; toClose-- {
for range closing {
w.closeSubstream(<-w.closingc)
}
w.wg.Wait()
w.owner.closeStream(w)
}()
@ -432,6 +433,7 @@ func (w *watchGrpcStream) run() {
}
ws.donec = make(chan struct{})
w.wg.Add(1)
go w.serveSubstream(ws, w.resumec)
// queue up for watcher creation/resume
@ -491,7 +493,7 @@ func (w *watchGrpcStream) run() {
wc.Send(ws.initReq.toPB())
}
cancelSet = make(map[int64]struct{})
case <-w.stopc:
case <-w.ctx.Done():
return
case ws := <-w.closingc:
w.closeSubstream(ws)
@ -577,6 +579,7 @@ func (w *watchGrpcStream) serveSubstream(ws *watcherStream, resumec chan struct{
if !resuming {
w.closingc <- ws
}
w.wg.Done()
}()
emptyWr := &WatchResponse{}
@ -584,17 +587,6 @@ func (w *watchGrpcStream) serveSubstream(ws *watcherStream, resumec chan struct{
curWr := emptyWr
outc := ws.outc
if len(ws.buf) > 0 && ws.buf[0].Created {
select {
case ws.initReq.retc <- ws.outc:
// send first creation event and only if requested
if !ws.initReq.createdNotify {
ws.buf = ws.buf[1:]
}
default:
}
}
if len(ws.buf) > 0 {
curWr = ws.buf[0]
} else {
@ -612,13 +604,51 @@ func (w *watchGrpcStream) serveSubstream(ws *watcherStream, resumec chan struct{
// shutdown from closeSubstream
return
}
// TODO pause channel if buffer gets too large
ws.buf = append(ws.buf, wr)
nextRev = wr.Header.Revision
if wr.Created {
if ws.initReq.retc != nil {
ws.initReq.retc <- ws.outc
// to prevent next write from taking the slot in buffered channel
// and posting duplicate create events
ws.initReq.retc = nil
// send first creation event only if requested
if ws.initReq.createdNotify {
ws.outc <- *wr
}
// once the watch channel is returned, a current revision
// watch must resume at the store revision. This is necessary
// for the following case to work as expected:
// wch := m1.Watch("a")
// m2.Put("a", "b")
// <-wch
// If the revision is only bound on the first observed event,
// if wch is disconnected before the Put is issued, then reconnects
// after it is committed, it'll miss the Put.
if ws.initReq.rev == 0 {
nextRev = wr.Header.Revision
}
}
} else {
// current progress of watch; <= store revision
nextRev = wr.Header.Revision
}
if len(wr.Events) > 0 {
nextRev = wr.Events[len(wr.Events)-1].Kv.ModRevision + 1
}
ws.initReq.rev = nextRev
// created event is already sent above,
// watcher should not post duplicate events
if wr.Created {
continue
}
// TODO pause channel if buffer gets too large
ws.buf = append(ws.buf, wr)
case <-w.ctx.Done():
return
case <-ws.initReq.ctx.Done():
return
case <-resumec:
@ -630,34 +660,83 @@ func (w *watchGrpcStream) serveSubstream(ws *watcherStream, resumec chan struct{
}
func (w *watchGrpcStream) newWatchClient() (pb.Watch_WatchClient, error) {
// connect to grpc stream
// mark all substreams as resuming
close(w.resumec)
w.resumec = make(chan struct{})
w.joinSubstreams()
for _, ws := range w.substreams {
ws.id = -1
w.resuming = append(w.resuming, ws)
}
// strip out nils, if any
var resuming []*watcherStream
for _, ws := range w.resuming {
if ws != nil {
resuming = append(resuming, ws)
}
}
w.resuming = resuming
w.substreams = make(map[int64]*watcherStream)
// connect to grpc stream while accepting watcher cancelation
stopc := make(chan struct{})
donec := w.waitCancelSubstreams(stopc)
wc, err := w.openWatchClient()
close(stopc)
<-donec
// serve all non-closing streams, even if there's a client error
// so that the teardown path can shutdown the streams as expected.
for _, ws := range w.resuming {
if ws.closing {
continue
}
ws.donec = make(chan struct{})
w.wg.Add(1)
go w.serveSubstream(ws, w.resumec)
}
if err != nil {
return nil, v3rpc.Error(err)
}
// mark all substreams as resuming
if len(w.substreams)+len(w.resuming) > 0 {
close(w.resumec)
w.resumec = make(chan struct{})
w.joinSubstreams()
for _, ws := range w.substreams {
ws.id = -1
w.resuming = append(w.resuming, ws)
}
for _, ws := range w.resuming {
if ws == nil || ws.closing {
continue
}
ws.donec = make(chan struct{})
go w.serveSubstream(ws, w.resumec)
}
}
w.substreams = make(map[int64]*watcherStream)
// receive data from new grpc stream
go w.serveWatchClient(wc)
return wc, nil
}
func (w *watchGrpcStream) waitCancelSubstreams(stopc <-chan struct{}) <-chan struct{} {
var wg sync.WaitGroup
wg.Add(len(w.resuming))
donec := make(chan struct{})
for i := range w.resuming {
go func(ws *watcherStream) {
defer wg.Done()
if ws.closing {
if ws.initReq.ctx.Err() != nil && ws.outc != nil {
close(ws.outc)
ws.outc = nil
}
return
}
select {
case <-ws.initReq.ctx.Done():
// closed ws will be removed from resuming
ws.closing = true
close(ws.outc)
ws.outc = nil
go func() { w.closingc <- ws }()
case <-stopc:
}
}(w.resuming[i])
}
go func() {
defer close(donec)
wg.Wait()
}()
return donec
}
// joinSubstream waits for all substream goroutines to complete
func (w *watchGrpcStream) joinSubstreams() {
for _, ws := range w.substreams {
@ -674,9 +753,9 @@ func (w *watchGrpcStream) joinSubstreams() {
func (w *watchGrpcStream) openWatchClient() (ws pb.Watch_WatchClient, err error) {
for {
select {
case <-w.stopc:
case <-w.ctx.Done():
if err == nil {
return nil, context.Canceled
return nil, w.ctx.Err()
}
return nil, err
default:

View File

@ -1,27 +0,0 @@
Copyright (c) 2009-2011 Andreas Krennmair. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Andreas Krennmair nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Some files were not shown because too many files have changed in this diff Show More