Compare commits

..

185 Commits

Author SHA1 Message Date
e475a4ea71 Merge pull request #8120 from heyitsanthony/restore-set-size-metric
mvcc: set db size metric on restore
2017-06-16 12:37:08 -07:00
7f149d8fb6 mvcc: set db size metric on restore
Fixes #8080
2017-06-16 11:27:34 -07:00
a825709940 integration: test mvcc db size metric is set on restore 2017-06-16 11:27:07 -07:00
1acc8090e3 Merge pull request #8110 from heyitsanthony/fix-test-sync-timeout
etcdserver: use RecorderStream for TestSyncTimeout to avoid missing action
2017-06-15 20:49:10 -07:00
e962b0c849 Merge pull request #7909 from heyitsanthony/unptr-cfg
etcdserver, embed, integration: don't use pointer for ServerConfig
2017-06-15 20:47:30 -07:00
44a6c2121b Merge pull request #7999 from hexfusion/grpc-gateway-auth
auth: support "authorization" token for grpc-gateway
2017-06-15 19:22:00 -07:00
8fa96cb303 Merge pull request #8113 from heyitsanthony/code-of-conduct
*: add code of conduct
2017-06-15 19:18:24 -07:00
42584f84b4 *: add code of conduct
github community insights complains there isn't one
2017-06-15 17:04:45 -07:00
03ab4d9cc5 Merge pull request #8108 from radhikapc/building-qa
etcd/Documentation/dl_build.md: removed an extra step for testing etcd
2017-06-15 16:48:50 -07:00
5fedaf2dd7 Merge pull request #7896 from gyuho/metadata-grpc
*: gRPC v1.4.1, gateway v1.2.2, metadata Incoming/OutgoingContext
2017-06-15 16:42:55 -07:00
5e059fd8dc *: use metadata Incoming/OutgoingContext
Fix https://github.com/coreos/etcd/issues/7888.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-15 16:41:23 -07:00
0d0c0f3959 bill-of-materials: add google.golang.org/genproto
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-15 16:41:23 -07:00
5fe58228b4 vendor: update grpc-go v1.4.1, grpc-gateway v1.2.2
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-15 16:41:19 -07:00
b9a53db0c2 Merge pull request #8101 from gyuho/randomize-renew
lease: randomize expiry on initial refresh call
2017-06-15 16:29:47 -07:00
639687bb89 Merge pull request #8112 from gyuho/speakeasy-dep
vendor: use tagged release 'bgentry/speakeasy'
2017-06-15 16:10:21 -07:00
15b86d064d vendor: use tagged release 'bgentry/speakeasy'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-15 16:09:46 -07:00
b6b56160cd Merge pull request #8111 from heyitsanthony/version-probing
vendor: update glide.yaml to use probing 0.0.1
2017-06-15 16:02:09 -07:00
703893f334 Merge pull request #8109 from heyitsanthony/faq-initial-peers
Documentation: update FAQ with entry about changing peer advertising
2017-06-15 16:01:19 -07:00
099952136a Merge pull request #8107 from heyitsanthony/lock-faster
concurrency: fetch current lock holder when creating waitlist key
2017-06-15 15:12:08 -07:00
52afc03d68 Documentation: removed an extra step for testing etcd
removed an extra step for testing the etcd build that might confuse the user of the flow; minimal editing to the doc
2017-06-15 14:39:10 -07:00
6e74c335e2 vendor: update glide.yaml to use probing 0.0.1
Also ignores appengine import from the grpc-gateway examples which
were causing glide errors on x/crypto when fetching imports.
2017-06-15 14:22:20 -07:00
aa0e6b26c0 etcdserver: use RecorderStream for TestSyncTimeout to avoid missing action 2017-06-15 13:43:53 -07:00
44422f3898 Documentation: update FAQ with entry about changing peer advertising
Been seeing this somewhat frequently.
2017-06-15 13:31:25 -07:00
dcf52bbfac etcdserver, embed, integration: don't use pointer for ServerConfig
ServerConfig is owned by etcdserver and unshared, so don't pass or store by
pointer. Also removes the duplicated field 'snapCount'.
2017-06-15 13:02:13 -07:00
95bc33f37f integration: remove lease exist checking on randomized expiry
A lease with TTL 5 is renewed with randomization,
so it may still exist after 3 seconds.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-15 12:57:55 -07:00
5bba05703c lease: randomize expiry on initial refresh call
Randomize the very first expiry on lease recovery
to prevent recovered leases from expiring all at
the same time.

Address https://github.com/coreos/etcd/issues/8096.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-15 12:57:49 -07:00
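The idea in this commit, as a minimal standalone Go sketch (the `lease` type, the 10% jitter ratio, and `refreshOnRecovery` are illustrative assumptions, not the etcd implementation):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// lease is an illustrative stand-in for a recovered lease record.
type lease struct {
	ID     int64
	TTL    time.Duration
	expiry time.Time
}

// refreshOnRecovery sets the first expiry after a restart. Adding a random
// fraction of the TTL spreads out expirations so recovered leases are not
// all revoked in the same revoke-loop iteration.
func (l *lease) refreshOnRecovery(now time.Time) {
	jitter := time.Duration(rand.Int63n(int64(l.TTL) / 10)) // up to 10% extra; assumed ratio
	l.expiry = now.Add(l.TTL + jitter)
}

func main() {
	now := time.Now()
	for i := int64(1); i <= 3; i++ {
		l := &lease{ID: i, TTL: 5 * time.Second}
		l.refreshOnRecovery(now)
		fmt.Printf("lease %d expires at %v\n", l.ID, l.expiry)
	}
}
```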
037e33e833 Merge pull request #8093 from gyuho/grafana
Documentation/op-guide: fix failed RPC rate, leader election metrics
2017-06-15 11:59:03 -07:00
1748fe3eda Documentation/op-guide: fix failed RPC rate, leader election metrics
This fixes the failed RPC rate query, where we do not need
subtraction because we already query by the status code.
Also adds grpc_method to make it more specific. Most of the
time, the failure recovers within 10 seconds, which is our
Prometheus scrape interval, so the 'rate' query might not cover
that time window, showing as 0s, but it still shows up in the graph.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-15 11:54:18 -07:00
f5b96991a1 concurrency: fetch current lock holder when creating waitlist key
The uncontended path for a mutex would fetch the minimum
revision key on the prefix after creating its entry in
the wait list. This fetch can be rolled into the txn for
creating the wait key, eliminating a round-trip for immediately
acquiring the lock.
2017-06-15 11:29:34 -07:00
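A sketch of the pattern with the clientv3 API of that era; `lockPrefix` and `leaseID` are caller-supplied placeholders, and this is not the exact concurrency package code:

```go
package lockexample

import (
	"context"
	"fmt"

	"github.com/coreos/etcd/clientv3"
)

// lockKeyAndOwner enqueues this session in the wait list and, in the same
// transaction, fetches the first-created key under the prefix (the current
// lock holder), so the uncontended path needs no extra round trip.
func lockKeyAndOwner(ctx context.Context, cli *clientv3.Client, lockPrefix string, leaseID clientv3.LeaseID) (*clientv3.TxnResponse, error) {
	myKey := fmt.Sprintf("%s/%x", lockPrefix, leaseID)
	cmp := clientv3.Compare(clientv3.CreateRevision(myKey), "=", 0)
	put := clientv3.OpPut(myKey, "", clientv3.WithLease(leaseID))      // create our wait-list entry
	get := clientv3.OpGet(myKey)                                       // entry already exists: read it back
	owner := clientv3.OpGet(lockPrefix, clientv3.WithFirstCreate()...) // current holder, fetched in the same txn
	return cli.Txn(ctx).If(cmp).Then(put, owner).Else(get, owner).Commit()
}
```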
1f206c027a Merge pull request #8106 from heyitsanthony/clarify-watch-comment
clientv3: clarify Watch close conditions
2017-06-15 10:56:25 -07:00
3a37b68cda Merge pull request #8105 from nkovacs/its
Documentation: grammar fixes, it's -> its
2017-06-15 10:46:20 -07:00
c27634c215 e2e: test auth over grpc json 2017-06-15 13:41:47 -04:00
e5aa938fec scripts: generate swagger with authorization support 2017-06-15 13:41:43 -04:00
13d9438cf9 clientv3: clarify Watch close conditions
The "too slow" comment is rather vague. If the server closes
the watch for being too slow (it doesn't seem to any more), the
watch client should gracefully resume instead of forcing the
user to handle it.

Also removed the 'opts' comment since it wasn't being maintained.
2017-06-15 09:34:00 -07:00
66687da3ba *: grammar fixes, it's -> its 2017-06-15 18:23:16 +02:00
0caab26310 auth: support "authorization" token for grpc-gateway 2017-06-14 20:11:39 -04:00
ee0c805de2 Merge pull request #8099 from gyuho/rate-limit-lease-expiration
lease: rate limit revoke runLoop
2017-06-14 15:39:58 -07:00
0011b78bd5 lease: rate limit revoke runLoop
Fix https://github.com/coreos/etcd/issues/8097.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-14 14:22:16 -07:00
e6d26675e6 Merge pull request #8090 from glevand/for-merge-aci
build-aci: Fix ACI image name
2017-06-14 08:41:58 -07:00
e402606f02 build-aci: Fix ACI image name
The appc discovery spec states that the architecture specifier in the ACI
image file name will be an ACI architecture value.  Our build scripts were
using GOARCH in the image name, which is incorrect for arm64/aarch64.
See: https://github.com/appc/spec/blob/master/spec/discovery.md

Fixes errors like these on arm64 machines:

  $ rkt --debug --insecure-options=image fetch coreos.com/etcd:v3.2.0-rc.1
  image: remote fetching from URL "https://github.com/coreos/etcd/releases/download/v3.2.0-rc.1/etcd-v3.2.0-rc.1-linux-aarch64.aci"
  fetch: bad HTTP status code: 404

Signed-off-by: Geoff Levand <geoff@infradead.org>
2017-06-13 13:09:02 -07:00
750dc7f157 Merge pull request #8088 from jbowens/snap-example
contrib/raftexample: save snapshot to WAL first
2017-06-13 12:44:13 -07:00
74e020b715 contrib/raftexample: save snapshot to WAL first
Save the snapshot index to the WAL before saving the snapshot to the
filesystem. This ensures that we'll only ever call wal.Open with a
snapshot that was previously saved to the WAL.
2017-06-13 11:24:07 -07:00
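Roughly what that ordering looks like, assuming the wal/snap packages of that era; error handling and surrounding code are simplified:

```go
package raftexample

import (
	"github.com/coreos/etcd/raft/raftpb"
	"github.com/coreos/etcd/snap"
	"github.com/coreos/etcd/wal"
	"github.com/coreos/etcd/wal/walpb"
)

type raftNode struct {
	wal         *wal.WAL
	snapshotter *snap.Snapshotter
}

// saveSnap records the snapshot index in the WAL before writing the snapshot
// file, preserving the invariant that wal.Open is only ever called with a
// snapshot the WAL already knows about.
func (rc *raftNode) saveSnap(s raftpb.Snapshot) error {
	walSnap := walpb.Snapshot{Index: s.Metadata.Index, Term: s.Metadata.Term}
	if err := rc.wal.SaveSnapshot(walSnap); err != nil {
		return err
	}
	if err := rc.snapshotter.SaveSnap(s); err != nil {
		return err
	}
	// WAL files up to the snapshot index are no longer needed
	return rc.wal.ReleaseLockTo(s.Metadata.Index)
}
```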
3993f37a26 Merge pull request #8081 from WIZARD-CXY/master
Documentation: alert.rules. fix labels bug
2017-06-13 10:56:04 -07:00
e006e2dbcb Merge pull request #8087 from gyuho/bom
bill-of-materials: regenerate with multi licenses
2017-06-13 10:46:06 -07:00
a7c33d48de bill-of-materials: regenerate with multi licenses
Fix https://github.com/coreos/etcd/issues/8086.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-13 10:25:29 -07:00
4445996a38 Merge pull request #8084 from heyitsanthony/update-protobuf
vendor: update github.com/{gogo,golang}/protobuf
2017-06-12 19:09:49 -07:00
5ae04259c4 Documentation: alert.rules. fix labels bug 2017-06-13 09:33:13 +08:00
b7741c6ecf Merge pull request #8083 from heyitsanthony/initial-cluster-warning
etcdserver: better warning when initial-cluster doesn't match advertise urls
2017-06-12 15:15:08 -07:00
4ebeba0e18 *: regen protofiles with latest protobuf tools 2017-06-12 15:14:43 -07:00
2afd0a726f vendor: update github.com/gogo/protobuf and github.com/golang/protobuf 2017-06-12 14:26:15 -07:00
7ff5b05004 etcdserver: better warning when initial-cluster doesn't match advertise urls
The old error was not clear about what URLs needed to be added, sometimes
truncating the list. To make it clearer, print out the missing entries
for --initial-cluster and print the full list of initial advertise peers.

Fixes #8079 and #7927
2017-06-12 14:14:16 -07:00
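A sketch of the set difference the improved warning can print, using plain URL strings rather than the server's parsed URLs:

```go
package main

import "fmt"

// missingFromInitialCluster returns the advertised peer URLs that
// --initial-cluster does not contain, so the warning can list exactly
// what needs to be added.
func missingFromInitialCluster(initialCluster, advertisePeers []string) []string {
	have := make(map[string]struct{}, len(initialCluster))
	for _, u := range initialCluster {
		have[u] = struct{}{}
	}
	var missing []string
	for _, u := range advertisePeers {
		if _, ok := have[u]; !ok {
			missing = append(missing, u)
		}
	}
	return missing
}

func main() {
	ic := []string{"http://10.0.0.1:2380"}
	ap := []string{"http://10.0.0.1:2380", "http://10.0.0.2:2380"}
	fmt.Println(missingFromInitialCluster(ic, ap)) // [http://10.0.0.2:2380]
}
```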
933aa09b73 Merge pull request #8070 from heyitsanthony/etcdctl-cluster-health
ctlv2: report unhealthy in cluster-health if any node is unavailable
2017-06-09 14:57:03 -07:00
3fcb8336aa e2e: update cluster-health test for new etcdctl output 2017-06-09 13:55:16 -07:00
b194276289 Merge pull request #8075 from gyuho/upgrade-doc
Documentation/upgrades: link to previous guides
2017-06-09 13:02:57 -07:00
7f3127441b Documentation/upgrades: link to previous guides
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-09 12:33:39 -07:00
3a6180d490 Merge pull request #8072 from heyitsanthony/auth-proxy-test
integration: test auth layer in grpc proxy tests
2017-06-09 11:32:27 -07:00
bdddbcc414 Merge pull request #8074 from heyitsanthony/no-limit-snapshot
rafthttp: permit very large v2 snapshots
2017-06-09 11:12:25 -07:00
84e6aaff66 Merge pull request #7995 from gyuho/NEWS
NEWS: add v3.2.0
2017-06-09 11:08:29 -07:00
d5b917daad Merge pull request #8069 from heyitsanthony/fix-watch-bench
benchmark: refactor watch benchmark
2017-06-09 11:04:20 -07:00
ad0b3cfdab ctlv2: report unhealthy in cluster-health if any node is unavailable
Fixes #8061 and #7032
2017-06-09 10:57:17 -07:00
9543431aeb rafthttp: permit very large v2 snapshots
v2 snapshots were hitting the 512MB message decode limit, causing
snapshot sends to new members to fail for being too big.
2017-06-09 10:41:27 -07:00
d6750158fb NEWS: add v3.2.0
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-09 10:37:27 -07:00
56841bbc5f Merge pull request #8071 from heyitsanthony/txn-rev
etcdserver: use same ReadView for read-only txns
2017-06-09 09:43:18 -07:00
798119ed6f integration: test auth layer in grpcproxy tests 2017-06-09 09:36:16 -07:00
5bb0a091fc adapter: auth server to client adapter 2017-06-09 09:36:16 -07:00
d173b09a1b etcdserver: use same ReadView for read-only txns
A read-only txn isn't serialized by raft, but it uses a fresh
read txn for every mvcc access prior to executing its request ops.
If a write txn modifies the keys matching the read txn's comparisons,
the read txn may return inconsistent results.

To fix, use the same read-only mvcc txn for the duration of the etcd
txn. Probably gets a modest txn speedup as well since there are
fewer read txn allocations.
2017-06-09 09:20:38 -07:00
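A conceptual sketch with stand-in types (not the mvcc API): every comparison and read in the txn is evaluated against one view taken at the start, so a concurrent write cannot make them disagree:

```go
package main

import "fmt"

// store is a trivial key-value map standing in for the mvcc store.
type store struct{ data map[string]string }

// readView is a consistent snapshot of the store taken at one point in time.
type readView struct{ data map[string]string }

func (s *store) read() *readView {
	snap := make(map[string]string, len(s.data))
	for k, v := range s.data {
		snap[k] = v
	}
	return &readView{data: snap}
}

// readOnlyTxn evaluates its comparison and its read against the SAME view.
func readOnlyTxn(s *store, cmpKey, cmpVal, getKey string) (ok bool, val string) {
	rv := s.read() // one view for the whole txn
	return rv.data[cmpKey] == cmpVal, rv.data[getKey]
}

func main() {
	s := &store{data: map[string]string{"a": "1", "b": "2"}}
	ok, v := readOnlyTxn(s, "a", "1", "b")
	fmt.Println(ok, v) // true 2
}
```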
da48f1feaf mvcc: create TxnWrites from TxnRead with NewReadOnlyTxnWrite
Already used internally by mvcc, but needed by etcdserver txns.
2017-06-09 09:20:38 -07:00
ad22aaa354 integration: test txn comparison and concurrent put ordering 2017-06-09 09:20:38 -07:00
3b460506d9 Merge pull request #8067 from gyuho/docker-doc
Documentation/op-guide: do not use host network, fix indentation
2017-06-09 09:14:00 -07:00
56db7e56f9 benchmark: refactor watch benchmark 2017-06-08 21:14:08 -07:00
a8c073c51e Merge pull request #8066 from fanminshi/keepAlive_Close_to_close
clientv3: change Close() to close() for keepAlive and watchGrpcStream
2017-06-08 14:59:24 -07:00
762b2c625c clientv3: change watchGrpcStream Close() to close()
A private struct shouldn't have a public method.
2017-06-08 12:11:06 -07:00
74a2b2e873 Documentation/op-guide: do not use host network, fix indentation
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-08 12:09:12 -07:00
2caae60004 Merge pull request #8062 from heyitsanthony/revert-v2machines
v2http: put back /v2/machines and mark as non-deprecated
2017-06-08 12:01:58 -07:00
4dff7aaa2a clientv3: change keepAlive Close() to close()
keepAlive is a private struct that belongs to the clientv3 package and shouldn't expose a public Close() method.
2017-06-08 11:53:59 -07:00
9ffdb3a59e Merge pull request #8064 from gyuho/lease-expiration-metrics
etcdserver: add leaseExpired metrics
2017-06-08 11:13:52 -07:00
300feea177 Merge pull request #8052 from heyitsanthony/watch-victim-test
mvcc: test watch victim/delay path
2017-06-08 11:10:33 -07:00
45fd8279f0 etcdserver: add leaseExpired debugging metrics
Fix https://github.com/coreos/etcd/issues/8050.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-08 10:36:25 -07:00
d335821c51 Merge pull request #8063 from gyuho/met
Documentation/op-guide: fix 'grpc_code' field in metrics
2017-06-08 10:15:42 -07:00
c6330d86f1 Documentation/op-guide: fix 'grpc_code' field in metrics
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-08 09:43:30 -07:00
c2dadbd9f8 v2http: put back /v2/machines and mark as non-deprecated
This reverts commit 2bb33181b6. python-etcd
seems to depend on /v2/machines and the maintainer vanished. Plus, it is
prefixed with /v2/ so it probably can't be deprecated anyway.
2017-06-08 09:39:11 -07:00
eb3622942b Merge pull request #8055 from gyuho/aaa
Documentation/op-guide: fix markdown highlight syntax
2017-06-08 07:33:04 -07:00
fa4903c83c Merge pull request #8031 from mitake/lease-revoke-auth
protecting lease revoking with auth
2017-06-08 13:34:14 +09:00
aaa9e1735a Documentation/op-guide: fix markdown highlight syntax
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-07 20:36:22 -07:00
3df9352c00 Merge pull request #8054 from heyitsanthony/txn-metric
mvcc: count range/put/del operations for txns
2017-06-07 19:19:32 -07:00
8f8f79db56 Merge pull request #8053 from heyitsanthony/jwt-test
auth: JWT tests
2017-06-07 19:15:18 -07:00
7b68318284 integration: add test cases for lease revoking with auth 2017-06-07 17:46:14 -07:00
0c655902f2 auth, etcdserver: protect revoking lease with auth
Currently clients can revoke any lease without permission. This commit
lets etcdserver protect revoking with write permission.

This commit also adds a mechanism for generating an internal token. It is
used to indicate that a LeaseRevoke was issued internally, so it should be
able to delete any attached keys.
2017-06-07 17:46:14 -07:00
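A sketch of the permission check with illustrative stand-in types rather than the etcdserver/auth API; the internal token string is an assumption:

```go
package main

import (
	"errors"
	"fmt"
)

const internalToken = "internal" // assumed marker for internally issued requests

type authStore struct {
	// writable maps user -> set of keys the user may write.
	writable map[string]map[string]bool
}

func (as *authStore) canWrite(user, key string) bool {
	return as.writable[user][key]
}

// checkLeaseRevoke returns nil if the request may delete all attached keys.
func checkLeaseRevoke(as *authStore, user, token string, attachedKeys []string) error {
	if token == internalToken {
		return nil // issued by the server itself (e.g. lease expiry), always allowed
	}
	for _, k := range attachedKeys {
		if !as.canWrite(user, k) {
			return errors.New("permission denied")
		}
	}
	return nil
}

func main() {
	as := &authStore{writable: map[string]map[string]bool{"alice": {"foo": true}}}
	fmt.Println(checkLeaseRevoke(as, "alice", "", []string{"foo"}))        // <nil>
	fmt.Println(checkLeaseRevoke(as, "alice", "", []string{"foo", "bar"})) // permission denied
}
```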
83b2ea2f60 mvcc: test watch victim/delay path
Current tests don't normally trigger the watch victim path because the
constants are too large; set the constants to small values and hammer
the store to cause watch delivery delays.
2017-06-07 17:02:00 -07:00
0352ce79b8 mvcc: count range/put/del operations for txns
Txns were previously only bumping the txn counter; now they bump all
operation counters.
2017-06-07 16:53:50 -07:00
8d8d1d225a auth: add JWT tests 2017-06-07 16:49:02 -07:00
fe727f3106 auth: reject empty signing method for JWT token provider 2017-06-07 16:49:02 -07:00
a36d62a30c Merge pull request #8049 from heyitsanthony/flock-base-test
fileutil: test some fallback functionality
2017-06-07 16:12:38 -07:00
29911195de Merge pull request #8046 from heyitsanthony/fix-falloc-0
fileutil: return immediately if preallocating 0 bytes
2017-06-07 11:55:27 -07:00
c3fcf0f339 fileutil: test some fallback functionality
syscall.Flock fallback and preallocExtendTrunc
2017-06-07 11:22:40 -07:00
09abea5784 Merge pull request #8047 from heyitsanthony/extra-cov
mvcc, v3rpc: minor coverage improvements
2017-06-07 10:50:30 -07:00
87a3c87e45 fileutil: return immediately if preallocating 0 bytes
fallocate will return EINVAL, causing zeroing to the end of a
0 byte file to fail.

Fixes #8045
2017-06-07 09:57:14 -07:00
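A simplified sketch of the guard; the real fileutil has platform-specific fallocate and truncate fallbacks that are elided here:

```go
package fileutil

import "os"

// Preallocate extends or allocates space for f. A zero-byte request has
// nothing to do, and passing length 0 to fallocate would fail with EINVAL,
// so return before reaching any syscall.
func Preallocate(f *os.File, sizeInBytes int64, extendFile bool) error {
	if sizeInBytes == 0 {
		// fallocate is EINVAL for a zero-length request; nothing to zero out.
		return nil
	}
	if extendFile {
		// simplified stand-in for the platform-specific extend path
		return f.Truncate(sizeInBytes)
	}
	// simplified stand-in for the allocate-without-extend path
	return nil
}
```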
fb086ef13f v3rpc: dedup resp.Header == nil checks 2017-06-07 09:25:42 -07:00
fd71da47d1 mvcc: remove unused store.Equals function 2017-06-07 09:25:42 -07:00
4c5f9e0910 Merge pull request #8043 from heyitsanthony/grpc-error
v3rpc: use map for translating errors to grpc errors
2017-06-07 09:13:17 -07:00
e12c7f6dd4 Merge pull request #8042 from heyitsanthony/auth-tests
e2e: add role get and role list e2e tests
2017-06-06 21:51:41 -07:00
8542f2e673 v3rpc: use map for translating errors to grpc errors
Switch statement had poor coverage, use a map instead
2017-06-06 16:55:44 -07:00
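A self-contained sketch of map-based error translation; the error values are placeholders, not the real etcdserver/v3rpc errors:

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-in internal and gRPC-facing errors; the real code maps etcdserver
// errors to gRPC status errors with codes such as Unavailable or NotFound.
var (
	errInternalNoLeader = errors.New("etcdserver: no leader")
	errGRPCNoLeader     = errors.New("rpc error: code = Unavailable desc = etcdserver: no leader")
)

// toGRPC is the lookup table; every entry is trivially covered by iterating
// over the map in a test, unlike a long switch statement.
var toGRPC = map[error]error{
	errInternalNoLeader: errGRPCNoLeader,
}

// togRPCError translates a known server error, passing unknown errors through.
func togRPCError(err error) error {
	if g, ok := toGRPC[err]; ok {
		return g
	}
	return err
}

func main() {
	fmt.Println(togRPCError(errInternalNoLeader))
}
```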
d83d7e8262 Merge pull request #8041 from heyitsanthony/fix-test-split
test: fix package splitting when appending REPO_PATH to tests
2017-06-06 16:39:41 -07:00
d8935903a2 e2e: add role get and role list e2e tests
Wasn't being covered
2017-06-06 16:21:00 -07:00
9a367a39d0 test: fix package splitting when appending REPO_PATH to tests 2017-06-06 15:20:39 -07:00
7350525937 Merge pull request #8039 from heyitsanthony/client-example-sort
client: sort nodes in example
2017-06-06 12:29:12 -07:00
0989780a77 Merge pull request #8038 from heyitsanthony/txn-alloc
mvcc: don't use pointer for storeTxnRead in storeTxnWrite
2017-06-06 11:31:42 -07:00
1711fdba32 client: sort nodes in example 2017-06-06 10:56:24 -07:00
f5a5abf8ad Merge pull request #8029 from heyitsanthony/shellcheck
test: shellcheck
2017-06-06 10:35:19 -07:00
402fa8a827 Merge pull request #8034 from heyitsanthony/client-examples
client: add golang examples for KeysAPI
2017-06-06 10:06:40 -07:00
ef63abdf7f mvcc: don't use pointer for storeTxnRead in storeTxnWrite
Saves an allocation when creating a storeTxnWrite.
2017-06-06 09:51:57 -07:00
85f433232a *: clear rarer shellcheck errors on scripts
Clean up the tail of the warnings
2017-06-06 09:36:25 -07:00
17ad275124 travis: add shellcheck 2017-06-06 09:36:25 -07:00
42104fd44b test: shellcheck 2017-06-06 09:36:25 -07:00
2332afe877 Merge pull request #8037 from kragniz/patch-2
doc: python-etcd3 is pretty stable now
2017-06-06 07:49:32 -07:00
e3ff4bf095 doc: python-etcd3 is pretty stable now 2017-06-06 15:45:38 +01:00
1561eb612c client: add golang examples for KeysAPI 2017-06-05 23:05:17 -07:00
8fbf7ce744 Merge pull request #8035 from heyitsanthony/fix-e2e-cov-sig
test, osutil: disable setting SIG_DFL on linux if built with cov tag
2017-06-05 22:48:50 -07:00
88a3bb74b3 test, osutil: disable setting SIG_DFL on linux if built with cov tag
Was causing etcd to terminate before finishing writing its
coverage profile.
2017-06-05 21:09:35 -07:00
f5fc6649fe Merge pull request #8033 from gyuho/grafana
Documentation/op-guide: fix typo in grafana.json
2017-06-05 16:47:59 -07:00
aefd3eb4cf Documentation/op-guide: fix typo in grafana.json
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-05 15:56:16 -07:00
9c2bbc51ca Merge pull request #8027 from connor4312/patch-1
doc: add mixer/etcd3 as a Node.js client integration
2017-06-05 05:35:23 -07:00
3cbbb54927 Merge pull request #8026 from heyitsanthony/document-cn
op-guide: document CN certs in security.md
2017-06-04 18:31:09 -07:00
ace1760628 Merge pull request #8028 from heyitsanthony/govet-more
test: speedup and strengthen go vet checking
2017-06-03 22:29:49 -07:00
887db5a3db *: fix go tool vet -all -shadow errors 2017-06-03 21:32:36 -07:00
9b33aa1967 test: speedup and strengthen go vet checking
Was iterating over every file, reloading everything. Instead,
analyze the package directories. On my machine, the time for
vet checking goes from 34s to 3s. Scans more code too.
2017-06-03 21:31:49 -07:00
591443d838 doc: add mixer/etcd3 as a Node.js client integration 2017-06-03 09:54:03 -07:00
68e0e4abc1 op-guide: document CN certs in security.md 2017-06-02 11:32:12 -07:00
cdb722123a Merge pull request #8024 from heyitsanthony/fix-swagger
scripts, Documentation: fix swagger generation
2017-06-02 11:04:23 -07:00
1be245269e scripts, Documentation: fix swagger generation
Changes to the genproto to support splitting out the grpc-gateway broke
swagger generation.
2017-06-02 10:54:05 -07:00
97519cf79f Merge pull request #8023 from heyitsanthony/protodoc-update
Documentation, scripts: update RPC API docs
2017-06-02 10:26:12 -07:00
156612bb25 Documentation, scripts: regen RPC docs
Was missing the new cancel_reason field. Also includes updated protodoc
sha to fix generating documentation for upcoming txn compare range patchset.
2017-06-02 10:15:12 -07:00
4301f49988 rafthttp: configurable stream reader retry timeout
The rafthttp.Transport.DialRetryTimeout field alters the frequency of dial attempts,
plus minor changes after code review.
2017-06-02 08:53:17 -07:00
c578ac4a1a Merge pull request #8017 from heyitsanthony/doc-gateway-flags
op-guide: document configuration flags for gateway
2017-06-01 15:50:46 -07:00
1cbc7cc274 op-guide: document configuration flags for gateway 2017-06-01 15:46:12 -07:00
f80be42a55 Merge pull request #8012 from heyitsanthony/cov-corruption
test: incrementally merge coverage files
2017-06-01 11:51:49 -07:00
82153e8840 Merge pull request #8015 from heyitsanthony/fix-ctlv2getrole
e2e: make CtlV2GetRoleUser non-quorum
2017-06-01 11:19:24 -07:00
e0653043ff e2e: make CtlV2GetRoleUser non-quorum
GetUser doesn't go through quorum, so issuing a user get to any member
of a cluster may fetch stale data from a slow member. Instead, use a
single member cluster for the test.

Fixes #7993
2017-06-01 10:13:47 -07:00
0c923bdf11 Merge pull request #8010 from heyitsanthony/json-txn
e2e: test txn over grpc json
2017-06-01 10:01:41 -07:00
085bea5c5a Merge pull request #8013 from heyitsanthony/fix-tls-dial
clientv3: use Endpoints[0] to initialize grpc creds
2017-06-01 09:45:44 -07:00
166ae10ca3 integration: use unixs:// if client port configured for tls 2017-05-31 15:51:48 -07:00
ea8561c35c clientv3: support unixs:// scheme
For using TLS without giving a TLSConfig to the client.
2017-05-31 15:51:48 -07:00
1b48d6e5df clientv3/integration: test dialing to TLS without a TLS config times out
etcdctl was getting ctx errors from timing out trying to issue RPCs to
a TLS endpoint but without using TLS for transmission. The client should
immediately bail out with a timeout error.
2017-05-31 15:51:03 -07:00
00e581754b test: incrementally merge coverage files
Don't throw away all coverage data if some profiles are corrupted.
2017-05-31 15:46:35 -07:00
8effbda3a7 clientv3: use Endpoints[0] to initialize grpc creds
Dialing out without specifying TLS creds but giving https uses some
default behavior that depends on passing an endpoint with https to
Dial(), so it's not enough to completely rely on the balancer to supply
endpoints.

Fixes #8008

Also ctx-izes grpc.Dial
2017-05-31 15:01:11 -07:00
d8210da505 v3rpc: treat nil txn request op as error
Fixes #7889
2017-05-31 12:39:52 -07:00
1467b456ae dev-guide: add txn json example 2017-05-31 12:08:13 -07:00
85095760ff e2e: test txn over grpc json 2017-05-31 12:08:06 -07:00
f03ed33c87 Merge pull request #7761 from YuleiXiao/xyl_get_transfer_leader_status
return leaderTransferee at raft status
2017-05-31 07:30:49 -07:00
7acd43e8bb Merge pull request #7862 from mitake/benchmark-mvcc-batch
benchmark, pkg: a new option of mvcc --batch for enlarging a single txn
2017-05-30 19:50:44 -07:00
a20e667c5b Merge pull request #7967 from heyitsanthony/purge-snapdb
etcdserver: purge old snap.db files
2017-05-30 16:15:11 -07:00
3748e3cf28 Merge pull request #8006 from heyitsanthony/clientv3-test-nocluster
clientv3: do not launch cluster on go test without explicit -run
2017-05-30 15:33:06 -07:00
119bca6ce7 Merge pull request #8005 from heyitsanthony/more-vendoring
vendor: ghodss/yaml v1.0.0, kr/pty v1.0.0
2017-05-30 14:09:03 -07:00
c250e7be9e clientv3: do not launch cluster on go test without explicit -run
There's a workaround by running -run=Test but this periodically
comes up as an issue, so have `go test` only run Test* to stem
the complaints.

Fixes #8000
2017-05-30 12:23:12 -07:00
0970fe78a0 vendor: ghodss/yaml v1.0.0 2017-05-30 10:33:27 -07:00
5d837e5ab3 vendor: kr/pty v1.0.0 2017-05-30 10:33:25 -07:00
c3879e3776 Merge pull request #8004 from gyuho/doc
Documentation: add 'yaml.NewConfig' change in 3.2
2017-05-30 10:13:57 -07:00
84226a722c Documentation: add 'yaml.NewConfig' change in 3.2
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-30 10:02:48 -07:00
1de75d2035 Merge pull request #7997 from heyitsanthony/version-go-semver
vendor: use v0.2.0 of go-semver
2017-05-30 09:24:54 -07:00
ee45c948ac vendor: use v0.2.0 of go-semver 2017-05-26 16:15:10 -07:00
e42d5174ef Merge pull request #7994 from heyitsanthony/update-perf-doc-3.2
op-guide: update performance.md
2017-05-26 15:00:38 -07:00
e66a1439db op-guide: update performance.md
It's been a year, time to refresh with 3.2.0 data.
2017-05-26 14:11:40 -07:00
6846e49edf Merge pull request #7859 from heyitsanthony/cache-consistent-get
mvcc: cache consistent index
2017-05-26 10:52:53 -07:00
3e1eb1a2e7 Merge pull request #7872 from heyitsanthony/break-boltdb-lock-readtx
backend: don't hold boltdb read txn lock on cursor scanning
2017-05-26 10:25:33 -07:00
ac4855e911 mvcc: benchmark ConsistentIndex 2017-05-26 09:49:40 -07:00
73dee0bec4 mvcc: cache consistentIndex
This is called on every entry apply, and boltdb requests aren't free.
2017-05-26 09:49:40 -07:00
0506f49f9e backend: don't hold boltdb read txn lock on cursor scanning
Large fetches hold the lock when they do not need to do so.
2017-05-26 09:28:08 -07:00
343a018361 Merge pull request #7900 from heyitsanthony/chunk-restore
mvcc: chunk reads for restoring
2017-05-26 09:21:59 -07:00
57de98f132 Merge pull request #7991 from heyitsanthony/faq-space-exceeded
Documentation: add FAQ entry for "database space exceeded" errors
2017-05-26 09:10:34 -07:00
99366c6b42 benchmark: a new option of mvcc --txn-ops for enlarging a single txn
This commit adds a new option, --txn-ops, to `benchmark mvcc put`. The number
specified with this option is used as the number of keys written in a single
transaction. It is useful for checking the effect of batching.
2017-05-26 11:10:24 +09:00
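A client-level sketch of the batching this option measures (the benchmark itself drives the mvcc store directly); key names are placeholders:

```go
package benchexample

import (
	"context"
	"fmt"

	"github.com/coreos/etcd/clientv3"
)

// putBatch writes txnOps keys in a single transaction instead of issuing one
// Put RPC per key, which is what enlarging a txn amortizes.
func putBatch(ctx context.Context, cli *clientv3.Client, txnOps int) error {
	ops := make([]clientv3.Op, 0, txnOps)
	for i := 0; i < txnOps; i++ {
		ops = append(ops, clientv3.OpPut(fmt.Sprintf("key-%d", i), "value"))
	}
	_, err := cli.Txn(ctx).Then(ops...).Commit()
	return err
}
```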
384a84ceee Merge pull request #7990 from heyitsanthony/fix-cov-authfromkeyperm
etcdctl, e2e: use 0xe7cd as argument separator in cov-enabled etcdctl
2017-05-25 18:22:36 -07:00
dac2c10ce9 etcdctl, e2e: use 0xe7cd as argument separator in cov-enabled etcdctl
Fixes #7980
2017-05-25 16:11:52 -07:00
9b6c8d216f Documentation: add FAQ entry for "database space exceeded" errors
Also moves miscategorized cluster id mismatch entry from "performance"
to "operation".
2017-05-25 16:08:58 -07:00
2f84f3d8d8 Merge pull request #7968 from fanminshi/make_maxRequestBytes_configurable
etcd: make max request bytes configurable
2017-05-25 15:54:24 -07:00
212a1efd47 Merge pull request #7965 from heyitsanthony/shared-grpc-conn
embed: share grpc connection for grpc json services
2017-05-25 14:35:33 -07:00
68a72c6b6e v3rpc: change grpc max recv size as needed. 2017-05-25 11:01:51 -07:00
9e7740011b etcdserver: add --max-request-bytes flag 2017-05-25 11:01:38 -07:00
b003734be6 Merge pull request #7976 from fanminshi/make_maxOpsPerTxn_configurable
etcdserver: add --max-txn-ops flag
2017-05-25 10:34:17 -07:00
e9f464debc integration: creation of cluster now takes maxTxnOps 2017-05-24 14:48:44 -07:00
ae7ddfb483 etcdserver: add --max-txn-ops flag
--max-txn-ops allows users to define the maximum number of transaction operations
for each txn request. It defaults to 128.

Fixes #7826
2017-05-24 10:32:32 -07:00
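A sketch of the limit check the flag implies; the `op` type is a stand-in for a txn request operation, not the etcdserver request structs:

```go
package main

import "fmt"

type op struct{} // stand-in for a single txn request operation

// checkTxnOps rejects a txn whose success or failure branch carries more
// operations than the configured maximum.
func checkTxnOps(maxTxnOps int, success, failure []op) error {
	if len(success) > maxTxnOps || len(failure) > maxTxnOps {
		return fmt.Errorf("too many operations in txn request (limit %d)", maxTxnOps)
	}
	return nil
}

func main() {
	fmt.Println(checkTxnOps(128, make([]op, 129), nil)) // too many operations in txn request (limit 128)
	fmt.Println(checkTxnOps(128, make([]op, 10), nil))  // <nil>
}
```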
ab16fa1f07 etcdserver: purge old snap.db files
Lots of garbage db files in #7957. Should purge.
2017-05-22 15:44:21 -07:00
db7ab961bf embed: share grpc connection for grpc json services 2017-05-22 12:59:13 -07:00
44a49ff45a raft: return leaderTransferee at raft status 2017-05-11 12:45:56 +08:00
1aca63e9e0 mvcc: time restore in restore benchmark
This never worked.
2017-05-09 20:14:58 -07:00
163fd2d76b mvcc: chunk reads for restoring
Loading all keys at once would cause etcd to use twice as much
memory as it needs to serve the keys, causing RSS to spike on
boot. Instead, load the keys into the mvcc store by chunk. Uses pipelining
for some concurrency.

Fixes #7822
2017-05-09 20:14:58 -07:00
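A standalone sketch of chunked, pipelined loading; the chunk size, `kv` type, and `fetchChunk` helper are illustrative stand-ins for the backend range reads:

```go
package main

import "fmt"

type kv struct{ key, val string }

// fetchChunk is a stand-in for a bounded backend range read starting at `from`.
func fetchChunk(data []kv, from, limit int) []kv {
	if from >= len(data) {
		return nil
	}
	end := from + limit
	if end > len(data) {
		end = len(data)
	}
	return data[from:end]
}

// restoreChunked feeds the store by chunk: a producer reads bounded batches
// from the backend while the consumer indexes the previous batch, so peak
// memory stays near one chunk instead of the whole keyspace.
func restoreChunked(data []kv, limit int) int {
	chunks := make(chan []kv, 1) // one-chunk buffer gives simple pipelining
	go func() {
		for from := 0; ; from += limit {
			c := fetchChunk(data, from, limit)
			if len(c) == 0 {
				break
			}
			chunks <- c
		}
		close(chunks)
	}()
	restored := 0
	for c := range chunks {
		restored += len(c) // stand-in for rebuilding the in-memory index
	}
	return restored
}

func main() {
	data := make([]kv, 25)
	for i := range data {
		data[i] = kv{key: fmt.Sprintf("k%02d", i), val: "v"}
	}
	fmt.Println("restored", restoreChunked(data, 10), "keys")
}
```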
406 changed files with 8393 additions and 17975 deletions

.gitignore

@ -1,4 +1,3 @@
/agent-*
/coverage
/gopath
/gopath.proto


@ -1,16 +0,0 @@
#!/usr/bin/env bash
TEST_SUFFIX=$(date +%s | base64 | head -c 15)
TEST_OPTS="RELEASE_TEST=y INTEGRATION=y PASSES='build unit release integration_e2e functional' MANUAL_VER=v3.2.9"
if [ "$TEST_ARCH" == "386" ]; then
TEST_OPTS="GOARCH=386 PASSES='build unit integration_e2e'"
fi
docker run \
--rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd \
gcr.io/etcd-development/etcd-test:go1.8.5 \
/bin/bash -c "${TEST_OPTS} ./test 2>&1 | tee test-${TEST_SUFFIX}.log"
! egrep "(--- FAIL:|panic: test timed out|appears to have leaked|Too many goroutines)" -B50 -A10 test-${TEST_SUFFIX}.log


@ -1,13 +1,11 @@
dist: trusty
language: go
go_import_path: github.com/coreos/etcd
sudo: required
services: docker
sudo: false
go:
- 1.8.5
- tip
- 1.8.3
- tip
notifications:
on_success: never
@ -15,25 +13,19 @@ notifications:
env:
matrix:
- TARGET=amd64
- TARGET=amd64-go-tip
- TARGET=darwin-amd64
- TARGET=windows-amd64
- TARGET=arm64
- TARGET=arm
- TARGET=386
- TARGET=ppc64le
- TARGET=amd64
- TARGET=darwin-amd64
- TARGET=windows-amd64
- TARGET=arm64
- TARGET=arm
- TARGET=386
- TARGET=ppc64le
matrix:
fast_finish: true
allow_failures:
- go: tip
env: TARGET=amd64-go-tip
- go: tip
exclude:
- go: 1.8.5
env: TARGET=amd64-go-tip
- go: tip
env: TARGET=amd64
- go: tip
env: TARGET=darwin-amd64
- go: tip
@ -47,42 +39,45 @@ matrix:
- go: tip
env: TARGET=ppc64le
before_install:
- docker pull gcr.io/etcd-development/etcd-test:go1.8.5
addons:
apt:
sources:
- debian-sid
packages:
- libpcap-dev
- libaspell-dev
- libhunspell-dev
- shellcheck
before_install:
- go get -v -u github.com/chzchzchz/goword
- go get -v -u github.com/coreos/license-bill-of-materials
- go get -v -u honnef.co/go/tools/cmd/gosimple
- go get -v -u honnef.co/go/tools/cmd/unused
- go get -v -u honnef.co/go/tools/cmd/staticcheck
- ./scripts/install-marker.sh amd64
# disable godep restore override
install:
- pushd cmd/etcd && go get -t -v ./... && popd
- pushd cmd/etcd && go get -t -v ./... && popd
script:
- >
case "${TARGET}" in
amd64)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go1.8.5 \
/bin/bash -c "GOARCH=amd64 ./test"
;;
amd64-go-tip)
GOARCH=amd64 ./test
;;
darwin-amd64)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go1.8.5 \
/bin/bash -c "GO_BUILD_FLAGS='-a -v' GOOS=darwin GOARCH=amd64 ./build"
GO_BUILD_FLAGS="-a -v" GOPATH="" GOOS=darwin GOARCH=amd64 ./build
;;
windows-amd64)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go1.8.5 \
/bin/bash -c "GO_BUILD_FLAGS='-a -v' GOOS=windows GOARCH=amd64 ./build"
GO_BUILD_FLAGS="-a -v" GOPATH="" GOOS=windows GOARCH=amd64 ./build
;;
386)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go1.8.5 \
/bin/bash -c "GOARCH=386 PASSES='build unit' ./test"
GOARCH=386 PASSES="build unit" ./test
;;
*)
# test building out of gopath
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go1.8.5 \
/bin/bash -c "GO_BUILD_FLAGS='-a -v' GOARCH='${TARGET}' ./build"
GO_BUILD_FLAGS="-a -v" GOPATH="" GOARCH="${TARGET}" ./build
;;
esac

CODE_OF_CONDUCT.md (new file)

@ -0,0 +1,63 @@
## CoreOS Community Code of Conduct
### Contributor Code of Conduct
As contributors and maintainers of this project, and in the interest of
fostering an open and welcoming community, we pledge to respect all people who
contribute through reporting issues, posting feature requests, updating
documentation, submitting pull requests or patches, and other activities.
We are committed to making participation in this project a harassment-free
experience for everyone, regardless of level of experience, gender, gender
identity and expression, sexual orientation, disability, personal appearance,
body size, race, ethnicity, age, religion, or nationality.
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery
* Personal attacks
* Trolling or insulting/derogatory comments
* Public or private harassment
* Publishing others' private information, such as physical or electronic addresses, without explicit permission
* Other unethical or unprofessional conduct.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct. By adopting this Code of Conduct,
project maintainers commit themselves to fairly and consistently applying these
principles to every aspect of managing this project. Project maintainers who do
not follow or enforce the Code of Conduct may be permanently removed from the
project team.
This code of conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community.
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting a project maintainer, Brandon Philips
<brandon.philips@coreos.com>, and/or Meghan Schofield
<meghan.schofield@coreos.com>.
This Code of Conduct is adapted from the Contributor Covenant
(http://contributor-covenant.org), version 1.2.0, available at
http://contributor-covenant.org/version/1/2/0/
### CoreOS Events Code of Conduct
CoreOS events are working conferences intended for professional networking and
collaboration in the CoreOS community. Attendees are expected to behave
according to professional standards and in accordance with their employers
policies on appropriate workplace behavior.
While at CoreOS events or related social networking opportunities, attendees
should not engage in discriminatory or offensive speech or actions including
but not limited to gender, sexuality, race, age, disability, or religion.
Speakers should be especially aware of these concerns.
CoreOS does not condone any statements by speakers contrary to these standards.
CoreOS reserves the right to deny entrance and/or eject from an event (without
refund) any individual found to be engaging in discriminatory or offensive
speech or actions.
Please bring any concerns to the immediate attention of designated on-site
staff, Brandon Philips <brandon.philips@coreos.com>, and/or Meghan Schofield
<meghan.schofield@coreos.com>.


@ -1,57 +0,0 @@
FROM ubuntu:16.10
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
RUN apt-get -y update \
&& apt-get -y install \
build-essential \
gcc \
apt-utils \
pkg-config \
software-properties-common \
apt-transport-https \
libssl-dev \
sudo \
bash \
curl \
wget \
tar \
git \
netcat \
libaspell-dev \
libhunspell-dev \
hunspell-en-us \
aspell-en \
shellcheck \
&& apt-get -y update \
&& apt-get -y upgrade \
&& apt-get -y autoremove \
&& apt-get -y autoclean
ENV GOROOT /usr/local/go
ENV GOPATH /go
ENV PATH ${GOPATH}/bin:${GOROOT}/bin:${PATH}
ENV GO_VERSION REPLACE_ME_GO_VERSION
ENV GO_DOWNLOAD_URL https://storage.googleapis.com/golang
RUN rm -rf ${GOROOT} \
&& curl -s ${GO_DOWNLOAD_URL}/go${GO_VERSION}.linux-amd64.tar.gz | tar -v -C /usr/local/ -xz \
&& mkdir -p ${GOPATH}/src ${GOPATH}/bin \
&& go version
RUN mkdir -p ${GOPATH}/src/github.com/coreos/etcd
WORKDIR ${GOPATH}/src/github.com/coreos/etcd
ADD ./scripts/install-marker.sh /tmp/install-marker.sh
RUN go get -v -u -tags spell github.com/chzchzchz/goword \
&& go get -v -u github.com/coreos/license-bill-of-materials \
&& go get -v -u honnef.co/go/tools/cmd/gosimple \
&& go get -v -u honnef.co/go/tools/cmd/unused \
&& go get -v -u honnef.co/go/tools/cmd/staticcheck \
&& go get -v -u github.com/wadey/gocovmerge \
&& go get -v -u github.com/gordonklaus/ineffassign \
&& /tmp/install-marker.sh amd64 \
&& rm -f /tmp/install-marker.sh \
&& curl -s https://codecov.io/bash >/codecov \
&& chmod 700 /codecov


@ -24,11 +24,6 @@ curl -L http://localhost:2379/v3alpha/kv/put \
curl -L http://localhost:2379/v3alpha/kv/range \
-X POST -d '{"key": "Zm9v"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}],"count":"1"}
# get all keys prefixed with "foo"
curl -L http://localhost:2379/v3alpha/kv/range \
-X POST -d '{"key": "Zm9v", "range_end": "Zm9w"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}],"count":"1"}
```
Use `curl` to watch a key:

(File diff suppressed because it is too large)


@ -4,4 +4,8 @@ For the most part, the etcd project is stable, but we are still moving fast! We
## The current experimental API/features are:
(none currently)
- [gateway][gateway]: beta, to be stable in 3.2 release
- [gRPC proxy][grpc-proxy]: alpha, to be stable in 3.2 release
[gateway]: ../op-guide/gateway.md
[grpc-proxy]: ../op-guide/grpc_proxy.md


@ -2,7 +2,7 @@
## System requirements
The etcd performance benchmarks run etcd on 8 vCPU, 16GB RAM, 50GB SSD GCE instances, but any relatively modern machine with low latency storage and a few gigabytes of memory should suffice for most use cases. Applications with large v2 data stores will require more memory than a large v3 data store since data is kept in anonymous memory instead of memory mapped from a file. For running etcd on a cloud provider, we suggest at least a medium instance on AWS or a standard-1 instance on GCE.
The etcd performance benchmarks run etcd on 8 vCPU, 16GB RAM, 50GB SSD GCE instances, but any relatively modern machine with low latency storage and a few gigabytes of memory should suffice for most use cases. Applications with large v2 data stores will require more memory than a large v3 data store since data is kept in anonymous memory instead of memory mapped from a file. For running etcd on a cloud provider, see the [Example hardware configuration][example-hardware-configurations] documentation.
## Download the pre-built binary
@ -18,7 +18,6 @@ To build `etcd` from the `master` branch without a `GOPATH` using the official `
$ git clone https://github.com/coreos/etcd.git
$ cd etcd
$ ./build
$ ./bin/etcd
```
To build a vendored `etcd` from the `master` branch via `go get`:
@ -28,7 +27,6 @@ To build a vendored `etcd` from the `master` branch via `go get`:
$ echo $GOPATH
/Users/example/go
$ go get github.com/coreos/etcd/cmd/etcd
$ $GOPATH/bin/etcd
```
To build `etcd` from the `master` branch without vendoring (may not build due to upstream conflicts):
@ -38,20 +36,28 @@ To build `etcd` from the `master` branch without vendoring (may not build due to
$ echo $GOPATH
/Users/example/go
$ go get github.com/coreos/etcd
$ $GOPATH/bin/etcd
```
## Test the installation
Check the etcd binary is built correctly by starting etcd and setting a key.
Start etcd:
### Starting etcd
If etcd is built without using GOPATH, run the following:
```
$ ./bin/etcd
```
If etcd is built using GOPATH, run the following:
Set a key:
```
$ $GOPATH/bin/etcd
```
### Setting a key
Run the following:
```
$ ETCDCTL_API=3 ./bin/etcdctl put foo bar
@ -64,4 +70,4 @@ If OK is printed, then etcd is working!
[go]: https://golang.org/doc/install
[build-script]: ../build
[cmd-directory]: ../cmd
[example-hardware-configurations]: op-guide/hardware.md#example-hardware-configurations


@ -47,19 +47,12 @@ Administrators who need to create reliable and scalable key-value stores for the
- [Amazon Web Services][aws_platform]
- [FreeBSD][freebsd_platform]
### Security
### Upgrading and compatibility
- [TLS][security]
- [Role-based access control][authentication]
### Maintenance and troubleshooting
- [Frequently asked questions][common questions]
- [Monitoring][monitoring]
- [Maintenance][maintenance]
- [Failure modes][failures]
- [Disaster recovery][recovery]
- [Upgrading][upgrading]
- [Migrate applications from using API v2 to API v3][v2_migration]
- [Upgrading a v2.3 cluster to v3.0][v3_upgrade]
- [Upgrading a v3.0 cluster to v3.1][v31_upgrade]
- [Upgrading a v3.1 cluster to v3.2][v32_upgrade]
## Learning
@ -113,6 +106,8 @@ Answers to [common questions] about etcd.
[freebsd_platform]: platforms/freebsd.md
[aws_platform]: platforms/aws.md
[experimental]: dev-guide/experimental_apis.md
[v3_upgrade]: upgrades/upgrade_3_0.md
[v31_upgrade]: upgrades/upgrade_3_1.md
[v32_upgrade]: upgrades/upgrade_3_2.md
[authentication]: op-guide/authentication.md
[auth_design]: learning/auth_design.md
[upgrading]: upgrades/upgrading-etcd.md


@ -8,11 +8,15 @@
### Configuration
#### What is the difference between listen-<client,peer>-urls, advertise-client-urls or initial-advertise-peer-urls?
#### What is the difference between advertise-urls and listen-urls?
`listen-client-urls` and `listen-peer-urls` specify the local addresses etcd server binds to for accepting incoming connections. To listen on a port for all interfaces, specify `0.0.0.0` as the listen IP address.
`listen-urls` specifies the local addresses etcd server binds to for accepting incoming connections. To listen on a port for all interfaces, specify `0.0.0.0` as the listen IP address.
`advertise-client-urls` and `initial-advertise-peer-urls` specify the addresses etcd clients or other etcd members should use to contact the etcd server. The advertise addresses must be reachable from the remote machines. Do not advertise addresses like `localhost` or `0.0.0.0` for a production setup since these addresses are unreachable from remote machines.
`advertise-urls` specifies the addresses etcd clients or other etcd members should use to contact the etcd server. The advertise addresses must be reachable from the remote machines. Do not advertise addresses like `localhost` or `0.0.0.0` for a production setup since these addresses are unreachable from remote machines.
#### Why doesn't changing `--listen-peer-urls` or `--initial-advertise-peer-urls` update the advertised peer URLs in `etcdctl member list`?
A member's advertised peer URLs come from `--initial-advertise-peer-urls` on initial cluster boot. Changing the listen peer URLs or the initial advertise peers after booting the member won't affect the exported advertise peer URLs since changes must go through quorum to avoid membership configuration split brain. Use `etcdctl member update` to update a member's peer URLs.
### Deployment
@ -107,7 +111,7 @@ Try the [benchmark] tool. Current [benchmark results][benchmark-result] are avai
#### What does the etcd warning "apply entries took too long" mean?
After a majority of etcd members agree to commit a request, each etcd server applies the request to its data store and persists the result to disk. Even with a slow mechanical disk or a virtualized network disk, such as Amazons EBS or Googles PD, applying a request should normally take fewer than 50 milliseconds. If the average apply duration exceeds 100 milliseconds, etcd will warn that entries are taking too long to apply.
Usually this issue is caused by a slow disk. The disk could be experiencing contention among etcd and other applications, or the disk is too simply slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [backend_commit_duration_seconds][backend_commit_metrics] (p99 duration should be less than 25ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using faster disk will typically solve the problem.
The second most common cause is CPU starvation. If monitoring of the machines CPU usage shows heavy utilization, there may not be enough compute capacity for etcd. Moving etcd to dedicated machine, increasing process resource isolation cgroups, or renicing the etcd server process into a higher priority can usually solve the problem.


@ -40,7 +40,7 @@
**Python libraries**
- [kragniz/python-etcd3](https://github.com/kragniz/python-etcd3) - Work in progress client for v3
- [kragniz/python-etcd3](https://github.com/kragniz/python-etcd3) - Client for v3
- [jplana/python-etcd](https://github.com/jplana/python-etcd) - Supports v2
- [russellhaering/txetcd](https://github.com/russellhaering/txetcd) - a Twisted Python library
- [cholcombe973/autodock](https://github.com/cholcombe973/autodock) - A docker deployment automation tool
@ -50,6 +50,7 @@
**Node libraries**
- [mixer/etcd3](https://github.com/mixer/etcd3) - Supports v3
- [stianeikeland/node-etcd](https://github.com/stianeikeland/node-etcd) - Supports v2 (w Coffeescript)
- [lavagetto/nodejs-etcd](https://github.com/lavagetto/nodejs-etcd) - Supports v2
- [deedubs/node-etcd-config](https://github.com/deedubs/node-etcd-config) - Supports v2


@ -449,7 +449,7 @@ message LeaseRevokeRequest {
### Keep alives
Leases are refreshed using a bi-directional stream created with the `LeaseKeepAlive` API call. When the client wishes to refresh a lease, it sends a `LeaseKeepAliveRequest` over the stream:
Leases are refreshed using a bi-directional stream created with the `LeaseKeepAlive` API call. When the client wishes to refresh a lease, it sends a `LeaseGrantRequest` over the stream:
```protobuf
message LeaseKeepAliveRequest {


@ -1,17 +1,17 @@
# etcd versus other key-value stores
# Why etcd
The name "etcd" originated from two ideas, the unix "/etc" folder and "d"istibuted systems. The "/etc" folder is a place to store configuration data for a single system whereas etcd stores configuration information for large scale distributed systems. Hence, a "d"istributed "/etc" is "etcd".
etcd is designed as a general substrate for large scale distributed systems. These are systems that will never tolerate split-brain operation and are willing to sacrifice availability to achieve this end. etcd stores metadata in a consistent and fault-tolerant way. An etcd cluster is meant to provide key-value storage with best of class stability, reliability, scalability and performance.
Distributed systems use etcd as a consistent key-value store for configuration management, service discovery, and coordinating distributed work. Many [organizations][production-users] use etcd to implement production systems such as container schedulers, service discovery services, and distributed data storage. Common distributed patterns using etcd include [leader election][etcd-etcdctl-elect], [distributed locks][etcd-etcdctl-lock], and monitoring machine liveness.
etcd stores metadata in a consistent and fault-tolerant way. Distributed systems use etcd as a consistent key-value store for configuration management, service discovery, and coordinating distributed work. Common distributed patterns using etcd include [leader election][etcd-etcdctl-elect], [distributed locks][etcd-etcdctl-lock], and monitoring machine liveness.
## Use cases
- Container Linux by CoreOS: Applications running on [Container Linux][container-linux] get automatic, zero-downtime Linux kernel updates. Container Linux uses [locksmith] to coordinate updates. Locksmith implements a distributed semaphore over etcd to ensure only a subset of a cluster is rebooting at any given time.
- Container Linux by CoreOS: Application running on [Container Linux][container-linux] gets automatic, zero-downtime Linux kernel updates. Container Linux uses [locksmith] to coordinate updates. locksmith implements a distributed semaphore over etcd to ensure only a subset of a cluster is rebooting at any given time.
- [Kubernetes][kubernetes] stores configuration data into etcd for service discovery and cluster management; etcd's consistency is crucial for correctly scheduling and operating services. The Kubernetes API server persists cluster state into etcd. It uses etcd's watch API to monitor the cluster and roll out critical configuration changes.
## Comparison chart
## etcd versus other key-value stores
When deciding whether to use etcd as a key-value store, its worth keeping in mind etcds main goal. Namely, etcd is designed as a general substrate for large scale distributed systems. These are systems that will never tolerate split-brain operation and are willing to sacrifice availability to achieve this end. An etcd cluster is meant to provide consistent key-value storage with best of class stability, reliability, scalability and performance. The upshot of this focus is many [organizations][production-users] already use etcd to implement production systems such as container schedulers, service discovery services, distributed data storage, and more.
Perhaps etcd already seems like a good fit, but as with all technological decisions, proceed with caution. Please note this documentation is written by the etcd team. Although the ideal is a disinterested comparison of technology and features, the authors expertise and biases obviously favor etcd. Use only as directed.
@ -84,7 +84,7 @@ For distributed coordination, choosing etcd can help prevent operational headach
[tidb]: https://github.com/pingcap/tidb
[etcd-v3lock]: https://godoc.org/github.com/coreos/etcd/etcdserver/api/v3lock/v3lockpb
[etcd-v3election]: https://godoc.org/github.com/coreos/etcd/etcdserver/api/v3election/v3electionpb
[etcd-etcdctl-lock]: ../../etcdctl/README.md#lock-lockname-command-arg1-arg2-
[etcd-etcdctl-lock]: ../../etcdctl/README.md#lock-lockname
[etcd-etcdctl-elect]: ../../etcdctl/README.md#elect-options-election-name-proposal
[etcd-mvcc]: data_model.md
[etcd-recipe]: https://godoc.org/github.com/coreos/etcd/contrib/recipes
@ -113,3 +113,4 @@ For distributed coordination, choosing etcd can help prevent operational headach
[container-linux]: https://coreos.com/why
[locksmith]: https://github.com/coreos/locksmith
[kubernetes]: http://kubernetes.io/docs/whatisk8s


@ -281,7 +281,7 @@ ETCD_DISCOVERY=https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573d
--discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
**Each member must have a different name flag specified or else discovery will fail due to duplicated names. `Hostname` or `machine-id` can be a good choice. **
**Each member must have a different name flag specified or else discovery will fail due to duplicated names. `Hostname` or `machine-id` can be a good choice.**
Now we start etcd with those relevant flags for each member:


@ -76,7 +76,7 @@ LABELS {
}
ANNOTATIONS {
summary = "slow gRPC requests",
description = "on etcd instance {{ $labels.instance }} gRPC requests to {{ $label.grpc_method }} are slow",
description = "on etcd instance {{ $labels.instance }} gRPC requests to {{ $labels.grpc_method }} are slow",
}
# HTTP requests alerts
@ -117,7 +117,7 @@ LABELS {
}
ANNOTATIONS {
summary = "slow HTTP requests",
description = "on etcd instance {{ $labels.instance }} HTTP requests to {{ $label.method }} are slow",
description = "on etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow",
}
# file descriptor alerts
@ -161,7 +161,7 @@ LABELS {
}
ANNOTATIONS {
summary = "etcd member communication is slow",
description = "etcd instance {{ $labels.instance }} member communication with {{ $label.To }} is slow",
description = "etcd instance {{ $labels.instance }} member communication with {{ $labels.To }} is slow",
}
# etcd proposal alerts


@ -1,49 +1,6 @@
# Monitoring etcd
Each etcd server provides local monitoring information on its client port through http endpoints. The monitoring data is useful for both system health checking and cluster debugging.
## Debug endpoint
If `--debug` is set, the etcd server exports debugging information on its client port under the `/debug` path. Take care when setting `--debug`, since there will be degraded performance and verbose logging.
The `/debug/pprof` endpoint is the standard go runtime profiling endpoint. This can be used to profile CPU, heap, mutex, and goroutine utilization. For example, here `go tool pprof` gets the top 10 functions where etcd spends its time:
```sh
$ go tool pprof http://localhost:2379/debug/pprof/profile
Fetching profile from http://localhost:2379/debug/pprof/profile
Please wait... (30s)
Saved profile in /home/etcd/pprof/pprof.etcd.localhost:2379.samples.cpu.001.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top10
310ms of 480ms total (64.58%)
Showing top 10 nodes out of 157 (cum >= 10ms)
flat flat% sum% cum cum%
130ms 27.08% 27.08% 130ms 27.08% runtime.futex
70ms 14.58% 41.67% 70ms 14.58% syscall.Syscall
20ms 4.17% 45.83% 20ms 4.17% github.com/coreos/etcd/cmd/vendor/golang.org/x/net/http2/hpack.huffmanDecode
20ms 4.17% 50.00% 30ms 6.25% runtime.pcvalue
20ms 4.17% 54.17% 50ms 10.42% runtime.schedule
10ms 2.08% 56.25% 10ms 2.08% github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).AuthInfoFromCtx
10ms 2.08% 58.33% 10ms 2.08% github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).Lead
10ms 2.08% 60.42% 10ms 2.08% github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/wait.(*timeList).Trigger
10ms 2.08% 62.50% 10ms 2.08% github.com/coreos/etcd/cmd/vendor/github.com/prometheus/client_golang/prometheus.(*MetricVec).hashLabelValues
10ms 2.08% 64.58% 10ms 2.08% github.com/coreos/etcd/cmd/vendor/golang.org/x/net/http2.(*Framer).WriteHeaders
```
The `/debug/requests` endpoint gives gRPC traces and performance statistics through a web browser. For example, here is a `Range` request for the key `abc`:
```
When Elapsed (s)
2017/08/18 17:34:51.999317 0.000244 /etcdserverpb.KV/Range
17:34:51.999382 . 65 ... RPC: from 127.0.0.1:47204 deadline:4.999377747s
17:34:51.999395 . 13 ... recv: key:"abc"
17:34:51.999499 . 104 ... OK
17:34:51.999535 . 36 ... sent: header:<cluster_id:14841639068965178418 member_id:10276657743932975437 revision:15 raft_term:17 > kvs:<key:"abc" create_revision:6 mod_revision:14 version:9 value:"asda" > count:1
```
## Metrics endpoint
Each etcd server exports metrics under the `/metrics` path on its client port and optionally on interfaces given by `--listen-metrics-urls`.
Each etcd server exports metrics under the `/metrics` path on its client port.
The metrics can be fetched with `curl`:
@ -118,6 +75,8 @@ Access: proxy
Then import the default [etcd dashboard template][template] and customize. For instance, if Prometheus data source name is `my-etcd`, the `datasource` field values in JSON also need to be `my-etcd`.
See the [demo][demo].
Sample dashboard:
![](./etcd-sample-grafana.png)
@ -126,3 +85,4 @@ Sample dashboard:
[prometheus]: https://prometheus.io/
[grafana]: http://grafana.org/
[template]: ./grafana.json
[demo]: http://dash.etcd.io/dashboard/db/test-etcd


@ -6,7 +6,7 @@ This guide assumes operational knowledge of Amazon Web Services (AWS), specifica
As a critical building block for distributed systems it is crucial to perform adequate capacity planning in order to support the intended cluster workload. As a highly available and strongly consistent data store increasing the number of nodes in an etcd cluster will generally affect performance adversely. This makes sense intuitively, as more nodes means more members for the leader to coordinate state across. The most direct way to increase throughput and decrease latency of an etcd cluster is allocate more disk I/O, network I/O, CPU, and memory to cluster members. In the event it is impossible to temporarily divert incoming requests to the cluster, scaling the EC2 instances which comprise the etcd cluster members one at a time may improve performance. It is, however, best to avoid bottlenecks through capacity planning.
The etcd team has produced a [hardware recommendation guide](../op-guide/hardware.md) which is very useful for “ballparking” how many nodes and what instance type are necessary for a cluster.
The etcd team has produced a [hardware recommendation guide]( ../op-guide/hardware.md) which is very useful for “ballparking” how many nodes and what instance type are necessary for a cluster.
AWS provides a service for creating groups of EC2 instances which are dynamically sized to match load on the instances. Using an Auto Scaling Group ([ASG](http://docs.aws.amazon.com/autoscaling/latest/userguide/AutoScalingGroup.html)) to dynamically scale an etcd cluster is not recommended for several reasons including:


@ -1,19 +0,0 @@
# Upgrading etcd clusters and applications
This section contains documents specific to upgrading etcd clusters and applications.
## Moving from etcd API v2 to API v3
* [Migrate applications from using API v2 to API v3][migrate-apps]
## Upgrading an etcd v3.x cluster
* [Upgrade etcd from 3.0 to 3.1][upgrade-3-1]
* [Upgrade etcd from 3.1 to 3.2][upgrade-3-2]
## Upgrading from etcd v2.3
* [Upgrade a v2.3 cluster to v3.0][upgrade-cluster]
[migrate-apps]: ../op-guide/v2-migration.md
[upgrade-cluster]: upgrade_3_0.md
[upgrade-3-1]: upgrade_3_1.md
[upgrade-3-2]: upgrade_3_2.md

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Snapshot Migration
You can migrate a snapshot of your data from a v0.4.9+ cluster into a new etcd 2.2 cluster using a snapshot migration. After snapshot migration, the etcd indexes of your data will change. Many etcd applications rely on these indexes to behave correctly. This operation should only be done while all etcd applications are stopped.

View File

@ -1,85 +1,165 @@
# Documentation
# etcd2
etcd is a distributed key-value store designed to reliably and quickly preserve and provide access to critical data. It enables reliable distributed coordination through distributed locking, leader elections, and write barriers. An etcd cluster is intended for high availability and permanent data storage and retrieval.
[![Go Report Card](https://goreportcard.com/badge/github.com/coreos/etcd)](https://goreportcard.com/report/github.com/coreos/etcd)
[![Build Status](https://travis-ci.org/coreos/etcd.svg?branch=master)](https://travis-ci.org/coreos/etcd)
[![Build Status](https://semaphoreci.com/api/v1/coreos/etcd/branches/master/shields_badge.svg)](https://semaphoreci.com/coreos/etcd)
[![Docker Repository on Quay.io](https://quay.io/repository/coreos/etcd-git/status "Docker Repository on Quay.io")](https://quay.io/repository/coreos/etcd-git)
This is the etcd v2 documentation set. For more recent versions, please see the [etcd v3 guides][etcd-v3].
**Note**: The `master` branch may be in an *unstable or even broken state* during development. Please use [releases][github-release] instead of the `master` branch in order to get stable binaries.
## Communicating with etcd v2
![etcd Logo](../../logos/etcd-horizontal-color.png)
Reading and writing into the etcd keyspace is done via a simple, RESTful HTTP API, or using language-specific libraries that wrap the HTTP API with higher level primitives.
etcd is a distributed, consistent key-value store for shared configuration and service discovery, with a focus on being:
### Reading and Writing
* *Simple*: curl'able user-facing API (HTTP+JSON)
* *Secure*: optional SSL client cert authentication
* *Fast*: benchmarked 1000s of writes/s per instance
* *Reliable*: properly distributed using Raft
- [Client API Documentation][api]
- [Libraries, Tools, and Language Bindings][libraries]
- [Admin API Documentation][admin-api]
- [Members API][members-api]
etcd is written in Go and uses the [Raft][raft] consensus algorithm to manage a highly-available replicated log.
### Security, Auth, Access control
etcd is used [in production by many companies](./production-users.md), and the development team stands behind it in critical deployment scenarios, where etcd is frequently teamed with applications such as [Kubernetes][k8s], [fleet][fleet], [locksmith][locksmith], [vulcand][vulcand], and many others.
- [Security Model][security]
- [Auth and Security][auth_api]
- [Authentication Guide][authentication]
See [etcdctl][etcdctl] for a simple command line client.
Or feel free to just use `curl`, as in the examples below.
## etcd v2 Cluster Administration
[raft]: https://raft.github.io/
[k8s]: http://kubernetes.io/
[fleet]: https://github.com/coreos/fleet
[locksmith]: https://github.com/coreos/locksmith
[vulcand]: https://github.com/vulcand/vulcand
[etcdctl]: https://github.com/coreos/etcd/tree/master/etcdctl
Configuration values are distributed within the cluster for your applications to read. Values can be changed programmatically and smart applications can reconfigure automatically. You'll never again have to run a configuration management tool on every machine in order to change a single config value.
## Getting Started
### General Info
### Getting etcd
- [etcd Proxies][proxy]
- [Production Users][production-users]
- [Admin Guide][admin_guide]
- [Configuration Flags][configuration]
- [Frequently Asked Questions][faq]
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, AppC (ACI), and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
### Initial Setup
For those wanting to try the very latest version, you can build the latest version of etcd from the `master` branch.
You will first need [*Go*](https://golang.org/) installed on your machine (version 1.5+ is required).
All development occurs on `master`, including new features and bug fixes.
Bug fixes are first targeted at `master` and subsequently ported to release branches, as described in the [branch management][branch-management] guide.
- [Tuning etcd Clusters][tuning]
- [Discovery Service Protocol][discovery_protocol]
- [Running etcd under Docker][docker_guide]
[github-release]: https://github.com/coreos/etcd/releases/
[branch-management]: branch_management.md
### Live Reconfiguration
### Running etcd
- [Runtime Configuration][runtime-configuration]
First start a single-member cluster of etcd:
### Debugging etcd
```sh
./bin/etcd
```
- [Metrics Collection][metrics]
- [Error Code][errorcode]
- [Reporting Bugs][reporting_bugs]
This will bring up etcd listening on port 2379 for client communication and on port 2380 for server-to-server communication.
### Migration
Next, let's set a single key, and then retrieve it:
- [Upgrade etcd to 2.3][upgrade_2_3]
- [Upgrade etcd to 2.2][upgrade_2_2]
- [Upgrade to etcd 2.1][upgrade_2_1]
- [Snapshot Migration (0.4.x to 2.x)][04_to_2_snapshot_migration]
- [Backward Compatibility][backward_compatibility]
```
curl -L http://127.0.0.1:2379/v2/keys/mykey -XPUT -d value="this is awesome"
curl -L http://127.0.0.1:2379/v2/keys/mykey
```
You have successfully started an etcd and written a key to the store.
[etcd-v3]: ../docs.md
[api]: api.md
[libraries]: libraries-and-tools.md
[admin-api]: other_apis.md
[members-api]: members_api.md
[security]: security.md
[auth_api]: auth_api.md
[authentication]: authentication.md
[proxy]: proxy.md
[production-users]: production-users.md
[admin_guide]: admin_guide.md
[configuration]: configuration.md
[faq]: faq.md
[tuning]: tuning.md
[discovery_protocol]: discovery_protocol.md
[docker_guide]: docker_guide.md
[runtime-configuration]: runtime-configuration.md
[metrics]: metrics.md
[errorcode]: errorcode.md
[reporting_bugs]: reporting_bugs.md
[upgrade_2_3]: upgrade_2_3.md
[upgrade_2_2]: upgrade_2_2.md
[upgrade_2_1]: upgrade_2_1.md
[04_to_2_snapshot_migration]: 04_to_2_snapshot_migration.md
[backward_compatibility]: backward_compatibility.md
### etcd TCP ports
The [official etcd ports][iana-ports] are 2379 for client requests, and 2380 for peer communication. To maintain compatibility, some etcd configuration and documentation continues to refer to the legacy ports 4001 and 7001, but all new etcd use and discussion should adopt the IANA-assigned ports. The legacy ports 4001 and 7001 will be fully deprecated, and support for their use removed, in future etcd releases.
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
### Running local etcd cluster
First install [goreman](https://github.com/mattn/goreman), which manages Procfile-based applications.
Our [Procfile script](../../V2Procfile) will set up a local example cluster. You can start it with:
```sh
goreman start
```
This will bring up 3 etcd members `infra1`, `infra2` and `infra3`, plus an etcd proxy `proxy`, all running locally and composing a cluster.
You can write a key to the cluster and retrieve the value back from any member or proxy.
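For example, a quick check might look like the following sketch; the client ports are assumptions taken from the Procfile output and may differ on your machine:
```sh
# write a key through the first member
curl -L http://127.0.0.1:12379/v2/keys/hello -XPUT -d value="world"

# read the same key back through a different member
curl -L http://127.0.0.1:22379/v2/keys/hello
```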
### Next Steps
Now it's time to dig into the full etcd API and other guides.
- Explore the full [API][api].
- Set up a [multi-machine cluster][clustering].
- Learn the [config format, env variables and flags][configuration].
- Find [language bindings and tools][libraries-and-tools].
- Use TLS to [secure an etcd cluster][security].
- [Tune etcd][tuning].
- [Upgrade from 0.4.9+ to 2.2.0][upgrade].
[api]: ./api.md
[clustering]: ./clustering.md
[configuration]: ./configuration.md
[libraries-and-tools]: ./libraries-and-tools.md
[security]: ./security.md
[tuning]: ./tuning.md
[upgrade]: ./04_to_2_snapshot_migration.md
## Contact
- Mailing list: [etcd-dev](https://groups.google.com/forum/?hl=en#!forum/etcd-dev)
- IRC: #[etcd](irc://irc.freenode.org:6667/#etcd) on freenode.org
- Planning/Roadmap: [milestones](https://github.com/coreos/etcd/milestones), [roadmap](../../ROADMAP.md)
- Bugs: [issues](https://github.com/coreos/etcd/issues)
## Contributing
See [CONTRIBUTING](../../CONTRIBUTING.md) for details on submitting patches and the contribution workflow.
## Reporting bugs
See [reporting bugs](reporting_bugs.md) for details about reporting any issue you may encounter.
## Known bugs
[GH518](https://github.com/coreos/etcd/issues/518) is a known bug. Issue is that:
```
curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar
curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d dir=true -d prevExist=true
```
If the previous node is a key and the client tries to overwrite it with `dir=true`, it does not give a warning such as `Not a directory`. Instead, the key is set to an empty value.
## Project Details
### Versioning
#### Service Versioning
etcd uses [semantic versioning](http://semver.org)
New minor versions may add additional features to the API.
You can get the version of etcd by issuing a request to /version:
```sh
curl -L http://127.0.0.1:2379/version
```
#### API Versioning
The `v2` API responses should not change after the 2.0.0 release but new features will be added over time.
#### 32-bit and other unsupported systems
etcd has known issues on 32-bit systems due to a bug in the Go runtime. See #[358][358] for more information.
To avoid inadvertently running a possibly unstable etcd server, `etcd` on unsupported architectures will print
a warning message and immediately exit if the environment variable `ETCD_UNSUPPORTED_ARCH` is not set to
the target architecture.
Currently only the amd64 architecture is officially supported by `etcd`.
[358]: https://github.com/coreos/etcd/issues/358
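For example, to knowingly start etcd on an unsupported architecture anyway (a sketch; the value must name the target architecture):
```sh
# acknowledge the unsupported architecture and start etcd
ETCD_UNSUPPORTED_ARCH=arm ./etcd
```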
### License
etcd is under the Apache 2.0 license. See the [LICENSE](../../LICENSE) file for details.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Administration
## Data Directory
@ -13,7 +8,7 @@ When first started, etcd stores its configuration into a data directory specifie
Configuration is stored in the write ahead log and includes: the local member ID, cluster ID, and initial cluster configuration.
The write ahead log and snapshot files are used during member operation and to recover after a restart.
Having a dedicated disk to store wal files can improve the throughput and stabilize the cluster.
It is highly recommended to dedicate a wal disk and set `--wal-dir` to point to a directory on that device for a production cluster deployment.
If a member's data directory is ever lost or corrupted then the user should [remove][remove-a-member] the etcd member from the cluster using the `etcdctl` tool.
@ -56,7 +51,7 @@ $ curl -L http://127.0.0.1:2379/health
You can also use etcdctl to check the cluster-wide health information. It will contact all the members of the cluster and collect the health information for you.
```
$./etcdctl cluster-health
$./etcdctl cluster-health
member 91bc3c398fb3c146 is healthy: got healthy result from http://127.0.0.1:22379
member fd422379fda50e48 is healthy: got healthy result from http://127.0.0.1:32379

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# etcd API
## Running a Single Machine Cluster
@ -323,7 +318,7 @@ The first terminal should get the notification and return with the same response
However, the watch command can do more than this.
Using the index, we can watch for commands that have happened in the past.
This is useful for ensuring you don't miss events between watch commands.
Typically, we watch again from the `modifiedIndex` + 1 of the node we got.
Let's try to watch for the set command of index 7 again:
@ -343,13 +338,13 @@ curl 'http://127.0.0.1:2379/v2/keys/foo?wait=true&waitIndex=8'
Then even if etcd is on index 9 or 800, the first event to occur to the `/foo`
key between 8 and the current index will be returned.
**Note**: etcd only keeps the responses of the most recent 1000 events across all etcd keys.
It is recommended to send the response to another thread to process immediately
instead of blocking the watch while processing the result.
#### Watch from cleared event index
If we miss all the 1000 events, we need to recover the current state of the
watching key space through a get and then start to watch from the
`X-Etcd-Index` + 1.
@ -371,7 +366,7 @@ To start watch, first we need to fetch the current state of key `/foo`:
curl 'http://127.0.0.1:2379/v2/keys/foo' -vv
```
```
< HTTP/1.1 200 OK
< Content-Type: application/json
< X-Etcd-Cluster-Id: 7e27652122e8b2ae
@ -380,7 +375,7 @@ curl 'http://127.0.0.1:2379/v2/keys/foo' -vv
< X-Raft-Term: 2
< Date: Mon, 05 Jan 2015 18:54:43 GMT
< Transfer-Encoding: chunked
<
{"action":"get","node":{"key":"/foo","value":"bar","modifiedIndex":7,"createdIndex":7}}
```

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# etcd3 API
TODO: API doc

View File

@ -1,18 +1,13 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# v2 Auth and Security
## etcd Resources
There are three types of resources in etcd
1. permission resources: users and roles in the user store
2. key-value resources: key-value pairs in the key-value store
3. settings resources: security settings, auth settings, and dynamic etcd cluster settings (election/heartbeat)
### Permission Resources
#### Users
A user is an identity to be authenticated. Each user can have multiple roles. The user has a capability (such as reading or writing) on the resource if one of the roles has that capability.
@ -20,7 +15,7 @@ A user is an identity to be authenticated. Each user can have multiple roles. Th
A user named `root` is required before authentication can be enabled, and it always has the ROOT role. The ROOT role can be granted to multiple users, but `root` is required for recovery purposes.
#### Roles
Each role has exactly one associated Permission List. A permission list exists for each permission on key-value resources.
The special static ROOT (named `root`) role has full permissions on all key-value resources and the permission to manage user resources and settings resources. Only the ROOT role has the permission to manage user resources and modify settings resources. The ROOT role is built-in and does not need to be created.
@ -35,8 +30,8 @@ A Permission List is a list of allowed patterns for that particular permission (
### Key-Value Resources
A key-value resource is a key-value pair in the store. Given a list of matching patterns, permission for any given key in a request is granted if any of the patterns in the list match.
Only prefixes or exact keys are supported. A prefix permission string ends in `*`.
A permission on `/foo` is for that exact key or directory, not its children or anything recursive. `/foo*` is a prefix that matches `/foo`, all keys thereunder, and keys with that prefix (e.g. `/foobar`; contrast with the prefix `/foo/*`). `*` alone is permission on the full keyspace.
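As an illustration, here is a sketch of granting both kinds of permission with the v2 `etcdctl` role commands (the role name is a placeholder):
```sh
# read-only permission on the exact key /foo
etcdctl role grant myrolename -path '/foo' -read

# read/write permission on /foo and every key with that prefix
etcdctl role grant myrolename -path '/foo*' -readwrite
```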
### Settings Resources
@ -71,7 +66,7 @@ An Error JSON corresponds to:
}
#### Enable and Disable Authentication
**Get auth status**
GET /v2/auth/enable
@ -220,8 +215,8 @@ PUT /v2/auth/users/charlie
Sent Headers:
Authorization: Basic <BasicAuthString>
Put Body:
JSON struct, above, matching the appropriate name
* Starting password and roles when creating.
* Grant/Revoke/Password filled in when updating (to grant roles, revoke roles, or change the password).
Possible Status Codes:
200 OK
@ -350,7 +345,7 @@ PUT /v2/auth/roles/rkt
401 Unauthorized
404 Not Found (update non-existent roles)
409 Conflict (when granting duplicated permission or revoking non-existent permission)
200 Body:
JSON state of the role
**Remove A Role**

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Authentication Guide
## Overview
@ -19,7 +14,7 @@ There is one special user, `root`, and there are two special roles, `root` and `
### User `root`
User `root` must be created before security can be activated. It has the `root` role and allows for the changing of anything inside etcd. The idea behind the `root` user is for recovery purposes -- a password is generated and stored somewhere -- and the root role is granted to the administrator accounts on the system. In the future, for troubleshooting and recovery, we will need to assume some access to the system, and future documentation will assume this root user (though anyone with the role will suffice).
### Role `root`
@ -109,7 +104,7 @@ $ etcdctl role grant myrolename -path '/foo/bar' -write
$ etcdctl role grant myrolename -path '/pub/*' -readwrite
```
Beware that
```
# Give full access to keys under /pub??
@ -138,12 +133,12 @@ $ etcdctl role remove myrolename
## Enabling authentication
The minimal steps to enabling auth are as follows. The administrator can set up users and roles before or after enabling authentication, as a matter of preference.
Make sure the root user is created:
```
$ etcdctl user add root
New password:
```

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Backward Compatibility
The main goal of etcd 2.0 release is to improve cluster safety around bootstrapping and dynamic reconfiguration. To do this, we deprecated the old error-prone APIs and provide a new set of APIs.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
# Benchmarks
etcd benchmarks will be published regularly and tracked for each release below:

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
## Physical machines
GCE n1-highcpu-2 machine type

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
# Benchmarking etcd v2.2.0
## Physical Machines

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
## Physical machines
GCE n1-highcpu-2 machine type

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
## Physical machine
GCE n1-standard-2 machine type

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
## Physical machines
GCE n1-highcpu-2 machine type

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
# Watch Memory Usage Benchmark
*NOTE*: The watch features are under active development, and their memory usage may change as that development progresses. We do not expect it to significantly increase beyond the figures stated below.
@ -10,10 +5,10 @@
A primary goal of etcd is supporting a very large number of watchers doing a massively large amount of watching. etcd aims to support O(10k) clients, O(100K) watch streams (O(10) streams per client) and O(10M) total watchings (O(100) watching per stream). The memory consumed by each individual watching accounts for the largest portion of etcd's overall usage, and is therefore the focus of current and future optimizations.
Three related components of etcd watch consume physical memory: each `grpc.Conn`, each watch stream, and each instance of the watching activity. `grpc.Conn` maintains the actual TCP connection and other gRPC connection state. Each `grpc.Conn` consumes O(10kb) of memory, and might have multiple watch streams attached.
Each watch stream is an independent HTTP2 connection which consumes another O(10kb) of memory.
Multiple watchings might share one watch stream.
Watching is the actual struct that tracks the changes on the key-value store. Each watching should only consume < O(1kb).

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
# Storage Memory Usage Benchmark
<!---todo: link storage to storage design doc-->
@ -65,7 +60,7 @@ GCE n1-standard-2 machine type
In this test, we only benchmark the memory usage of the in-memory index. The goal is to find `c1` and `c2` mentioned above and to understand the hard limit of memory consumption of the storage.
We calculate the memory consumption via the Go runtime.ReadMemStats. We compute the difference in total allocated bytes before and after creating the index. It cannot perfectly reflect the memory usage of the in-memory index itself, but it shows the rough consumption pattern.
| N | versions | key size | memory usage |
|------|----------|----------|--------------|

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Branch Management
## Guide

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Clustering Guide
## Overview

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Configuration Flags
etcd is configurable through command-line flags and environment variables. Options set on the command line take precedence over those from the environment.

View File

@ -1,13 +1,8 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
# etcd release guide
The guide talks about how to release a new version of etcd.
The procedure includes some manual steps for sanity checking but it can probably be further scripted. Please keep this document up-to-date if you want to make changes to the release process.
## Prepare Release

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Discovery Service Protocol
The discovery service protocol helps a new etcd member discover all the other members during the cluster bootstrap phase, using a shared discovery URL.
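For example, a new discovery URL sized for a three-member cluster can be requested from the public discovery service and then passed to each member; this is a sketch, and `<token>` is a placeholder for the value the service returns:
```sh
# request a discovery URL for a 3-member cluster
curl -s 'https://discovery.etcd.io/new?size=3'

# start each member with the returned URL
etcd --name infra0 --discovery https://discovery.etcd.io/<token>
```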

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Running etcd under Docker
The following guide will show you how to run etcd under Docker using the [static bootstrap process](clustering.md#static).

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Error Code
======

View File

@ -62,7 +62,7 @@ ALERT HTTPRequestsSlow
}
ANNOTATIONS {
summary = "slow HTTP requests",
description = "on etcd instance {{ $labels.instance }} HTTP requests to {{ $label.method }} are slow",
description = "on etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow",
}
### File descriptor alerts ###

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# FAQ
## 1) Why can an etcd client read an old version of data when a majority of the etcd cluster members are down?

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Glossary
This document defines the various terms used in etcd documentation, command line and source code.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# FAQ
## Initial Bootstrapping UX

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Versioning
Goal: We want to be able to upgrade an individual peer in an etcd cluster to a newer version of etcd.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Libraries and Tools
**Tools**

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Members API
* [List members](#list-members)

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Metrics
etcd uses [Prometheus][prometheus] for metrics reporting. The metrics can be used for real-time monitoring and debugging. etcd does not persist its metrics; if a member restarts, the metrics will be reset.
@ -19,9 +14,9 @@ The metrics under the `etcd` prefix are for monitoring and alerting. They are st
### http requests
These metrics describe the serving of requests (non-watch events) served by etcd members in non-proxy mode: total
incoming requests, request failures and processing latency (including raft rounds for storage). They are useful for tracking
user-generated traffic hitting the etcd cluster.
All these metrics are prefixed with `etcd_http_`
@ -33,20 +28,20 @@ All these metrics are prefixed with `etcd_http_`
Example Prometheus queries that may be useful from these metrics (across all etcd members):
* `sum(rate(etcd_http_failed_total{job="etcd"}[1m]) by (method) / sum(rate(etcd_http_events_received_total{job="etcd"})[1m]) by (method)`
Shows the fraction of events that failed by HTTP method across all members, across a time window of `1m`.
* `sum(rate(etcd_http_received_total{job="etcd",method="GET})[1m]) by (method)`
`sum(rate(etcd_http_received_total{job="etcd",method~="GET})[1m]) by (method)`
Shows the rate of successful readonly/write queries across all servers, across a time window of `1m`.
* `histogram_quantile(0.9, sum(rate(etcd_http_successful_duration_seconds{job="etcd",method="GET"}[5m]) ) by (le))`
`histogram_quantile(0.9, sum(rate(etcd_http_successful_duration_seconds{job="etcd",method!="GET"}[5m]) ) by (le))`
Show the 0.90-tile latency (in seconds) of read/write (respectively) event handling across all members, with a window of `5m`.
### proxy
@ -61,21 +56,21 @@ All these metrics are prefixed with `etcd_proxy_`
| requests_total | Total number of requests by this proxy instance. | Counter(method) |
| handled_total | Total number of fully handled requests, with responses from etcd members. | Counter(method) |
| dropped_total | Total number of dropped requests due to forwarding errors to etcd members.  | Counter(method,error) |
| handling_duration_seconds | Bucketed handling times by HTTP method, including round trip to member instances. | Histogram(method) |
Example Prometheus queries that may be useful from these metrics (across all etcd servers):
* `sum(rate(etcd_proxy_handled_total{job="etcd"}[1m])) by (method)`
Rate of requests (by HTTP method) handled by all proxies, across a window of `1m`.
* `histogram_quantile(0.9, sum(rate(handling_duration_seconds{job="etcd",method="GET"}[5m])) by (le))`
`histogram_quantile(0.9, sum(rate(handling_duration_seconds{job="etcd",method!="GET"}[5m])) by (le))`
Show the 0.90-tile latency (in seconds) of handling of user requests across all proxy machines, with a window of `5m`.
* `sum(rate(etcd_proxy_dropped_total{job="etcd"}[1m])) by (proxying_error)`
Number of failed requests on the proxy. This should be 0; spikes here indicate connectivity issues to the etcd cluster.
## etcd_debugging namespace metrics

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Miscellaneous APIs
* [Getting the etcd version](#getting-the-etcd-version)

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
# FreeBSD
Starting with version 0.1.2 both etcd and etcdctl have been ported to FreeBSD and can

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Production Users
This document tracks people and use cases for etcd in production. By creating a list of production use cases we hope to build a community of advisors that we can reach out to with experience using various etcd applications, operation environments, and cluster sizes. The etcd development team may reach out periodically to check-in on your experience and update this list.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Proxy
etcd can run as a transparent proxy. Doing so allows for easy discovery of etcd within your infrastructure, since it can run on each machine as a local service. In this mode, etcd acts as a reverse proxy and forwards client requests to an active etcd cluster. The etcd proxy does not participate in the consensus replication of the etcd cluster, thus it neither increases the resilience nor decreases the write performance of the etcd cluster.
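A minimal sketch of starting such a proxy in front of an existing cluster (the member names and URLs below are placeholders):
```sh
etcd --proxy on \
  --listen-client-urls http://127.0.0.1:2379 \
  --initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380
```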

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Reporting Bugs
If you find bugs or documentation mistakes in the etcd project, please let us know by [opening an issue][etcd-issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../../docs.md#documentation
# Overview
The etcd v3 API is designed to give users a more efficient and cleaner abstraction compared to etcd v2. There are a number of semantic and protocol changes in this new API. For an overview [see Xiang Li's video](https://youtu.be/J5AioGtEPeQ?t=211).

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Runtime Reconfiguration
etcd comes with support for incremental runtime reconfiguration, which allows users to update the membership of the cluster at run time.
@ -66,9 +61,9 @@ A wrongly updated client URL will not affect the health of the etcd cluster.
#### Update advertise peer URLs
If you would like to update the advertise peer URLs of a member, you have to first update
them explicitly via the member command and then restart the member. The additional action is required
since updating peer URLs changes the cluster-wide configuration and can affect the health of the etcd cluster.
To update the peer URLs, we first need to find the target member's ID. You can list all members with `etcdctl`:
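A sketch of the overall flow (the member ID and peer URL are placeholders):
```sh
# list members to find the target member's ID
etcdctl member list

# update its advertised peer URLs, then restart that member
etcdctl member update a8266ecf031671f3 http://10.0.1.10:2380
```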

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Design of Runtime Reconfiguration
Runtime reconfiguration is one of the hardest and most error prone features in a distributed system, especially in a consensus based system like etcd.
@ -31,21 +26,21 @@ We think runtime reconfiguration should be a low frequent operation. We made the
If a cluster permanently loses a majority of its members, a new cluster will need to be started from an old data directory to recover the previous state.
It is entirely possible to force-remove the failed members from the existing cluster to recover. However, we decided not to support this method since it bypasses the normal consensus committing phase, which is unsafe. If the member to remove is not actually dead, or you force-remove different members through different members in the same cluster, you will end up with diverged clusters with the same clusterID. This is very dangerous and hard to debug/fix afterwards.
If you have a correct deployment, the possibility of permanent majority loss is very low. But it is a severe enough problem that it is worth special care. We strongly suggest you read the [disaster recovery documentation][disaster-recovery] and prepare for permanent majority loss before you put etcd into production.
## Do Not Use Public Discovery Service For Runtime Reconfiguration
The public discovery service should only be used for bootstrapping a cluster. To join a member to an existing cluster, use the runtime reconfiguration API.
The discovery service is designed for bootstrapping an etcd cluster in a cloud environment, when the IP addresses of all the members are not known beforehand. After a cluster is successfully bootstrapped, the IP addresses of all the members are known, so the discovery service is technically no longer needed.
Using the public discovery service may seem like a convenient way to do runtime reconfiguration, since the discovery service already holds all the cluster configuration information. However, relying on the public discovery service brings trouble:
1. it introduces an external dependency for the entire life-cycle of your cluster, not just bootstrap time. If there is a network issue between your cluster and the public discovery service, your cluster will suffer from it.
2. the public discovery service must reflect the correct runtime configuration of your cluster throughout its life-cycle. It has to provide a security mechanism to prevent bad actions, and that is hard.
3. the public discovery service has to keep tens of thousands of cluster configurations. Our public discovery service backend is not ready for that workload.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Security Model
etcd supports SSL/TLS as well as authentication through client certificates, both for clients to server as well as peer (server to server / cluster) communication.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Tuning
The default settings in etcd should work well for installations on a local network where the average network latency is low.

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Upgrade etcd to 2.1
In the general case, upgrading from etcd 2.0 to 2.1 can be a zero-downtime, rolling upgrade:
@ -17,11 +12,11 @@ Before [starting an upgrade](#upgrade-procedure), read through the rest of this
To upgrade an existing etcd deployment to 2.1, you must be running 2.0. If you're running a version of etcd before 2.0, you must upgrade to [2.0][v2.0] before upgrading to 2.1.
Also, to ensure a smooth rolling upgrade, your running cluster must be healthy. You can check the health of the cluster by using the `etcdctl cluster-health` command.
### Preparedness
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
You might also want to [backup your data directory][backup-datastore] for a potential [downgrade](#downgrade).
@ -43,7 +38,7 @@ If you have even more data, this might take more time. If you have a data size l
### Downgrade
If all members have been upgraded to v2.1, the cluster will be upgraded to v2.1, and downgrade is **not possible**. If any member is still v2.0, the cluster will remain in v2.0, and you can go back to use v2.0 binary.
Please [backup your data directory][backup-datastore] of all etcd members if you want to downgrade the cluster, even if it is upgraded.
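A sketch of taking such a backup with the v2 tooling; the directories below are placeholders:
```sh
etcdctl backup \
  --data-dir /var/lib/etcd \
  --backup-dir /var/lib/etcd-backup
```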
@ -101,7 +96,7 @@ member 924e2e83e93f2560 is healthy
member a8266ecf031671f3 is healthy
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
# Upgrade etcd from 2.1 to 2.2
In the general case, upgrading from etcd 2.1 to 2.2 can be a zero-downtime, rolling upgrade:
@ -18,11 +13,11 @@ Before [starting an upgrade](#upgrade-procedure), read through the rest of this
To upgrade an existing etcd deployment to 2.2, you must be running 2.1. If you're running a version of etcd before 2.1, you must upgrade to [2.1][v2.1] before upgrading to 2.2.
Also, to ensure a smooth rolling upgrade, your running cluster must be healthy. You can check the health of the cluster by using the `etcdctl cluster-health` command.
### Preparedness
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
You might also want to [backup the data directory][backup-datastore] for a potential [downgrade].
@ -36,11 +31,11 @@ Internally, etcd members negotiate with each other to determine the overall etcd
If you have a data size larger than 100MB you should contact us before upgrading, so we can make sure the upgrades work smoothly.
Every etcd 2.2 member will do health checking across the cluster periodically. etcd 2.1 members do not support health checking. During the upgrade, etcd 2.2 members will log warnings about the unhealthy state of etcd 2.1 members. You can ignore the warnings.
### Downgrade
If all members have been upgraded to v2.2, the cluster will be upgraded to v2.2, and downgrade is **not possible**. If any member is still v2.1, the cluster will remain in v2.1, and you can go back to use v2.1 binary.
Please [backup the data directory][backup-datastore] of all etcd members if you want to downgrade the cluster, even if it is upgraded.
@ -117,7 +112,7 @@ member a8266ecf031671f3 is healthy: got healthy result from http://localhost:123
cluster is healthy
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish

View File

@ -1,8 +1,3 @@
**This is the documentation for etcd2 releases. Read [etcd3 doc][v3-docs] for etcd3 releases.**
[v3-docs]: ../docs.md#documentation
## Upgrade etcd from 2.2 to 2.3
In the general case, upgrading from etcd 2.2 to 2.3 can be a zero-downtime, rolling upgrade:

57
NEWS
View File

@ -1,3 +1,60 @@
etcd v3.2.0 (2017-06-09)
- improved backend read concurrency
- embedded etcd
- Etcd.Peers field is now []*peerListener
- RPCs
- add Election, Lock service
- native client etcdserver/api/v3client
- client "embedded" in the server
- v3 client
- LeaseTimeToLive returns TTL=-1 resp on lease not found
- clientv3.NewFromConfigFile is moved to clientv3/yaml.NewConfig
- STM prefetching
- add namespace feature
- concurrency package's elections updated to match RPC interfaces
- let client dial endpoints not in the balancer
- add ErrOldCluster with server version checking
- translate WithPrefix() into WithFromKey() for empty key
- v3 etcdctl
- add check perf command
- add --from-key flag to role grant-permission command
- lock command takes an optional command to execute
- etcd flags
- add --enable-v2 flag to configure v2 backend (enabled by default)
- add --auth-token flag
- gRPC proxy
- proxy endpoint discovery
- namespaces
- coalesce lease requests
- gateway
- support DNS SRV priority
- auth
- support Watch API
- JWT tokens
- logging, monitoring
- server warns large snapshot operations
- add 'etcd_debugging_server_lease_expired_total' metrics
- security
- deny incoming peer certs with wrong IP SAN
- resolve TLS DNSNames when SAN checking
- reload TLS certificates on every client connection
- release
- annotate acbuild with supports-systemd-notify
- add nsswitch.conf to Docker container image
- add ppc64le, arm64(experimental) builds
- Go 1.8.3
- gRPC v1.2.1
- grpc-gateway to v1.2.0
- v2
- allow snapshot over 512MB
etcd v3.1.9 (2017-06-09)
- allow v2 snapshot over 512MB
etcd v3.1.8 (2017-05-19)
etcd v3.1.7 (2017-04-28)
etcd v3.1.6 (2017-04-19)
- remove auth check in Status API
- fill in Auth API response header

View File

@ -97,7 +97,9 @@ func prepareOpts(opts map[string]string) (jwtSignMethod, jwtPubKeyPath, jwtPrivK
return "", "", "", ErrInvalidAuthOpts
}
}
if len(jwtSignMethod) == 0 {
return "", "", "", ErrInvalidAuthOpts
}
return jwtSignMethod, jwtPubKeyPath, jwtPrivKeyPath, nil
}

94
auth/jwt_test.go Normal file
View File

@ -0,0 +1,94 @@
// Copyright 2017 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package auth
import (
"context"
"testing"
)
const (
jwtPubKey = "../integration/fixtures/server.crt"
jwtPrivKey = "../integration/fixtures/server.key.insecure"
)
func TestJWTInfo(t *testing.T) {
opts := map[string]string{
"pub-key": jwtPubKey,
"priv-key": jwtPrivKey,
"sign-method": "RS256",
}
jwt, err := newTokenProviderJWT(opts)
if err != nil {
t.Fatal(err)
}
token, aerr := jwt.assign(context.TODO(), "abc", 123)
if aerr != nil {
t.Fatal(aerr)
}
ai, ok := jwt.info(context.TODO(), token, 123)
if !ok {
t.Fatalf("failed to authenticate with token %s", token)
}
if ai.Revision != 123 {
t.Fatalf("expected revision 123, got %d", ai.Revision)
}
ai, ok = jwt.info(context.TODO(), "aaa", 120)
if ok || ai != nil {
t.Fatalf("expected aaa to fail to authenticate, got %+v", ai)
}
}
func TestJWTBad(t *testing.T) {
opts := map[string]string{
"pub-key": jwtPubKey,
"priv-key": jwtPrivKey,
"sign-method": "RS256",
}
// private key instead of public key
opts["pub-key"] = jwtPrivKey
if _, err := newTokenProviderJWT(opts); err == nil {
t.Fatalf("expected failure on missing public key")
}
opts["pub-key"] = jwtPubKey
// public key instead of private key
opts["priv-key"] = jwtPubKey
if _, err := newTokenProviderJWT(opts); err == nil {
t.Fatalf("expected failure on missing public key")
}
opts["priv-key"] = jwtPrivKey
// missing signing option
delete(opts, "sign-method")
if _, err := newTokenProviderJWT(opts); err == nil {
t.Fatal("expected error on missing option")
}
opts["sign-method"] = "RS256"
// bad file for pubkey
opts["pub-key"] = "whatever"
if _, err := newTokenProviderJWT(opts); err == nil {
t.Fatalf("expected failure on missing public key")
}
opts["pub-key"] = jwtPubKey
// bad file for private key
opts["priv-key"] = "whatever"
if _, err := newTokenProviderJWT(opts); err == nil {
t.Fatalf("expeceted failure on missing private key")
}
opts["priv-key"] = jwtPrivKey
}

View File

@ -118,6 +118,8 @@ func (t *tokenSimple) genTokenPrefix() (string, error) {
func (t *tokenSimple) assignSimpleTokenToUser(username, token string) {
t.simpleTokensMu.Lock()
defer t.simpleTokensMu.Unlock()
_, ok := t.simpleTokens[token]
if ok {
plog.Panicf("token %s is alredy used", token)
@ -125,7 +127,6 @@ func (t *tokenSimple) assignSimpleTokenToUser(username, token string) {
t.simpleTokens[token] = username
t.simpleTokenKeeper.addSimpleToken(token)
t.simpleTokensMu.Unlock()
}
func (t *tokenSimple) invalidateUser(username string) {

View File

@ -162,6 +162,9 @@ type AuthStore interface {
// AuthInfoFromTLS gets AuthInfo from TLS info of gRPC's context
AuthInfoFromTLS(ctx context.Context) *AuthInfo
// WithRoot generates and installs a token that can be used as a root credential
WithRoot(ctx context.Context) context.Context
}
type TokenProvider interface {
@ -997,8 +1000,12 @@ func (as *authStore) AuthInfoFromCtx(ctx context.Context) (*AuthInfo, error) {
return nil, nil
}
ts, tok := md["token"]
if !tok {
//TODO(mitake|hexfusion) review unifying key names
ts, ok := md["token"]
if !ok {
ts, ok = md["authorization"]
}
if !ok {
return nil, nil
}
@ -1008,6 +1015,7 @@ func (as *authStore) AuthInfoFromCtx(ctx context.Context) (*AuthInfo, error) {
plog.Warningf("invalid auth token: %s", token)
return nil, ErrInvalidAuthToken
}
return authInfo, nil
}
@ -1057,3 +1065,35 @@ func NewTokenProvider(tokenOpts string, indexWaiter func(uint64) <-chan struct{}
return nil, ErrInvalidAuthOpts
}
}
func (as *authStore) WithRoot(ctx context.Context) context.Context {
if !as.isAuthEnabled() {
return ctx
}
var ctxForAssign context.Context
if ts := as.tokenProvider.(*tokenSimple); ts != nil {
ctx1 := context.WithValue(ctx, "index", uint64(0))
prefix, err := ts.genTokenPrefix()
if err != nil {
plog.Errorf("failed to generate prefix of internally used token")
return ctx
}
ctxForAssign = context.WithValue(ctx1, "simpleToken", prefix)
} else {
ctxForAssign = ctx
}
token, err := as.tokenProvider.assign(ctxForAssign, "root", as.Revision())
if err != nil {
// this must not happen
plog.Errorf("failed to assign token for lease revoking: %s", err)
return ctx
}
mdMap := map[string]string{
"token": token,
}
tokenMD := metadata.New(mdMap)
return metadata.NewContext(ctx, tokenMD)
}

View File

@ -27,19 +27,19 @@
]
},
{
"project": "github.com/cockroachdb/cmux",
"project": "github.com/boltdb/bolt",
"licenses": [
{
"type": "Apache License 2.0",
"type": "MIT License",
"confidence": 1
}
]
},
{
"project": "github.com/coreos/bbolt",
"project": "github.com/cockroachdb/cmux",
"licenses": [
{
"type": "MIT License",
"type": "Apache License 2.0",
"confidence": 1
}
]
@ -346,7 +346,7 @@
]
},
{
"project": "google.golang.org/genproto/googleapis",
"project": "google.golang.org/genproto/googleapis/rpc/status",
"licenses": [
{
"type": "Apache License 2.0",
@ -358,8 +358,8 @@
"project": "google.golang.org/grpc",
"licenses": [
{
"type": "Apache License 2.0",
"confidence": 1
"type": "BSD 3-clause \"New\" or \"Revised\" License",
"confidence": 0.979253112033195
}
]
},

View File

@ -372,7 +372,12 @@ func (c *httpClusterClient) Do(ctx context.Context, act httpAction) (*http.Respo
if err == context.Canceled || err == context.DeadlineExceeded {
return nil, nil, err
}
} else if resp.StatusCode/100 == 5 {
if isOneShot {
return nil, nil, err
}
continue
}
if resp.StatusCode/100 == 5 {
switch resp.StatusCode {
case http.StatusInternalServerError, http.StatusServiceUnavailable:
// TODO: make sure this is a no leader response
@ -380,16 +385,10 @@ func (c *httpClusterClient) Do(ctx context.Context, act httpAction) (*http.Respo
default:
cerr.Errors = append(cerr.Errors, fmt.Errorf("client: etcd member %s returns server error [%s]", eps[k].String(), http.StatusText(resp.StatusCode)))
}
err = cerr.Errors[0]
}
if err != nil {
if !isOneShot {
continue
if isOneShot {
return nil, nil, cerr.Errors[0]
}
c.Lock()
c.pinned = (k + 1) % leps
c.Unlock()
return nil, nil, err
continue
}
if k != pinned {
c.Lock()

View File

@ -16,7 +16,6 @@ package client
import (
"errors"
"fmt"
"io"
"io/ioutil"
"math/rand"
@ -305,9 +304,7 @@ func TestHTTPClusterClientDo(t *testing.T) {
fakeErr := errors.New("fake!")
fakeURL := url.URL{}
tests := []struct {
client *httpClusterClient
ctx context.Context
client *httpClusterClient
wantCode int
wantErr error
wantPinned int
@ -398,30 +395,10 @@ func TestHTTPClusterClientDo(t *testing.T) {
wantCode: http.StatusTeapot,
wantPinned: 1,
},
// 500-level errors cause one shot Do to fallthrough to next endpoint
{
client: &httpClusterClient{
endpoints: []url.URL{fakeURL, fakeURL},
clientFactory: newStaticHTTPClientFactory(
[]staticHTTPResponse{
{resp: http.Response{StatusCode: http.StatusBadGateway}},
{resp: http.Response{StatusCode: http.StatusTeapot}},
},
),
rand: rand.New(rand.NewSource(0)),
},
ctx: context.WithValue(context.Background(), &oneShotCtxValue, &oneShotCtxValue),
wantErr: fmt.Errorf("client: etcd member returns server error [Bad Gateway]"),
wantPinned: 1,
},
}
for i, tt := range tests {
if tt.ctx == nil {
tt.ctx = context.Background()
}
resp, _, err := tt.client.Do(tt.ctx, nil)
resp, _, err := tt.client.Do(context.Background(), nil)
if !reflect.DeepEqual(tt.wantErr, err) {
t.Errorf("#%d: got err=%v, want=%v", i, err, tt.wantErr)
continue
@ -430,9 +407,11 @@ func TestHTTPClusterClientDo(t *testing.T) {
if resp == nil {
if tt.wantCode != 0 {
t.Errorf("#%d: resp is nil, want=%d", i, tt.wantCode)
continue
}
} else if resp.StatusCode != tt.wantCode {
continue
}
if resp.StatusCode != tt.wantCode {
t.Errorf("#%d: resp code=%d, want=%d", i, resp.StatusCode, tt.wantCode)
continue
}

View File

@ -59,7 +59,7 @@ Use a custom context to set timeouts on your operations:
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// set a new key, ignoring it's previous state
// set a new key, ignoring its previous state
_, err := kAPI.Set(ctx, "/ping", "pong", nil)
if err != nil {
if err == context.DeadlineExceeded {

View File

@ -0,0 +1,93 @@
// Copyright 2017 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package client_test
import (
"fmt"
"log"
"sort"
"github.com/coreos/etcd/client"
"golang.org/x/net/context"
)
func ExampleKeysAPI_directory() {
c, err := client.New(client.Config{
Endpoints: exampleEndpoints,
Transport: exampleTransport,
})
if err != nil {
log.Fatal(err)
}
kapi := client.NewKeysAPI(c)
// Setting '/myNodes' to create a directory that will hold some keys.
o := client.SetOptions{Dir: true}
resp, err := kapi.Set(context.Background(), "/myNodes", "", &o)
if err != nil {
log.Fatal(err)
}
// Add keys to /myNodes directory.
resp, err = kapi.Set(context.Background(), "/myNodes/key1", "value1", nil)
if err != nil {
log.Fatal(err)
}
resp, err = kapi.Set(context.Background(), "/myNodes/key2", "value2", nil)
if err != nil {
log.Fatal(err)
}
// fetch directory
resp, err = kapi.Get(context.Background(), "/myNodes", nil)
if err != nil {
log.Fatal(err)
}
// print directory keys
sort.Sort(resp.Node.Nodes)
for _, n := range resp.Node.Nodes {
fmt.Printf("Key: %q, Value: %q\n", n.Key, n.Value)
}
// Output:
// Key: "/myNodes/key1", Value: "value1"
// Key: "/myNodes/key2", Value: "value2"
}
func ExampleKeysAPI_setget() {
c, err := client.New(client.Config{
Endpoints: exampleEndpoints,
Transport: exampleTransport,
})
if err != nil {
log.Fatal(err)
}
kapi := client.NewKeysAPI(c)
// Set key "/foo" to value "bar".
resp, err := kapi.Set(context.Background(), "/foo", "bar", nil)
if err != nil {
log.Fatal(err)
}
// Get key "/foo"
resp, err = kapi.Get(context.Background(), "/foo", nil)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%q key has %q value\n", resp.Node.Key, resp.Node.Value)
// Output: "/foo" key has "bar" value
}

77
client/main_test.go Normal file
View File

@ -0,0 +1,77 @@
// Copyright 2017 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package client_test
import (
"fmt"
"net/http"
"os"
"regexp"
"strings"
"testing"
"time"
"github.com/coreos/etcd/integration"
"github.com/coreos/etcd/pkg/testutil"
"github.com/coreos/etcd/pkg/transport"
)
var exampleEndpoints []string
var exampleTransport *http.Transport
// TestMain sets up an etcd cluster if running the examples.
func TestMain(m *testing.M) {
useCluster, hasRunArg := false, false // default to running only Test*
for _, arg := range os.Args {
if strings.HasPrefix(arg, "-test.run=") {
exp := strings.Split(arg, "=")[1]
match, err := regexp.MatchString(exp, "Example")
useCluster = (err == nil && match) || strings.Contains(exp, "Example")
hasRunArg = true
break
}
}
if !hasRunArg {
// force only running Test* if no args given to avoid leak false
// positives from having a long-running cluster for the examples.
os.Args = append(os.Args, "-test.run=Test")
}
v := 0
if useCluster {
tr, trerr := transport.NewTransport(transport.TLSInfo{}, time.Second)
if trerr != nil {
fmt.Fprintf(os.Stderr, "%v", trerr)
os.Exit(1)
}
cfg := integration.ClusterConfig{Size: 1}
clus := integration.NewClusterV3(nil, &cfg)
exampleEndpoints = []string{clus.Members[0].URL()}
exampleTransport = tr
v = m.Run()
clus.Terminate(nil)
if err := testutil.CheckAfterTest(time.Second); err != nil {
fmt.Fprintf(os.Stderr, "%v", err)
os.Exit(1)
}
} else {
v = m.Run()
}
if v == 0 && testutil.CheckLeakedGoroutine() {
os.Exit(1)
}
os.Exit(v)
}

View File

@ -44,7 +44,7 @@ type Member struct {
PeerURLs []string `json:"peerURLs"`
// ClientURLs represents the HTTP(S) endpoints on which this Member
// serves it's client-facing APIs.
// serves its client-facing APIs.
ClientURLs []string `json:"clientURLs"`
}

View File

@ -1,6 +1,6 @@
# etcd/clientv3
[![Godoc](https://img.shields.io/badge/go-documentation-blue.svg?style=flat-square)](https://godoc.org/github.com/coreos/etcd/clientv3)
[![Godoc](http://img.shields.io/badge/go-documentation-blue.svg?style=flat-square)](https://godoc.org/github.com/coreos/etcd/clientv3)
`etcd/clientv3` is the official Go etcd client for v3.

View File

@ -20,7 +20,6 @@ import (
"github.com/coreos/etcd/auth/authpb"
pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
"golang.org/x/net/context"
"google.golang.org/grpc"
)
@ -105,16 +104,16 @@ type auth struct {
}
func NewAuth(c *Client) Auth {
return &auth{remote: RetryAuthClient(c)}
return &auth{remote: pb.NewAuthClient(c.ActiveConnection())}
}
func (auth *auth) AuthEnable(ctx context.Context) (*AuthEnableResponse, error) {
resp, err := auth.remote.AuthEnable(ctx, &pb.AuthEnableRequest{})
resp, err := auth.remote.AuthEnable(ctx, &pb.AuthEnableRequest{}, grpc.FailFast(false))
return (*AuthEnableResponse)(resp), toErr(ctx, err)
}
func (auth *auth) AuthDisable(ctx context.Context) (*AuthDisableResponse, error) {
resp, err := auth.remote.AuthDisable(ctx, &pb.AuthDisableRequest{})
resp, err := auth.remote.AuthDisable(ctx, &pb.AuthDisableRequest{}, grpc.FailFast(false))
return (*AuthDisableResponse)(resp), toErr(ctx, err)
}
@ -139,12 +138,12 @@ func (auth *auth) UserGrantRole(ctx context.Context, user string, role string) (
}
func (auth *auth) UserGet(ctx context.Context, name string) (*AuthUserGetResponse, error) {
resp, err := auth.remote.UserGet(ctx, &pb.AuthUserGetRequest{Name: name})
resp, err := auth.remote.UserGet(ctx, &pb.AuthUserGetRequest{Name: name}, grpc.FailFast(false))
return (*AuthUserGetResponse)(resp), toErr(ctx, err)
}
func (auth *auth) UserList(ctx context.Context) (*AuthUserListResponse, error) {
resp, err := auth.remote.UserList(ctx, &pb.AuthUserListRequest{})
resp, err := auth.remote.UserList(ctx, &pb.AuthUserListRequest{}, grpc.FailFast(false))
return (*AuthUserListResponse)(resp), toErr(ctx, err)
}
@ -169,12 +168,12 @@ func (auth *auth) RoleGrantPermission(ctx context.Context, name string, key, ran
}
func (auth *auth) RoleGet(ctx context.Context, role string) (*AuthRoleGetResponse, error) {
resp, err := auth.remote.RoleGet(ctx, &pb.AuthRoleGetRequest{Role: role})
resp, err := auth.remote.RoleGet(ctx, &pb.AuthRoleGetRequest{Role: role}, grpc.FailFast(false))
return (*AuthRoleGetResponse)(resp), toErr(ctx, err)
}
func (auth *auth) RoleList(ctx context.Context) (*AuthRoleListResponse, error) {
resp, err := auth.remote.RoleList(ctx, &pb.AuthRoleListRequest{})
resp, err := auth.remote.RoleList(ctx, &pb.AuthRoleListRequest{}, grpc.FailFast(false))
return (*AuthRoleListResponse)(resp), toErr(ctx, err)
}
@ -202,7 +201,7 @@ type authenticator struct {
}
func (auth *authenticator) authenticate(ctx context.Context, name string, password string) (*AuthenticateResponse, error) {
resp, err := auth.remote.Authenticate(ctx, &pb.AuthenticateRequest{Name: name, Password: password})
resp, err := auth.remote.Authenticate(ctx, &pb.AuthenticateRequest{Name: name, Password: password}, grpc.FailFast(false))
return (*AuthenticateResponse)(resp), toErr(ctx, err)
}

356
clientv3/balancer.go Normal file
View File

@ -0,0 +1,356 @@
// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package clientv3
import (
"net/url"
"strings"
"sync"
"golang.org/x/net/context"
"google.golang.org/grpc"
"google.golang.org/grpc/codes"
)
// ErrNoAddrAvilable is returned by Get() when the balancer does not have
// any active connection to endpoints at the time.
// This error is returned only when opts.BlockingWait is false.
var ErrNoAddrAvilable = grpc.Errorf(codes.Unavailable, "there is no address available")
// simpleBalancer does the bare minimum to expose multiple eps
// to the grpc reconnection code path
type simpleBalancer struct {
// addrs are the client's endpoints for grpc
addrs []grpc.Address
// notifyCh notifies grpc of the set of addresses for connecting
notifyCh chan []grpc.Address
// readyc closes once the first connection is up
readyc chan struct{}
readyOnce sync.Once
// mu protects upEps, pinAddr, and connectingAddr
mu sync.RWMutex
// upc closes when upEps transitions from empty to non-zero or the balancer closes.
upc chan struct{}
// downc closes when grpc calls down() on pinAddr
downc chan struct{}
// stopc is closed to signal updateNotifyLoop should stop.
stopc chan struct{}
// donec closes when all goroutines are exited
donec chan struct{}
// updateAddrsC notifies updateNotifyLoop to update addrs.
updateAddrsC chan struct{}
// grpc issues TLS cert checks using the string passed into dial so
// that string must be the host. To recover the full scheme://host URL,
// have a map from hosts to the original endpoint.
host2ep map[string]string
// pinAddr is the currently pinned address; set to the empty string on
// initialization and shutdown.
pinAddr string
closed bool
}
func newSimpleBalancer(eps []string) *simpleBalancer {
notifyCh := make(chan []grpc.Address, 1)
addrs := make([]grpc.Address, len(eps))
for i := range eps {
addrs[i].Addr = getHost(eps[i])
}
sb := &simpleBalancer{
addrs: addrs,
notifyCh: notifyCh,
readyc: make(chan struct{}),
upc: make(chan struct{}),
stopc: make(chan struct{}),
downc: make(chan struct{}),
donec: make(chan struct{}),
updateAddrsC: make(chan struct{}, 1),
host2ep: getHost2ep(eps),
}
close(sb.downc)
go sb.updateNotifyLoop()
return sb
}
func (b *simpleBalancer) Start(target string, config grpc.BalancerConfig) error { return nil }
func (b *simpleBalancer) ConnectNotify() <-chan struct{} {
b.mu.Lock()
defer b.mu.Unlock()
return b.upc
}
func (b *simpleBalancer) getEndpoint(host string) string {
b.mu.Lock()
defer b.mu.Unlock()
return b.host2ep[host]
}
func getHost2ep(eps []string) map[string]string {
hm := make(map[string]string, len(eps))
for i := range eps {
_, host, _ := parseEndpoint(eps[i])
hm[host] = eps[i]
}
return hm
}
func (b *simpleBalancer) updateAddrs(eps []string) {
np := getHost2ep(eps)
b.mu.Lock()
match := len(np) == len(b.host2ep)
for k, v := range np {
if b.host2ep[k] != v {
match = false
break
}
}
if match {
// same endpoints, so no need to update address
b.mu.Unlock()
return
}
b.host2ep = np
addrs := make([]grpc.Address, 0, len(eps))
for i := range eps {
addrs = append(addrs, grpc.Address{Addr: getHost(eps[i])})
}
b.addrs = addrs
// updating notifyCh can trigger new connections,
// only update addrs if all connections are down
// or addrs does not include pinAddr.
update := !hasAddr(addrs, b.pinAddr)
b.mu.Unlock()
if update {
select {
case b.updateAddrsC <- struct{}{}:
case <-b.stopc:
}
}
}
func hasAddr(addrs []grpc.Address, targetAddr string) bool {
for _, addr := range addrs {
if targetAddr == addr.Addr {
return true
}
}
return false
}
func (b *simpleBalancer) updateNotifyLoop() {
defer close(b.donec)
for {
b.mu.RLock()
upc, downc, addr := b.upc, b.downc, b.pinAddr
b.mu.RUnlock()
// downc or upc should be closed
select {
case <-downc:
downc = nil
default:
}
select {
case <-upc:
upc = nil
default:
}
switch {
case downc == nil && upc == nil:
// stale
select {
case <-b.stopc:
return
default:
}
case downc == nil:
b.notifyAddrs()
select {
case <-upc:
case <-b.updateAddrsC:
b.notifyAddrs()
case <-b.stopc:
return
}
case upc == nil:
select {
// close connections that are not the pinned address
case b.notifyCh <- []grpc.Address{{Addr: addr}}:
case <-downc:
case <-b.stopc:
return
}
select {
case <-downc:
case <-b.updateAddrsC:
case <-b.stopc:
return
}
b.notifyAddrs()
}
}
}
func (b *simpleBalancer) notifyAddrs() {
b.mu.RLock()
addrs := b.addrs
b.mu.RUnlock()
select {
case b.notifyCh <- addrs:
case <-b.stopc:
}
}
func (b *simpleBalancer) Up(addr grpc.Address) func(error) {
b.mu.Lock()
defer b.mu.Unlock()
// gRPC might call Up after it called Close. We add this check
// to "fix" it up at application layer. Or our simplerBalancer
// might panic since b.upc is closed.
if b.closed {
return func(err error) {}
}
// gRPC might call Up on a stale address.
// Prevent updating pinAddr with a stale address.
if !hasAddr(b.addrs, addr.Addr) {
return func(err error) {}
}
if b.pinAddr != "" {
return func(err error) {}
}
// notify waiting Get()s and pin first connected address
close(b.upc)
b.downc = make(chan struct{})
b.pinAddr = addr.Addr
// notify client that a connection is up
b.readyOnce.Do(func() { close(b.readyc) })
return func(err error) {
b.mu.Lock()
b.upc = make(chan struct{})
close(b.downc)
b.pinAddr = ""
b.mu.Unlock()
}
}
func (b *simpleBalancer) Get(ctx context.Context, opts grpc.BalancerGetOptions) (grpc.Address, func(), error) {
var (
addr string
closed bool
)
// If opts.BlockingWait is false (for fail-fast RPCs), it should return
// an address it has notified via Notify immediately instead of blocking.
if !opts.BlockingWait {
b.mu.RLock()
closed = b.closed
addr = b.pinAddr
b.mu.RUnlock()
if closed {
return grpc.Address{Addr: ""}, nil, grpc.ErrClientConnClosing
}
if addr == "" {
return grpc.Address{Addr: ""}, nil, ErrNoAddrAvilable
}
return grpc.Address{Addr: addr}, func() {}, nil
}
for {
b.mu.RLock()
ch := b.upc
b.mu.RUnlock()
select {
case <-ch:
case <-b.donec:
return grpc.Address{Addr: ""}, nil, grpc.ErrClientConnClosing
case <-ctx.Done():
return grpc.Address{Addr: ""}, nil, ctx.Err()
}
b.mu.RLock()
closed = b.closed
addr = b.pinAddr
b.mu.RUnlock()
// Close(), which sets b.closed = true, can be called before Get(); Get() must exit if the balancer is closed.
if closed {
return grpc.Address{Addr: ""}, nil, grpc.ErrClientConnClosing
}
if addr != "" {
break
}
}
return grpc.Address{Addr: addr}, func() {}, nil
}
func (b *simpleBalancer) Notify() <-chan []grpc.Address { return b.notifyCh }
func (b *simpleBalancer) Close() error {
b.mu.Lock()
// In case gRPC calls close twice. TODO: remove this check
// when we are sure that gRPC won't call close twice.
if b.closed {
b.mu.Unlock()
<-b.donec
return nil
}
b.closed = true
close(b.stopc)
b.pinAddr = ""
// In the following scenario:
// 1. upc is not closed; no pinned address
// 2. client issues an rpc, calling invoke(), which calls Get(), enters for loop, blocks
// 3. clientconn.Close() calls balancer.Close(); closed = true
// 4. for loop in Get() never exits since ctx is the context passed in by the client and may not be canceled
// we must close upc so Get() exits from blocking on upc
select {
case <-b.upc:
default:
// terminate all waiting Get()s
close(b.upc)
}
b.mu.Unlock()
// wait for updateNotifyLoop to finish
<-b.donec
close(b.notifyCh)
return nil
}
func getHost(ep string) string {
url, uerr := url.Parse(ep)
if uerr != nil || !strings.Contains(ep, "://") {
return ep
}
return url.Host
}

239
clientv3/balancer_test.go Normal file
View File

@ -0,0 +1,239 @@
// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package clientv3
import (
"errors"
"net"
"sync"
"testing"
"time"
pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
"github.com/coreos/etcd/pkg/testutil"
"golang.org/x/net/context"
"google.golang.org/grpc"
)
var (
endpoints = []string{"localhost:2379", "localhost:22379", "localhost:32379"}
)
func TestBalancerGetUnblocking(t *testing.T) {
sb := newSimpleBalancer(endpoints)
defer sb.Close()
if addrs := <-sb.Notify(); len(addrs) != len(endpoints) {
t.Errorf("Initialize newSimpleBalancer should have triggered Notify() chan, but it didn't")
}
unblockingOpts := grpc.BalancerGetOptions{BlockingWait: false}
_, _, err := sb.Get(context.Background(), unblockingOpts)
if err != ErrNoAddrAvilable {
t.Errorf("Get() with no up endpoints should return ErrNoAddrAvailable, got: %v", err)
}
down1 := sb.Up(grpc.Address{Addr: endpoints[1]})
if addrs := <-sb.Notify(); len(addrs) != 1 {
t.Errorf("first Up() should have triggered balancer to send the first connected address via Notify chan so that other connections can be closed")
}
down2 := sb.Up(grpc.Address{Addr: endpoints[2]})
addrFirst, putFun, err := sb.Get(context.Background(), unblockingOpts)
if err != nil {
t.Errorf("Get() with up endpoints should success, got %v", err)
}
if addrFirst.Addr != endpoints[1] {
t.Errorf("Get() didn't return expected address, got %v", addrFirst)
}
if putFun == nil {
t.Errorf("Get() returned unexpected nil put function")
}
addrSecond, _, _ := sb.Get(context.Background(), unblockingOpts)
if addrFirst.Addr != addrSecond.Addr {
t.Errorf("Get() didn't return the same address as previous call, got %v and %v", addrFirst, addrSecond)
}
down1(errors.New("error"))
if addrs := <-sb.Notify(); len(addrs) != len(endpoints) {
t.Errorf("closing the only connection should triggered balancer to send the all endpoints via Notify chan so that we can establish a connection")
}
down2(errors.New("error"))
_, _, err = sb.Get(context.Background(), unblockingOpts)
if err != ErrNoAddrAvilable {
t.Errorf("Get() with no up endpoints should return ErrNoAddrAvailable, got: %v", err)
}
}
func TestBalancerGetBlocking(t *testing.T) {
sb := newSimpleBalancer(endpoints)
defer sb.Close()
if addrs := <-sb.Notify(); len(addrs) != len(endpoints) {
t.Errorf("Initialize newSimpleBalancer should have triggered Notify() chan, but it didn't")
}
blockingOpts := grpc.BalancerGetOptions{BlockingWait: true}
ctx, _ := context.WithTimeout(context.Background(), time.Millisecond*100)
_, _, err := sb.Get(ctx, blockingOpts)
if err != context.DeadlineExceeded {
t.Errorf("Get() with no up endpoints should timeout, got %v", err)
}
downC := make(chan func(error), 1)
go func() {
// ensure sb.Up() will be called after sb.Get() to see if Up() releases blocking Get()
time.Sleep(time.Millisecond * 100)
f := sb.Up(grpc.Address{Addr: endpoints[1]})
if addrs := <-sb.Notify(); len(addrs) != 1 {
t.Errorf("first Up() should have triggered balancer to send the first connected address via Notify chan so that other connections can be closed")
}
downC <- f
}()
addrFirst, putFun, err := sb.Get(context.Background(), blockingOpts)
if err != nil {
t.Errorf("Get() with up endpoints should success, got %v", err)
}
if addrFirst.Addr != endpoints[1] {
t.Errorf("Get() didn't return expected address, got %v", addrFirst)
}
if putFun == nil {
t.Errorf("Get() returned unexpected nil put function")
}
down1 := <-downC
down2 := sb.Up(grpc.Address{Addr: endpoints[2]})
addrSecond, _, _ := sb.Get(context.Background(), blockingOpts)
if addrFirst.Addr != addrSecond.Addr {
t.Errorf("Get() didn't return the same address as previous call, got %v and %v", addrFirst, addrSecond)
}
down1(errors.New("error"))
if addrs := <-sb.Notify(); len(addrs) != len(endpoints) {
t.Errorf("closing the only connection should triggered balancer to send the all endpoints via Notify chan so that we can establish a connection")
}
down2(errors.New("error"))
ctx, _ = context.WithTimeout(context.Background(), time.Millisecond*100)
_, _, err = sb.Get(ctx, blockingOpts)
if err != context.DeadlineExceeded {
t.Errorf("Get() with no up endpoints should timeout, got %v", err)
}
}
// TestBalancerDoNotBlockOnClose ensures that balancer and grpc don't deadlock each other
// due to rapid open/close conn. The deadlock causes balancer.Close() to block forever.
// See issue: https://github.com/coreos/etcd/issues/7283 for more detail.
func TestBalancerDoNotBlockOnClose(t *testing.T) {
defer testutil.AfterTest(t)
kcl := newKillConnListener(t, 3)
defer kcl.close()
for i := 0; i < 5; i++ {
sb := newSimpleBalancer(kcl.endpoints())
conn, err := grpc.Dial("", grpc.WithInsecure(), grpc.WithBalancer(sb))
if err != nil {
t.Fatal(err)
}
kvc := pb.NewKVClient(conn)
<-sb.readyc
var wg sync.WaitGroup
wg.Add(100)
cctx, cancel := context.WithCancel(context.TODO())
for j := 0; j < 100; j++ {
go func() {
defer wg.Done()
kvc.Range(cctx, &pb.RangeRequest{}, grpc.FailFast(false))
}()
}
// balancer.Close() might block
// if balancer and grpc deadlock each other.
bclosec, cclosec := make(chan struct{}), make(chan struct{})
go func() {
defer close(bclosec)
sb.Close()
}()
go func() {
defer close(cclosec)
conn.Close()
}()
select {
case <-bclosec:
case <-time.After(3 * time.Second):
testutil.FatalStack(t, "balancer close timeout")
}
select {
case <-cclosec:
case <-time.After(3 * time.Second):
t.Fatal("grpc conn close timeout")
}
cancel()
wg.Wait()
}
}
// killConnListener listens for incoming connections and kills them immediately.
type killConnListener struct {
wg sync.WaitGroup
eps []string
stopc chan struct{}
t *testing.T
}
func newKillConnListener(t *testing.T, size int) *killConnListener {
kcl := &killConnListener{stopc: make(chan struct{}), t: t}
for i := 0; i < size; i++ {
ln, err := net.Listen("tcp", ":0")
if err != nil {
t.Fatal(err)
}
kcl.eps = append(kcl.eps, ln.Addr().String())
kcl.wg.Add(1)
go kcl.listen(ln)
}
return kcl
}
func (kcl *killConnListener) endpoints() []string {
return kcl.eps
}
func (kcl *killConnListener) listen(l net.Listener) {
go func() {
defer kcl.wg.Done()
for {
conn, err := l.Accept()
select {
case <-kcl.stopc:
return
default:
}
if err != nil {
kcl.t.Fatal(err)
}
time.Sleep(1 * time.Millisecond)
conn.Close()
}
}()
<-kcl.stopc
l.Close()
}
func (kcl *killConnListener) close() {
close(kcl.stopc)
kcl.wg.Wait()
}

View File

@ -31,9 +31,7 @@ import (
"google.golang.org/grpc"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/credentials"
"google.golang.org/grpc/keepalive"
"google.golang.org/grpc/metadata"
"google.golang.org/grpc/status"
)
var (
@ -53,17 +51,18 @@ type Client struct {
conn *grpc.ClientConn
dialerrc chan error
cfg Config
creds *credentials.TransportCredentials
balancer *healthBalancer
mu sync.Mutex
cfg Config
creds *credentials.TransportCredentials
balancer *simpleBalancer
retryWrapper retryRpcFunc
retryAuthWrapper retryRpcFunc
ctx context.Context
cancel context.CancelFunc
// Username is a user name for authentication.
// Username is a username for authentication
Username string
// Password is a password for authentication.
// Password is a password for authentication
Password string
// tokenCred is an instance of WithPerRPCCredentials()'s argument
tokenCred *authTokenCredential
@ -117,23 +116,8 @@ func (c *Client) Endpoints() (eps []string) {
// SetEndpoints updates client's endpoints.
func (c *Client) SetEndpoints(eps ...string) {
c.mu.Lock()
c.cfg.Endpoints = eps
c.mu.Unlock()
c.balancer.updateAddrs(eps...)
// updating notifyCh can trigger new connections,
// need update addrs if all connections are down
// or addrs does not include pinAddr.
c.balancer.mu.RLock()
update := !hasAddr(c.balancer.addrs, c.balancer.pinAddr)
c.balancer.mu.RUnlock()
if update {
select {
case c.balancer.updateAddrsC <- notifyNext:
case <-c.balancer.stopc:
}
}
c.balancer.updateAddrs(eps)
}
// Sync synchronizes client's endpoints with the known endpoints from the etcd membership.
@ -160,10 +144,8 @@ func (c *Client) autoSync() {
case <-c.ctx.Done():
return
case <-time.After(c.cfg.AutoSyncInterval):
ctx, cancel := context.WithTimeout(c.ctx, 5*time.Second)
err := c.Sync(ctx)
cancel()
if err != nil && err != c.ctx.Err() {
ctx, _ := context.WithTimeout(c.ctx, 5*time.Second)
if err := c.Sync(ctx); err != nil && err != c.ctx.Err() {
logger.Println("Auto sync endpoints failed:", err)
}
}
@ -192,7 +174,7 @@ func parseEndpoint(endpoint string) (proto string, host string, scheme string) {
host = endpoint
url, uerr := url.Parse(endpoint)
if uerr != nil || !strings.Contains(endpoint, "://") {
return proto, host, scheme
return
}
scheme = url.Scheme
@ -206,7 +188,7 @@ func parseEndpoint(endpoint string) (proto string, host string, scheme string) {
default:
proto, host = "", ""
}
return proto, host, scheme
return
}
func (c *Client) processCreds(scheme string) (creds *credentials.TransportCredentials) {
@ -225,7 +207,7 @@ func (c *Client) processCreds(scheme string) (creds *credentials.TransportCreden
default:
creds = nil
}
return creds
return
}
// dialSetupOpts gives the dial opts prior to any authentication
@ -233,17 +215,10 @@ func (c *Client) dialSetupOpts(endpoint string, dopts ...grpc.DialOption) (opts
if c.cfg.DialTimeout > 0 {
opts = []grpc.DialOption{grpc.WithTimeout(c.cfg.DialTimeout)}
}
if c.cfg.DialKeepAliveTime > 0 {
params := keepalive.ClientParameters{
Time: c.cfg.DialKeepAliveTime,
Timeout: c.cfg.DialKeepAliveTimeout,
}
opts = append(opts, grpc.WithKeepaliveParams(params))
}
opts = append(opts, dopts...)
f := func(host string, t time.Duration) (net.Conn, error) {
proto, host, _ := parseEndpoint(c.balancer.endpoint(host))
proto, host, _ := parseEndpoint(c.balancer.getEndpoint(host))
if host == "" && endpoint != "" {
// dialing an endpoint not in the balancer; use
// endpoint passed into dial
@ -336,7 +311,7 @@ func (c *Client) dial(endpoint string, dopts ...grpc.DialOption) (*grpc.ClientCo
if err != nil {
if toErr(ctx, err) != rpctypes.ErrAuthNotEnabled {
if err == ctx.Err() && ctx.Err() != c.ctx.Err() {
err = context.DeadlineExceeded
err = grpc.ErrClientConnTimeout
}
return nil, err
}
@ -391,12 +366,9 @@ func newClient(cfg *Config) (*Client, error) {
client.Password = cfg.Password
}
client.balancer = newHealthBalancer(cfg.Endpoints, cfg.DialTimeout, func(ep string) (bool, error) {
return grpcHealthCheck(client, ep)
})
client.balancer = newSimpleBalancer(cfg.Endpoints)
// use Endpoints[0] so that for https:// without any tls config given, then
// grpc will assume the certificate server name is the endpoint host.
// grpc will assume the ServerName is in the endpoint.
conn, err := client.dial(cfg.Endpoints[0], grpc.WithBalancer(client.balancer))
if err != nil {
client.cancel()
@ -404,19 +376,21 @@ func newClient(cfg *Config) (*Client, error) {
return nil, err
}
client.conn = conn
client.retryWrapper = client.newRetryWrapper()
client.retryAuthWrapper = client.newAuthRetryWrapper()
// wait for a connection
if cfg.DialTimeout > 0 {
hasConn := false
waitc := time.After(cfg.DialTimeout)
select {
case <-client.balancer.ready():
case <-client.balancer.readyc:
hasConn = true
case <-ctx.Done():
case <-waitc:
}
if !hasConn {
err := context.DeadlineExceeded
err := grpc.ErrClientConnTimeout
select {
case err = <-client.dialerrc:
default:
@ -451,7 +425,7 @@ func (c *Client) checkVersion() (err error) {
errc := make(chan error, len(c.cfg.Endpoints))
ctx, cancel := context.WithCancel(c.ctx)
if c.cfg.DialTimeout > 0 {
ctx, cancel = context.WithTimeout(ctx, c.cfg.DialTimeout)
ctx, _ = context.WithTimeout(ctx, c.cfg.DialTimeout)
}
wg.Add(len(c.cfg.Endpoints))
for _, ep := range c.cfg.Endpoints {
@ -466,7 +440,7 @@ func (c *Client) checkVersion() (err error) {
vs := strings.Split(resp.Version, ".")
maj, min := 0, 0
if len(vs) >= 2 {
maj, _ = strconv.Atoi(vs[0])
maj, rerr = strconv.Atoi(vs[0])
min, rerr = strconv.Atoi(vs[1])
}
if maj < 3 || (maj == 3 && min < 2) {
@ -498,14 +472,14 @@ func isHaltErr(ctx context.Context, err error) bool {
if err == nil {
return false
}
ev, _ := status.FromError(err)
code := grpc.Code(err)
// Unavailable codes mean the system will be right back.
// (e.g., can't connect, lost leader)
// Treat Internal codes as if something failed, leaving the
// system in an inconsistent state, but retrying could make progress.
// (e.g., failed in middle of send, corrupted frame)
// TODO: are permanent Internal errors possible from grpc?
return ev.Code() != codes.Unavailable && ev.Code() != codes.Internal
return code != codes.Unavailable && code != codes.Internal
}
func toErr(ctx context.Context, err error) error {
@ -516,8 +490,7 @@ func toErr(ctx context.Context, err error) error {
if _, ok := err.(rpctypes.EtcdError); ok {
return err
}
ev, _ := status.FromError(err)
code := ev.Code()
code := grpc.Code(err)
switch code {
case codes.DeadlineExceeded:
fallthrough
@ -526,6 +499,7 @@ func toErr(ctx context.Context, err error) error {
err = ctx.Err()
}
case codes.Unavailable:
err = ErrNoAvailableEndpoints
case codes.FailedPrecondition:
err = grpc.ErrClientConnClosing
}

View File

@ -22,8 +22,8 @@ import (
"github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
"github.com/coreos/etcd/pkg/testutil"
"golang.org/x/net/context"
"google.golang.org/grpc"
)
func TestDialCancel(t *testing.T) {
@ -45,7 +45,7 @@ func TestDialCancel(t *testing.T) {
t.Fatal(err)
}
// connect to ipv4 black hole so dial blocks
// connect to ipv4 blackhole so dial blocks
c.SetEndpoints("http://254.0.0.1:12345")
// issue Get to force redial attempts
@ -97,7 +97,7 @@ func TestDialTimeout(t *testing.T) {
for i, cfg := range testCfgs {
donec := make(chan error)
go func() {
// without timeout, dial continues forever on ipv4 black hole
// without timeout, dial continues forever on ipv4 blackhole
c, err := New(cfg)
if c != nil || err == nil {
t.Errorf("#%d: new client should fail", i)
@ -117,8 +117,8 @@ func TestDialTimeout(t *testing.T) {
case <-time.After(5 * time.Second):
t.Errorf("#%d: failed to timeout dial on time", i)
case err := <-donec:
if err != context.DeadlineExceeded {
t.Errorf("#%d: unexpected error %v, want %v", i, err, context.DeadlineExceeded)
if err != grpc.ErrClientConnTimeout {
t.Errorf("#%d: unexpected error %v, want %v", i, err, grpc.ErrClientConnTimeout)
}
}
}

View File

@ -15,12 +15,11 @@
package clientv3util_test
import (
"context"
"log"
"github.com/coreos/etcd/clientv3"
"github.com/coreos/etcd/clientv3/clientv3util"
"golang.org/x/net/context"
)
func ExampleKeyExists_put() {
@ -34,7 +33,7 @@ func ExampleKeyExists_put() {
kvc := clientv3.NewKV(cli)
// perform a put only if key is missing
// It is useful to do the check atomically to avoid overwriting
// It is useful to do the check (transactionally) to avoid overwriting
// the existing key which would generate potentially unwanted events,
// unless of course you wanted to do an overwrite no matter what.
_, err = kvc.Txn(context.Background()).

View File

@ -16,8 +16,8 @@ package clientv3
import (
pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
"golang.org/x/net/context"
"google.golang.org/grpc"
)
type (
@ -74,19 +74,27 @@ func (c *cluster) MemberRemove(ctx context.Context, id uint64) (*MemberRemoveRes
func (c *cluster) MemberUpdate(ctx context.Context, id uint64, peerAddrs []string) (*MemberUpdateResponse, error) {
// it is safe to retry on update.
r := &pb.MemberUpdateRequest{ID: id, PeerURLs: peerAddrs}
resp, err := c.remote.MemberUpdate(ctx, r)
if err == nil {
return (*MemberUpdateResponse)(resp), nil
for {
r := &pb.MemberUpdateRequest{ID: id, PeerURLs: peerAddrs}
resp, err := c.remote.MemberUpdate(ctx, r, grpc.FailFast(false))
if err == nil {
return (*MemberUpdateResponse)(resp), nil
}
if isHaltErr(ctx, err) {
return nil, toErr(ctx, err)
}
}
return nil, toErr(ctx, err)
}
func (c *cluster) MemberList(ctx context.Context) (*MemberListResponse, error) {
// it is safe to retry on list.
resp, err := c.remote.MemberList(ctx, &pb.MemberListRequest{})
if err == nil {
return (*MemberListResponse)(resp), nil
for {
resp, err := c.remote.MemberList(ctx, &pb.MemberListRequest{}, grpc.FailFast(false))
if err == nil {
return (*MemberListResponse)(resp), nil
}
if isHaltErr(ctx, err) {
return nil, toErr(ctx, err)
}
}
return nil, toErr(ctx, err)
}
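A short sketch (not part of the diff) of the retry-safe read path described by the "safe to retry on list" comment above; the endpoint is a placeholder and MemberList is reached through the Cluster API embedded in the client.

package main

import (
	"fmt"
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
	"golang.org/x/net/context"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// MemberList is read-only, so the loop above can retry it on
	// non-halting errors without risking a duplicated membership change.
	resp, err := cli.MemberList(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	for _, m := range resp.Members {
		fmt.Printf("member: %s peers: %v\n", m.Name, m.PeerURLs)
	}
}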

View File

@ -44,8 +44,10 @@ func (op CompactOp) toRequest() *pb.CompactionRequest {
return &pb.CompactionRequest{Revision: op.revision, Physical: op.physical}
}
// WithCompactPhysical makes Compact wait until all compacted entries are
// removed from the etcd server's storage.
// WithCompactPhysical makes compact RPC call wait until
// the compaction is physically applied to the local database
// such that compacted entries are totally removed from the
// backend database.
func WithCompactPhysical() CompactOption {
return func(op *CompactOp) { op.physical = true }
}
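A minimal usage sketch (not part of the diff) showing where WithCompactPhysical fits; the endpoint and revision are placeholders.

package main

import (
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
	"golang.org/x/net/context"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Compact up to revision 100 and block until the compaction is
	// physically applied to the backend, per the option described above.
	if _, err := cli.Compact(context.Background(), 100, clientv3.WithCompactPhysical()); err != nil {
		log.Fatal(err)
	}
}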

View File

@ -99,7 +99,6 @@ func (cmp *Cmp) ValueBytes() []byte {
// WithValueBytes sets the byte slice for the comparison's value.
func (cmp *Cmp) WithValueBytes(v []byte) { cmp.TargetUnion.(*pb.Compare_Value).Value = v }
// mustInt64 panics if val isn't an int or int64. It returns an int64 otherwise.
func mustInt64(val interface{}) int64 {
if v, ok := val.(int64); ok {
return v
@ -109,12 +108,3 @@ func mustInt64(val interface{}) int64 {
}
panic("bad value")
}
// mustInt64orLeaseID panics if val isn't a LeaseID, int or int64. It returns an
// int64 otherwise.
func mustInt64orLeaseID(val interface{}) int64 {
if v, ok := val.(LeaseID); ok {
return int64(v)
}
return mustInt64(val)
}

View File

@ -21,7 +21,6 @@ import (
v3 "github.com/coreos/etcd/clientv3"
pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
"github.com/coreos/etcd/mvcc/mvccpb"
"golang.org/x/net/context"
)
@ -186,12 +185,12 @@ func (e *Election) observe(ctx context.Context, ch chan<- v3.GetResponse) {
cancel()
return
}
// only accept puts; a delete will make observe() spin
// only accept PUTs; a DELETE will make observe() spin
for _, ev := range wr.Events {
if ev.Type == mvccpb.PUT {
hdr, kv = &wr.Header, ev.Kv
// may have multiple revs; hdr.rev = the last rev
// set to kv's rev in case batch has multiple Puts
// set to kv's rev in case batch has multiple PUTs
hdr.Revision = kv.ModRevision
break
}
@ -214,7 +213,6 @@ func (e *Election) observe(ctx context.Context, ch chan<- v3.GetResponse) {
for !keyDeleted {
wr, ok := <-wch
if !ok {
cancel()
return
}
for _, ev := range wr.Events {
@ -227,7 +225,6 @@ func (e *Election) observe(ctx context.Context, ch chan<- v3.GetResponse) {
select {
case ch <- *resp:
case <-cctx.Done():
cancel()
return
}
}
@ -243,4 +240,4 @@ func (e *Election) Key() string { return e.leaderKey }
func (e *Election) Rev() int64 { return e.leaderRev }
// Header is the response header from the last successful election proposal.
func (e *Election) Header() *pb.ResponseHeader { return e.hdr }
func (m *Election) Header() *pb.ResponseHeader { return m.hdr }

View File

@ -20,7 +20,6 @@ import (
v3 "github.com/coreos/etcd/clientv3"
pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
"github.com/coreos/etcd/mvcc/mvccpb"
"golang.org/x/net/context"
)

View File

@ -20,7 +20,6 @@ import (
v3 "github.com/coreos/etcd/clientv3"
pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
"golang.org/x/net/context"
)

View File

@ -18,7 +18,6 @@ import (
"time"
v3 "github.com/coreos/etcd/clientv3"
"golang.org/x/net/context"
)
@ -54,7 +53,6 @@ func NewSession(client *v3.Client, opts ...SessionOption) (*Session, error) {
ctx, cancel := context.WithCancel(ops.ctx)
keepAlive, err := client.KeepAlive(ctx, id)
if err != nil || keepAlive == nil {
cancel()
return nil, err
}

View File

@ -18,7 +18,6 @@ import (
"math"
v3 "github.com/coreos/etcd/clientv3"
"golang.org/x/net/context"
)
@ -47,7 +46,7 @@ const (
// SerializableSnapshot provides serializable isolation and also checks
// for write conflicts.
SerializableSnapshot Isolation = iota
// Serializable reads within the same transaction attempt return data
// Serializable reads within the same transactiona attempt return data
// from the revision of the first read.
Serializable
// RepeatableReads reads within the same transaction attempt always
@ -86,7 +85,7 @@ func WithPrefetch(keys ...string) stmOption {
return func(so *stmOptions) { so.prefetch = append(so.prefetch, keys...) }
}
// NewSTM initiates a new STM instance, using serializable snapshot isolation by default.
// NewSTM initiates a new STM instance, using snapshot isolation by default.
func NewSTM(c *v3.Client, apply func(STM) error, so ...stmOption) (*v3.TxnResponse, error) {
opts := &stmOptions{ctx: c.Ctx()}
for _, f := range so {
@ -194,12 +193,11 @@ func (rs readSet) add(keys []string, txnresp *v3.TxnResponse) {
}
}
// first returns the store revision from the first fetch
func (rs readSet) first() int64 {
ret := int64(math.MaxInt64 - 1)
for _, resp := range rs {
if rev := resp.Header.Revision; rev < ret {
ret = rev
if len(resp.Kvs) > 0 && resp.Kvs[0].ModRevision < ret {
ret = resp.Kvs[0].ModRevision
}
}
return ret
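For reference, a brief sketch (not part of the diff) of driving this STM layer through NewSTM with the WithPrefetch option shown above; the endpoint and key names are placeholders.

package main

import (
	"log"
	"time"

	v3 "github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/clientv3/concurrency"
)

func main() {
	cli, err := v3.New(v3.Config{
		Endpoints:   []string{"localhost:2379"}, // placeholder endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Atomically copy "foo" into "bar"; the apply function is retried
	// if another writer invalidates the read set before commit.
	apply := func(stm concurrency.STM) error {
		stm.Put("bar", stm.Get("foo"))
		return nil
	}
	if _, err := concurrency.NewSTM(cli, apply, concurrency.WithPrefetch("foo")); err != nil {
		log.Fatal(err)
	}
}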

View File

@ -33,18 +33,10 @@ type Config struct {
// DialTimeout is the timeout for failing to establish a connection.
DialTimeout time.Duration `json:"dial-timeout"`
// DialKeepAliveTime is the time in seconds after which client pings the server to see if
// transport is alive.
DialKeepAliveTime time.Duration `json:"dial-keep-alive-time"`
// DialKeepAliveTimeout is the time in seconds that the client waits for a response for the
// keep-alive probe. If the response is not received in this time, the connection is closed.
DialKeepAliveTimeout time.Duration `json:"dial-keep-alive-timeout"`
// TLS holds the client secure credentials, if any.
TLS *tls.Config
// Username is a user name for authentication.
// Username is a username for authentication.
Username string `json:"username"`
// Password is a password for authentication.
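A construction sketch (not part of the diff), assuming a build of clientv3 in which the keep-alive fields shown above are present; the endpoint and durations are placeholders.

package main

import (
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:            []string{"localhost:2379"}, // placeholder endpoint
		DialTimeout:          5 * time.Second,
		DialKeepAliveTime:    30 * time.Second, // ping the server every 30s
		DialKeepAliveTimeout: 10 * time.Second, // close the connection if no pong within 10s
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()
}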

View File

@ -28,7 +28,7 @@
// Make sure to close the client after using it. If the client is not closed, the
// connection will have leaky goroutines.
//
// To specify a client request timeout, wrap the context with context.WithTimeout:
// To specify client request timeout, pass context.WithTimeout to APIs:
//
// ctx, cancel := context.WithTimeout(context.Background(), timeout)
// resp, err := kvc.Put(ctx, "sample_key", "sample_value")

View File

@ -0,0 +1,113 @@
// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package clientv3_test
import (
"fmt"
"log"
"github.com/coreos/etcd/clientv3"
"golang.org/x/net/context"
)
func ExampleAuth() {
cli, err := clientv3.New(clientv3.Config{
Endpoints: endpoints,
DialTimeout: dialTimeout,
})
if err != nil {
log.Fatal(err)
}
defer cli.Close()
if _, err = cli.RoleAdd(context.TODO(), "root"); err != nil {
log.Fatal(err)
}
if _, err = cli.UserAdd(context.TODO(), "root", "123"); err != nil {
log.Fatal(err)
}
if _, err = cli.UserGrantRole(context.TODO(), "root", "root"); err != nil {
log.Fatal(err)
}
if _, err = cli.RoleAdd(context.TODO(), "r"); err != nil {
log.Fatal(err)
}
if _, err = cli.RoleGrantPermission(
context.TODO(),
"r", // role name
"foo", // key
"zoo", // range end
clientv3.PermissionType(clientv3.PermReadWrite),
); err != nil {
log.Fatal(err)
}
if _, err = cli.UserAdd(context.TODO(), "u", "123"); err != nil {
log.Fatal(err)
}
if _, err = cli.UserGrantRole(context.TODO(), "u", "r"); err != nil {
log.Fatal(err)
}
if _, err = cli.AuthEnable(context.TODO()); err != nil {
log.Fatal(err)
}
cliAuth, err := clientv3.New(clientv3.Config{
Endpoints: endpoints,
DialTimeout: dialTimeout,
Username: "u",
Password: "123",
})
if err != nil {
log.Fatal(err)
}
defer cliAuth.Close()
if _, err = cliAuth.Put(context.TODO(), "foo1", "bar"); err != nil {
log.Fatal(err)
}
_, err = cliAuth.Txn(context.TODO()).
If(clientv3.Compare(clientv3.Value("zoo1"), ">", "abc")).
Then(clientv3.OpPut("zoo1", "XYZ")).
Else(clientv3.OpPut("zoo1", "ABC")).
Commit()
fmt.Println(err)
// now check the permission with the root account
rootCli, err := clientv3.New(clientv3.Config{
Endpoints: endpoints,
DialTimeout: dialTimeout,
Username: "root",
Password: "123",
})
if err != nil {
log.Fatal(err)
}
defer rootCli.Close()
resp, err := rootCli.RoleGet(context.TODO(), "r")
if err != nil {
log.Fatal(err)
}
fmt.Printf("user u permission: key %q, range end %q\n", resp.Perm[0].Key, resp.Perm[0].RangeEnd)
if _, err = rootCli.AuthDisable(context.TODO()); err != nil {
log.Fatal(err)
}
// Output: etcdserver: permission denied
// user u permission: key "foo", range end "zoo"
}

View File

@ -19,7 +19,6 @@ import (
"log"
"github.com/coreos/etcd/clientv3"
"golang.org/x/net/context"
)

View File

@ -20,7 +20,6 @@ import (
"github.com/coreos/etcd/clientv3"
"github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
"golang.org/x/net/context"
)
@ -237,11 +236,8 @@ func ExampleKV_txn() {
ctx, cancel := context.WithTimeout(context.Background(), requestTimeout)
_, err = kvc.Txn(ctx).
// txn value comparisons are lexical
If(clientv3.Compare(clientv3.Value("key"), ">", "abc")).
// the "Then" runs, since "xyz" > "abc"
Then(clientv3.OpPut("key", "XYZ")).
// the "Else" does not run
If(clientv3.Compare(clientv3.Value("key"), ">", "abc")). // txn value comparisons are lexical
Then(clientv3.OpPut("key", "XYZ")). // this runs, since 'xyz' > 'abc'
Else(clientv3.OpPut("key", "ABC")).
Commit()
cancel()

View File

@ -19,7 +19,6 @@ import (
"log"
"github.com/coreos/etcd/clientv3"
"golang.org/x/net/context"
)

View File

@ -18,8 +18,9 @@ import (
"fmt"
"log"
"github.com/coreos/etcd/clientv3"
"golang.org/x/net/context"
"github.com/coreos/etcd/clientv3"
)
func ExampleMaintenance_status() {
@ -33,15 +34,20 @@ func ExampleMaintenance_status() {
}
defer cli.Close()
resp, err := cli.Status(context.Background(), ep)
// resp, err := cli.Status(context.Background(), ep)
//
// or
//
mapi := clientv3.NewMaintenance(cli)
resp, err := mapi.Status(context.Background(), ep)
if err != nil {
log.Fatal(err)
}
fmt.Printf("endpoint: %s / Leader: %v\n", ep, resp.Header.MemberId == resp.Leader)
fmt.Printf("endpoint: %s / IsLeader: %v\n", ep, resp.Header.MemberId == resp.Leader)
}
// endpoint: localhost:2379 / Leader: false
// endpoint: localhost:22379 / Leader: false
// endpoint: localhost:32379 / Leader: true
// endpoint: localhost:2379 / IsLeader: false
// endpoint: localhost:22379 / IsLeader: false
// endpoint: localhost:32379 / IsLeader: true
}
func ExampleMaintenance_defragment() {

View File

@ -43,10 +43,10 @@ func ExampleClient_metrics() {
}
defer cli.Close()
// get a key so it shows up in the metrics as a range RPC
// get a key so it shows up in the metrics as a range rpc
cli.Get(context.TODO(), "test_key")
// listen for all Prometheus metrics
// listen for all prometheus metrics
ln, err := net.Listen("tcp", ":0")
if err != nil {
log.Fatal(err)
@ -61,7 +61,7 @@ func ExampleClient_metrics() {
<-donec
}()
// make an http request to fetch all Prometheus metrics
// make an http request to fetch all prometheus metrics
url := "http://" + ln.Addr().String() + "/metrics"
resp, err := http.Get(url)
if err != nil {
@ -80,6 +80,5 @@ func ExampleClient_metrics() {
break
}
}
// Output:
// grpc_client_started_total{grpc_method="Range",grpc_service="etcdserverpb.KV",grpc_type="unary"} 1
// Output: grpc_client_started_total{grpc_method="Range",grpc_service="etcdserverpb.KV",grpc_type="unary"} 1
}

View File

@ -16,14 +16,12 @@ package clientv3_test
import (
"log"
"os"
"time"
"github.com/coreos/etcd/clientv3"
"github.com/coreos/etcd/pkg/transport"
"github.com/coreos/pkg/capnslog"
"golang.org/x/net/context"
"google.golang.org/grpc/grpclog"
)
var (
@ -33,7 +31,8 @@ var (
)
func Example() {
clientv3.SetLogger(grpclog.NewLoggerV2(os.Stderr, os.Stderr, os.Stderr))
var plog = capnslog.NewPackageLogger("github.com/coreos/etcd", "clientv3")
clientv3.SetLogger(plog)
cli, err := clientv3.New(clientv3.Config{
Endpoints: endpoints,

View File

@ -19,7 +19,6 @@ import (
"log"
"github.com/coreos/etcd/clientv3"
"golang.org/x/net/context"
)

Some files were not shown because too many files have changed in this diff.