Compare commits

3976 Commits

Author SHA1 Message Date
66722b1ada version: bump up to 3.2.0
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-09 10:59:09 -07:00
963339d265 rafthttp: permit very large v2 snapshots
v2 snapshots were hitting the 512MB message decode limit, causing
sending snapshots to new members to fail for being too big.
2017-06-09 10:49:51 -07:00
c87594f27c etcdserver: use same ReadView for read-only txns
A read-only txn isn't serialized by raft, but it uses a fresh
read txn for every mvcc access prior to executing its request ops.
If a write txn modifies the keys matching the read txn's comparisons,
the read txn may return inconsistent results.

To fix, use the same read-only mvcc txn for the duration of the etcd
txn. Probably gets a modest txn speedup as well since there are
fewer read txn allocations.
2017-06-09 09:50:43 -07:00
e72ad5dd2a mvcc: create TxnWrites from TxnRead with NewReadOnlyTxnWrite
Already used internally by mvcc, but needed by etcdserver txns.
2017-06-09 09:50:37 -07:00
3eb5d24cab integration: test txn comparison and concurrent put ordering 2017-06-09 09:50:30 -07:00
8b9041a938 Documentation/op-guide: do not use host network, fix indentation
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-09 09:14:21 -07:00
864ffec88c v2http: put back /v2/machines and mark as non-deprecated
This reverts commit 2bb33181b6. python-etcd
seems to depend on /v2/machines and the maintainer vanished. Plus, it is
prefixed with /v2/ so it probably can't be deprecated anyway.
2017-06-08 12:05:59 -07:00
12bc2bba36 etcdserver: add leaseExpired debugging metrics
Fix https://github.com/coreos/etcd/issues/8050.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-08 11:23:12 -07:00
3a43afce5a Documentation/op-guide: fix 'grpc_code' field in metrics
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-08 10:16:07 -07:00
0e56ea37e7 fileutil: return immediately if preallocating 0 bytes
fallocate will return EINVAL, causing zeroing to the end of a
0 byte file to fail.

Fixes #8045
2017-06-07 12:59:35 -07:00
743192aa3b *: clear rarer shellcheck errors on scripts
Clean up the tail of the warnings
2017-06-06 10:44:59 -07:00
e8b156578f travis: add shellcheck 2017-06-06 10:44:53 -07:00
61f3338ce7 test: shellcheck 2017-06-06 10:44:46 -07:00
effffdbdca test, osutil: disable setting SIG_DFL on linux if built with cov tag
Was causing etcd to terminate before finishing writing its
coverage profile.
2017-06-06 09:47:22 -07:00
9bac803bee Documentation/op-guide: fix typo in grafana.json
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-06 09:47:15 -07:00
9169ad0d7d *: fix go tool vet -all -shadow errors 2017-06-06 09:47:06 -07:00
482a7839d9 test: speedup and strengthen go vet checking
Was iterating over every file, reloading everything. Instead,
analyze the package directories. On my machine, the time for
vet checking goes from 34s to 3s. Scans more code too.
2017-06-06 09:46:54 -07:00
ba3058ca79 op-guide: document CN certs in security.md 2017-06-06 09:46:47 -07:00
0e90e504f5 scripts, Documentation: fix swagger generation
Changes to the genproto to support splitting out the grpc-gateway broke
swagger generation.
2017-06-02 11:05:21 -07:00
998fa0de76 Documentation, scripts: regen RPC docs
Was missing the new cancel_reason field. Also includes updated protodoc
sha to fix generating documentation for upcoming txn compare range patchset.
2017-06-02 10:27:49 -07:00
c273735729 op-guide: document configuration flags for gateway 2017-06-01 15:59:49 -07:00
c85f736522 mvcc: time restore in restore benchmark
This never worked.
2017-06-01 14:59:31 -07:00
a375ff172e mvcc: chunk reads for restoring
Loading all keys at once would cause etcd to use twice as much
memory as it would need to serve the keys, causing RSS to spike on
boot. Instead, load the keys into the mvcc by chunk. Uses pipelining
for some concurrency.

Fixes #7822
2017-06-01 14:59:27 -07:00
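The chunked, pipelined restore described in the mvcc commit above might look something like the sketch below. It is a simplified model under stated assumptions, not the mvcc implementation: `restoreChunked`, `sliceSource`, and the `next`/`apply` callbacks are hypothetical names, and keys are plain strings. One goroutine reads the next chunk while the caller applies the previous one, so only a couple of chunks are held at a time.

```go
package main

import "fmt"

// restoreChunked pulls keys in chunks of at most limit (limit must be > 0)
// from next and hands each chunk to apply. A producer goroutine reads ahead
// by one chunk while the caller applies the previous one.
func restoreChunked(next func(limit int) []string, limit int, apply func([]string)) {
	chunks := make(chan []string, 1)
	go func() {
		defer close(chunks)
		for {
			c := next(limit)
			if len(c) == 0 {
				return // source exhausted
			}
			chunks <- c
		}
	}()
	for c := range chunks {
		apply(c)
	}
}

// sliceSource is a hypothetical in-memory stand-in for a backend range read.
func sliceSource(keys []string) func(int) []string {
	pos := 0
	return func(limit int) []string {
		end := pos + limit
		if end > len(keys) {
			end = len(keys)
		}
		chunk := keys[pos:end]
		pos = end
		return chunk
	}
}

func main() {
	keys := []string{"a", "b", "c", "d", "e"}
	restoreChunked(sliceSource(keys), 2, func(chunk []string) {
		fmt.Println(chunk)
	})
}
```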
1893af9bbd integration: use unixs:// if client port configured for tls 2017-06-01 09:47:08 -07:00
b4c655677a clientv3: support unixs:// scheme
For using TLS without giving a TLSConfig to the client.
2017-06-01 09:47:03 -07:00
c2160adf1d clientv3/integration: test dialing to TLS without a TLS config times out
etcdctl was getting ctx errors from timing out trying to issue RPCs to
a TLS endpoint but without using TLS for transmission. Client should
immediately bail out with a timeout error.
2017-06-01 09:46:57 -07:00
5ada311416 clientv3: use Endpoints[0] to initialize grpc creds
Dialing out without specifying TLS creds but giving https uses some
default behavior that depends on passing an endpoint with https to
Dial(), so it's not enough to completely rely on the balancer to supply
endpoints.

Fixes #8008

Also ctx-izes grpc.Dial
2017-06-01 09:46:48 -07:00
f042cd7d9c vendor: ghodss/yaml v1.0.0 2017-05-30 14:44:30 -07:00
f0a400a3a8 vendor: kr/pty v1.0.0 2017-05-30 14:44:23 -07:00
6066977280 op-guide: update performance.md
It's been a year, time to refresh with 3.2.0 data.
2017-05-30 10:16:19 -07:00
fc88eccc74 vendor: use v0.2.0 of go-semver 2017-05-30 10:15:23 -07:00
5cb28a7d83 Documentation: add 'yaml.NewConfig' change in 3.2
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-30 10:14:55 -07:00
de57e88643 Documentation: add FAQ entry for "database space exceeded" errors
Also moves miscategorized cluster id mismatch entry from "performance"
to "operation".
2017-05-26 09:13:13 -07:00
967fc70173 Merge pull request #7983 from heyitsanthony/etcdctl-lock-exec
etcdctl: support exec on lock
2017-05-25 10:26:48 -07:00
4a8d32eaa6 Merge pull request #7984 from gyuho/3.2
*: bump up test Go runtime, etcd versions before 3.2 release
2017-05-24 17:20:48 -07:00
643c2a310d etcdctl: support exec on lock
The lock command is clumsy to use from the command line, needing mkfifo,
wait, etc. Instead, make like consul and support launching a command if
one is given.
2017-05-24 16:47:00 -07:00
c3a191b38d e2e: use version.Cluster for release test
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-24 15:20:18 -07:00
83efd2c745 ROADMAP: make 'release-3.2' stable branch
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-24 14:31:43 -07:00
307331cc31 test: release tests with v3.2+
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-24 14:31:30 -07:00
2abd22a13b travis: run tests with Go 1.8.3
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-24 14:28:33 -07:00
2a4db4307f Merge pull request #7982 from heyitsanthony/watch-latency-clients
benchmark: support multiple clients/conns in watch-latency benchmark
2017-05-24 13:23:07 -07:00
ebd6e8c4b1 benchmark: support multiple clients/conns in watch-latency benchmark 2017-05-24 11:31:43 -07:00
8c1ab62bc5 Merge pull request #7975 from raoofm/patch-11
doc: modify vonage usecase, adding kubernetes and vault
2017-05-24 10:40:47 -07:00
8d2b340629 Merge pull request #7966 from heyitsanthony/close-kv-err
etcdserver: close mvcc.KV on init error path
2017-05-23 12:59:20 -07:00
0b449a24bb Merge pull request #7956 from gyuho/container-linux
Documentation: add systemd, Container Linux guide
2017-05-23 12:38:37 -07:00
a1804390b1 doc: modify usecase
adding kubernetes and vault
2017-05-23 14:57:10 -04:00
8b290c680a Documentation: add systemd, Container Linux guide
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-23 11:27:27 -07:00
c1c9a2c96c etcdserver: close mvcc.KV on init error path
Scheduled compaction will panic if KV is not stopped before
closing the backend.
2017-05-23 10:41:37 -07:00
f75e333264 Merge pull request #7958 from heyitsanthony/perm-prefix
etcdctl: improve role --prefix flag
2017-05-22 12:19:16 -07:00
378bac79e1 Merge pull request #7963 from tlossen/patch-1
documentation: fixed typo
2017-05-22 08:29:25 -07:00
20a747ea09 Documentation/learning: fixed typo
(repeated word)
2017-05-22 17:26:34 +02:00
4cd5e7ebb2 Merge pull request #7809 from mitake/auth-watch
protect watch with auth
2017-05-20 13:23:30 +09:00
881903b6d3 e2e: add a new test case for protecting watch with auth 2017-05-20 11:34:45 +09:00
939912c425 clientv3, etcdserver: support auth in Watch() 2017-05-20 11:34:45 +09:00
cbd3807b30 Merge pull request #7959 from heyitsanthony/regen-protodoc
Documentation, scripts: regenerate protobuf docs with updated protodoc
2017-05-19 15:20:44 -07:00
10b1ba7886 Documentation, scripts: regenerate protobuf docs with updated protodoc 2017-05-19 14:57:16 -07:00
2f1467cb27 etcdctl: sync README with etcdctl role command, add prefix example, fix typo
Fixes #7951
2017-05-19 13:53:46 -07:00
bd680c3302 ctlv3: add --prefix support to role revoke-permission, cleanup role flag handling 2017-05-19 13:53:46 -07:00
fd7de051a4 version: bump up to 3.2.0-rc.1+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-19 12:39:23 -07:00
9d7ed0e63a version: bump up to 3.2.0-rc.1
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-19 11:46:15 -07:00
b82ef007f5 Merge pull request #7955 from gyuho/timeout
integration: bump up 'TestV3LeaseRequireLeader' timeout to 5-sec
2017-05-18 17:11:23 -07:00
29bbcdd110 integration: bump up 'TestV3LeaseRequireLeader' timeout to 5-sec
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-18 16:44:57 -07:00
0afc51c762 Merge pull request #7939 from gyuho/test
etcd-tester: add '-failpoints' to configure gofail
2017-05-18 12:53:07 -07:00
4a8fbb9d5d Merge pull request #7954 from gyuho/m
*: remove unused, fix typos
2017-05-18 12:36:24 -07:00
d690634bd6 *: remove unused, fix typos
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-18 12:11:18 -07:00
62b44a85f8 etcd-tester: add '-failpoints' to configure gofail
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-18 11:59:07 -07:00
e7d705b25f Merge pull request #7953 from gyuho/aaa
etcd-tester: use 'debugutil.PProfHandlers'
2017-05-18 11:26:40 -07:00
e1640cc72f etcd-tester: use 'debugutil.PProfHandlers'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-18 11:21:24 -07:00
a6a1eb8378 Merge pull request #7949 from heyitsanthony/godocs
*: fill out missing package godocs
2017-05-18 10:23:26 -07:00
33c375dc44 *: fill out blank package godocs
Mostly one-liner short descriptions, but also includes some typo fixes
and some examples.
2017-05-18 09:41:13 -07:00
1f2dcbb935 Merge pull request #7948 from heyitsanthony/remove-proxy-alpha
op-guide: remove alpha from grpc proxy
2017-05-18 09:31:34 -07:00
c6cf88ef7f op-guide: remove alpha from grpc proxy 2017-05-17 22:27:06 -07:00
4e84bd2e3c Merge pull request #7946 from heyitsanthony/report-weighted
report: add NewWeightedReport
2017-05-17 21:04:53 -07:00
c09f0ca9d4 report: add NewWeightedReport
Reports with weighted results.
2017-05-17 16:07:20 -07:00
218ee40f11 Merge pull request #7945 from xiang90/snapshot_error
etcdserver: more logging on snapshot close path
2017-05-17 15:36:53 -07:00
32c252f003 etcdserver: more logging on snapshot close path 2017-05-17 14:48:52 -07:00
f4641accc3 Merge pull request #7943 from heyitsanthony/tcpproxy-init-msg
tcpproxy: display endpoints, not pointers, in ready to proxy string
2017-05-17 12:20:46 -07:00
b7cda38653 Merge pull request #7935 from heyitsanthony/bridge-latency
bridge: add tx-delay and rx-delay
2017-05-17 11:07:22 -07:00
5bd9b9614f tcpproxy: display endpoints, not pointers, in ready to proxy string
The switch to *net.SRV for endpoints caused the ready string to emit
pointers instead of endpoint strings.

Fixes #7942
2017-05-17 10:51:35 -07:00
201fd70afc Merge pull request #7934 from heyitsanthony/bench-rpc-mutex
benchmark: add rpc mutexes to stm benchmark
2017-05-17 10:44:00 -07:00
1763f7d4d1 Merge pull request #7919 from gyuho/log-dir
functional-tester: use log-dir as data-dir in etcd-agent
2017-05-16 13:46:57 -07:00
271785cd55 Merge pull request #7937 from heyitsanthony/e2e-close-timeout
e2e: Stop() lock/elect etcdctl process if Close times out
2017-05-16 12:34:36 -07:00
8f0d4092c3 e2e: Stop() lock/elect etcdctl process if Close times out
Gets backtrace by sending SIGQUIT if Close hangs after sending a SIGINT.
2017-05-16 11:31:23 -07:00
c6219a209d Merge pull request #7933 from gyuho/travis
travis: test builds in other OSes
2017-05-15 22:25:52 -07:00
22db11f876 bridge: add tx-delay and rx-delay
Injects transmit and receive latencies.
2017-05-15 17:02:27 -07:00
d826f95c77 travis: test builds in other OSes
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-15 16:55:27 -07:00
b6e4858a25 benchmark: add rate limiting to stm 2017-05-15 15:42:54 -07:00
6526097bfc benchmark: add rpc locks to stm benchmark 2017-05-15 15:42:26 -07:00
3e7feb4033 Merge pull request #7931 from gyuho/aaa
pkg/osutil: fix missing 'syscall' import
2017-05-15 14:47:46 -07:00
fba225cee5 pkg/osutil: fix missing 'syscall' import
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-15 14:11:54 -07:00
95078c296d Merge pull request #7932 from gyuho/vet
*: remove unnecessary fmt.Sprint
2017-05-15 14:01:23 -07:00
e15020055e *: remove unnecessary fmt.Sprint
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-15 13:23:31 -07:00
74fd7709ad Merge pull request #7904 from heyitsanthony/osutil-exit
osutil: force SIG_DFL before resending terminating signal
2017-05-15 12:14:37 -07:00
31e3899663 Merge pull request #7925 from heyitsanthony/fix-windows-mmap
backend: force initial mmap size to 0 for windows
2017-05-13 21:42:58 -07:00
8516d8ccc5 backend: force initial mmap size to 0 for windows
boltdb on windows allocates a file with the full mmap size even if the
db is empty. Force the initial mmap size to 0 so there's no huge initial
db file on windows.

Fixes #7910
2017-05-12 14:34:07 -07:00
6ce9aed8c5 Merge pull request #7881 from heyitsanthony/testctl-logging
e2e: more debugging output for lock and elect tests
2017-05-12 12:01:08 -07:00
7a1739a3e8 osutil: force SIG_DFL before resending terminating signal
The go runtime won't always reinstall the default signal handler on the
SIGTERM path, so it's possible the signal won't terminate the process.
Instead, force SIG_DFL for the signal.
2017-05-12 11:56:27 -07:00
5b4677b7d7 integration: reset default logging level in TestRestartRemoved 2017-05-12 10:22:29 -07:00
b9f5a00b13 e2e: more debugging output for lock and elect etcdctl tests
Meant to debug #6464 and #6934

Dumps the output from the etcd/etcdctl servers and SIGQUITs to get a
golang backtrace in case of a hung process.
2017-05-12 10:22:29 -07:00
90893735cf Merge pull request #7917 from heyitsanthony/refactor-backend-paths
snap, etcdserver: tighten up snapshot path handling
2017-05-12 09:33:37 -07:00
2e3d27e910 functional-tester: use log-dir as data-dir in etcd-agent
Persistent data should be configured on the agent side.
There is no need to specify the data-dir on the tester side.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-12 08:30:46 -07:00
f337754e72 Merge pull request #7914 from fanminshi/doc_snap_warning
*: faq for snapshot warning and dynamically determining snapshotWarningTimeout
2017-05-11 16:48:12 -07:00
aa58aff18c Merge pull request #7918 from gyuho/archive-path
etcd-agent: store failure_archive in log dir
2017-05-11 16:34:43 -07:00
0bcab05465 etcd-agent: store failure_archive in log dir
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-11 16:30:04 -07:00
71d7c85b6b expect: reload DEBUG_EXPECT for each process
Lets e2e test cases selectively turn on expect debugging to get
full application output written to stdout.
2017-05-11 16:09:31 -07:00
16e92d1379 faq: explains "snapshotting is taking more..." warning 2017-05-11 15:25:44 -07:00
8468b38631 backend: dynamically set snapshotWarningTimeout based on db size 2017-05-11 15:25:35 -07:00
7a65cb5847 Merge pull request #7916 from heyitsanthony/snip-extra-doc
clientv3: remove duplicate documentation for Do()
2017-05-11 14:45:35 -07:00
f6cd4d4f5b snap, etcdserver: tighten up snapshot path handling
Computing the snapshot file path is error prone; snapshot recovery was
constructing file paths missing a path separator so the snapshot
would never be loaded. Instead, refactor the backend path handling
to use helper functions where possible.
2017-05-11 13:46:59 -07:00
63c7e9f840 clientv3: remove duplicate documentation for Do() 2017-05-11 13:25:26 -07:00
f63eb2f6a4 Merge pull request #7913 from gyuho/srv
pkg/srv: fix error checks from resolveTCPAddr
2017-05-11 12:12:01 -07:00
3505c254e1 pkg/srv: fix error checks from resolveTCPAddr
So that 'terr' can be returned later.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-11 10:53:03 -07:00
386374a6d0 Merge pull request #7908 from heyitsanthony/concurrency-proxy
grpcproxy: forward v3lock and v3election requests
2017-05-10 16:41:06 -07:00
066062a5e0 Merge pull request #7902 from fanminshi/fix_runner
etcd-runner: remove mutex on validate() and release() in global.go
2017-05-10 13:12:09 -07:00
00da3ca725 integration: add lock and election services to proxy tests 2017-05-10 13:06:27 -07:00
713e006bc6 adapter: adapters for lock and election services 2017-05-10 12:51:05 -07:00
fd01db9e60 grpcproxy, etcdmain: add lock and election services to proxy 2017-05-10 12:19:09 -07:00
b44bd6d2a9 etcd-runner: fix race on nextc 2017-05-10 11:21:17 -07:00
47f5b7c3ad Merge pull request #7876 from fanminshi/fix_7628
etcdserver: renaming db happens after snapshot persists to wal and snap files
2017-05-09 16:15:41 -07:00
87d99fe038 etcd-runner: remove mutex on validate() and release() in global.go
The election runner can deadlock in atomic release().

Suppose the election runner has two clients, A and B.
If A is the leader and B is a follower, B obtains the lock
for release() and waits for A to close(nextc), which signals that
the next round is ready. However, A can only close(nextc) if it
obtains the lock for release(); hence the deadlock.

This PR removes the atomicity of validate() and release() in global.go
and gives the responsibility of locking to each runner.

FIXES #7891
2017-05-09 15:38:13 -07:00
dfdaf082c5 etcdserver: add a test to ensure renaming db happens after persisting wal and snap files 2017-05-09 14:00:22 -07:00
8b7b7222dd etcdserver: renaming db happens after snapshot persists to wal and snap files
If a follower receives a snapshot from the leader
and crashes before renaming xxx.snap.db to db but after the
snapshot has persisted to .wal and .snap, restarting the
follower results in loading the old db, new .wal, and new .snap.
This causes an index mismatch between the snap metadata index
and the consistent index from the db.

This PR forces an ordering where saving/renaming the db must
happen after the snapshot is persisted to the wal and snap files.
This guarantees the wal and snap files are newer than the db.
On server restart, the etcd server checks if snap index > db consistent index.
If yes, the etcd server attempts to load xxx.snap.db, where xxx=snap index,
if there is any, and panics otherwise.

FIXES #7628
2017-05-09 14:00:12 -07:00
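The restart check described in the commit above can be modeled as a small decision function. This is a sketch under assumptions rather than the etcdserver code: `dbToLoad` and `snapDBExists` are hypothetical names, the zero-padded hex file name is an assumed format for "xxx.snap.db", and the sketch returns an error where the commit message says the server panics.

```go
package main

import "fmt"

// dbToLoad decides which database file a restarting server should open.
// snapDBExists is an assumed callback that reports whether a file is present.
func dbToLoad(snapIndex, dbConsistentIndex uint64, snapDBExists func(name string) bool) (string, error) {
	if snapIndex <= dbConsistentIndex {
		// The db is at least as new as the snapshot; keep using it.
		return "db", nil
	}
	// Assumed naming: the snapshot index as 16 hex digits plus ".snap.db".
	name := fmt.Sprintf("%016x.snap.db", snapIndex)
	if snapDBExists(name) {
		return name, nil
	}
	return "", fmt.Errorf("snap index %d > db consistent index %d and %s is missing",
		snapIndex, dbConsistentIndex, name)
}

func main() {
	present := func(string) bool { return true }
	absent := func(string) bool { return false }

	fmt.Println(dbToLoad(5, 10, absent))   // db is current
	fmt.Println(dbToLoad(10, 5, present))  // snapshot db must be loaded
	fmt.Println(dbToLoad(10, 5, absent))   // unrecoverable in this model
}
```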
a53a9e167e Merge pull request #7898 from yudai/nit_remove_dup
v3rpc: remove duplicated error case for lease.ErrLeaseNotFound
2017-05-09 12:35:31 -07:00
b8875515a4 Merge pull request #7890 from yudai/keep_ka_loop_running
clientv3: Do not stop keep alive loop on server side errors
2017-05-09 11:00:21 -07:00
01a985eda5 Merge pull request #7897 from gyuho/bom
scripts: add 'BOM' update script
2017-05-09 10:52:42 -07:00
010ffc0692 v3rpc: remove duplicated error case for lease.ErrLeaseNotFound 2017-05-08 20:09:41 -07:00
8c9f01ef53 scripts: add 'BOM' update script
Need this script when we add external dependencies.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-08 17:59:11 -07:00
aa85b0cea7 clientv3: Do not stop keep alive loop on server side errors 2017-05-08 15:47:34 -07:00
aac2292ab5 Merge pull request #7882 from heyitsanthony/srv-priority
gateway: DNS SRV priority
2017-05-08 14:17:04 -07:00
3a2e7653f2 Merge pull request #7879 from gyuho/http-server
embed: gracefully close peer handler
2017-05-08 14:00:45 -07:00
c232814003 etcdmain, tcpproxy: srv-priority policy
Adds DNS SRV weighting and priorities to gateway.

Partially addresses #4378
2017-05-08 11:35:18 -07:00
2655540481 Merge pull request #7892 from fanminshi/add_snashot_duration_metric
backend: add prometheus metric for large snapshot duration.
2017-05-08 11:22:51 -07:00
25eef5a6e4 Merge pull request #7893 from philips/readme-tagline
README: use the same tagline from github
2017-05-08 09:11:08 -07:00
7d21d6c894 embed: gracefully close peer handlers on shutdown
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-06 07:47:23 -07:00
af7d051019 Merge pull request #7885 from luedigernet/fix-TestEvent
Fix watch_test.go TestEvent
2017-05-05 23:31:59 -07:00
90af2ff302 README: use the same tagline from github
Just be consistent with the messaging and use of etcd
2017-05-05 18:07:26 -07:00
230106dd3c backend: add prometheus metric for large snapshot duration.
FIXES #7878
2017-05-05 17:27:33 -07:00
8b081ce9b3 clientv3: check IsModify
Fix watch_test.go TestEvent

Prior to this fix, the isModify case of the table-driven test was never checked.
2017-05-05 19:39:59 +02:00
07ad18178d pkg/srv: package for SRV utilities
Trying to decouple the v2 client from SRV code. Can't move
into discovery/ since that creates a circular dependency. So,
give up and move all the SRV code into a new package.
2017-05-05 09:27:59 -07:00
db6f45e939 Merge pull request #7830 from aaronlehmann/new-nodes-start-active
raft: Set the RecentActive flag for newly added nodes
2017-05-05 08:59:25 -07:00
1f8de1aab0 Merge pull request #7877 from fanminshi/warning_on_snapshotting
backend: print snapshotting duration warning every 30s
2017-05-04 18:03:47 -07:00
f7f30f2361 backend: print snapshotting duration warning every 30s
FIXES #7870
2017-05-04 16:41:03 -07:00
9451fa1f9c raft: Add unit test TestAddNodeCheckQuorum
This test verifies that adding a node does not cause the leader to step
down until at least one full ElectionTick cycle elapses.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-05-04 15:04:30 -07:00
c3b96f8a69 Merge pull request #7875 from yudai/compact_every_time
compactor: Make periodic compactor run every hour
2017-05-04 13:24:27 -07:00
60dbad5a85 compactor: Make periodic compactor run every hour
Closes #7868.
2017-05-04 10:32:51 -07:00
505bf8c708 Merge pull request #7864 from gyuho/doc-link-fixes
*: run 'marker' in CI
2017-05-04 09:14:06 -07:00
2e32d2142d Merge pull request #7869 from heyitsanthony/fix-lease-require-leader-test
clientv3/integration: drain keepalives before waiting for leader loss
2017-05-04 08:29:16 -07:00
282c6fd17d Documentation: remove '[]' from '[DEPRECATED]'
To make 'marker' pass the tests

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-04 08:26:01 -07:00
c2959c998f test: run 'marker' to find broken links
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-04 08:26:00 -07:00
e9a63473a0 scripts,travis: install 'marker' for CI tests
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-04 08:26:00 -07:00
7f05e220a4 Merge pull request #7874 from gyuho/scripts
integration/fixtures-expired: do not force 'rm'
2017-05-03 19:39:00 -07:00
4edbae4a91 integration/fixtures-expired: do not force 'rm'
To make gencerts.sh script safer.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 18:45:44 -07:00
3b251b0ed3 Merge pull request #7871 from gyuho/fix-doc-2
*: fix broken links in markdown
2017-05-03 16:58:38 -07:00
4203320d04 *: fix other broken links in markdown
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 16:57:44 -07:00
feb930e357 Documentation/v3: fix broken links
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 16:57:38 -07:00
e4e057f8f7 Documentation/v2: fix broken links
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 15:37:53 -07:00
9fee35b02d Merge pull request #7842 from heyitsanthony/fix-switch-race
clientv3: don't race on upc/downc/switch endpoints in balancer
2017-05-03 13:48:00 -07:00
f6d0dda187 clientv3/integration: drain keepalives before waiting for leader loss
500ms keepalive delay on proxy side causes client to sometimes send
a second keepalive since it waits more than 500ms for the first response.

Fixes #7658
2017-05-03 13:22:45 -07:00
8f40517adb integration: close proxy's lease client 2017-05-03 13:22:24 -07:00
61c5a0c6ae Merge pull request #7867 from gyuho/fix-tls-test
integration: clean up TLS reload tests, fix no-file while renaming
2017-05-03 12:43:41 -07:00
85fa594265 integration: clean up TLS reload tests, fix no-file while renaming
Fix https://github.com/coreos/etcd/issues/7865.

It is also possible to have a mismatched key file
while renaming directories.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 11:59:09 -07:00
c2d6a92b01 Merge pull request #7853 from gyuho/revert
Documentation/upgrades: revert KeepAlive interface change
2017-05-03 11:04:15 -07:00
24e85b2454 Merge pull request #7852 from heyitsanthony/revert-lease-err-ka
Revert "Merge pull request #7732 from heyitsanthony/lease-err-ka"
2017-05-03 11:03:17 -07:00
27b3bf230b Merge pull request #7863 from heyitsanthony/stm-apis
concurrency: provide old STM functions as deprecated
2017-05-03 10:19:13 -07:00
de2e959b27 Merge pull request #7856 from fanminshi/fix_consistent_index_update
etcdserver: apply() sets consistIndex for any entry type
2017-05-03 09:07:16 -07:00
31d5d610fc concurrency: provide old STM functions as deprecated
semver
2017-05-03 02:07:01 -07:00
e33b10a666 etcdserver: add a test to ensure config change also updates ConsistIndex 2017-05-02 16:51:40 -07:00
61abf25859 integration: close accepted connection on stopc path
Connection pausing added another exit condition in the listener
path, causing the bridge to leak connections instead of closing
them when signalled to close. Also adds some additional Close
paranoia.

Fixes #7823
2017-05-02 16:46:43 -07:00
43e5f892f6 clientv3: don't race on upc/downc/switch endpoints in balancer
If the balancer update notification loop starts with a downed
connection and endpoints are switched while the old connection is up,
the balancer can potentially wait forever for an up connection without
refreshing the connections to reflect the current endpoints.

Instead, fetch upc/downc together, only caring about a single transition
either from down->up or up->down for each iteration

Simple way to reproduce failures: add time.Sleep(time.Second) to the
beginning of the update notification loop.
2017-05-02 16:43:24 -07:00
5533c3058a etcdserver: apply() sets consistIndex for any entry type
Previously, apply() didn't set consistIndex for the EntryConfChange type.
This causes a misalignment between consistIndex and applied index,
where an EntryConfChange entry results in setting applied index but not consistIndex.

Suppose that addMember() is called and the leader reflects that change.
1. applied index and consistIndex are now misaligned.
2. a new follower node joins.
3. leader sends the snapshot to the follower,
	where the applied index is the snapshot metadata index.
4. follower node saves the snapshot and database (includes consistIndex) from leader.
5. restarting follower loads snapshot and database.
6. follower checks snapshot metadata index (same as applied index) and database consistIndex,
	finds that they don't match, and then panics.

FIXES #7834
2017-05-02 14:57:36 -07:00
72d2adca62 Merge pull request #7854 from gyuho/lease-retry
integration: ensure revoke completes before TimeToLive
2017-05-02 12:56:56 -07:00
01b6cdf13d integration: ensure revoke completes before TimeToLive
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-02 12:56:26 -07:00
24f0423088 Merge pull request #7855 from tessr/master
raft: add chain core to notable users list
2017-05-02 11:30:03 -07:00
3d504737e4 add chain core to raft users list 2017-05-02 11:23:25 -07:00
bb42ba5f4e Documentation/upgrades: revert KeepAlive interface change
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-02 09:45:06 -07:00
6dd8fb6f24 Revert "Merge pull request #7732 from heyitsanthony/lease-err-ka"
This reverts commit fbbc4a4979, reversing
changes made to f254e38385.

Fixes #7851
2017-05-02 09:36:16 -07:00
fdf445b5a0 Merge pull request #7848 from gyuho/close-grpcc
embed: fix blocking Close before gRPC server start
2017-05-01 18:44:20 -07:00
f065d8e258 Merge pull request #7845 from heyitsanthony/single-node-docker
Documentation: add documentation for single node docker etcd
2017-05-01 16:42:19 -07:00
b0e9d24fb6 embed: fix blocking Close before gRPC server start
If 'StartEtcd' returns before starting gRPC server
(e.g. mismatched snapshot, misconfiguration),
receiving from grpcServerC blocks forever. This patch
just closes the channel to not block on grpcServerC,
and proceeds to next stop operations in Close.

This was masking the issues in https://github.com/coreos/etcd/issues/7834

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-01 16:41:13 -07:00
b1720b779c Merge pull request #7846 from heyitsanthony/build-aci-annotate
scripts: annotate with acbuild with supports-systemd-notify
2017-05-01 16:04:03 -07:00
6c1ce697a6 scripts: annotate with acbuild with supports-systemd-notify
Fixes #7840
2017-05-01 12:59:08 -07:00
3f1f5e5215 Merge pull request #7844 from heyitsanthony/v2-docker-tag
Documentation/v2: pin docker guide to use latest 2.3.x
2017-05-01 12:54:03 -07:00
b8f08d400d Documentation: add documentation for single node docker etcd
Fixes #7843
2017-05-01 12:36:16 -07:00
066f9bf7e3 Documentation/v2: pin docker guide to use latest 2.3.x 2017-05-01 11:46:39 -07:00
f0ca65a95d version: bump up to 3.2.0-rc.0+git 2017-04-28 11:06:53 -07:00
7e6d876385 version: bump up to 3.2.0-rc.0 2017-04-28 10:09:39 -07:00
7239249155 Merge pull request #7837 from gyuho/tls-errors
integration: match more TLS errors for wrong certs
2017-04-28 10:08:34 -07:00
cfeab9324e integration: match more TLS errors for wrong certs
Fix https://github.com/coreos/etcd/issues/7835.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-28 10:03:29 -07:00
77fd369b1c Merge pull request #7832 from gyuho/doc-for-3.2
Documentation: add upgrade to 3.2 doc
2017-04-27 21:27:26 -07:00
cbd7ef4ee6 Documentation: add upgrade to 3.2 doc
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-27 14:39:42 -07:00
747993de08 Merge pull request #7829 from gyuho/certs
pkg/transport: reload TLS certificates for every client request
2017-04-27 14:36:53 -07:00
96d6f05391 Merge pull request #7831 from gyuho/cc
pkg/wait: add comment and make List private
2017-04-27 13:45:25 -07:00
22943e7e06 integration: test TLS reload
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-27 13:32:09 -07:00
d818ef2c76 pkg/wait: add comment and make List private 2017-04-27 13:25:02 -07:00
4e21f87e3d pkg/transport: reload TLS certificates for every client request
This changes the baseConfig used when creating tls Configs to utilize
the GetCertificate and GetClientCertificate functions to always reload
the certificates from disk whenever they are needed.

Always reloading the certificates allows changing the certificates via
an external process without interrupting etcd.

Fixes #7576

Cherry-picked by Gyu-Ho Lee <gyuhox@gmail.com>
Original commit can be found at https://github.com/coreos/etcd/pull/7784
2017-04-27 11:22:03 -07:00
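The reload-on-demand approach in the pkg/transport commit above relies on the `GetCertificate` and `GetClientCertificate` hooks of Go's `crypto/tls`. A minimal sketch of the idea follows. It is not the pkg/transport code: `reloadingTLSConfig` is a hypothetical helper, and the file paths in `main` are placeholders that are assumed not to exist.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// reloadingTLSConfig returns a tls.Config whose hooks re-read the key pair
// from disk each time a certificate is needed, so files replaced by an
// external process are picked up without a restart.
func reloadingTLSConfig(certFile, keyFile string) *tls.Config {
	load := func() (*tls.Certificate, error) {
		cert, err := tls.LoadX509KeyPair(certFile, keyFile)
		if err != nil {
			return nil, err
		}
		return &cert, nil
	}
	return &tls.Config{
		// Server side: called during each incoming handshake.
		GetCertificate: func(*tls.ClientHelloInfo) (*tls.Certificate, error) {
			return load()
		},
		// Client side: called when a server requests a client certificate.
		GetClientCertificate: func(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
			return load()
		},
	}
}

func main() {
	// Placeholder paths; with no such files the hook reports a load error.
	cfg := reloadingTLSConfig("missing-server.crt", "missing-server.key")
	_, err := cfg.GetCertificate(nil)
	fmt.Println("load failed:", err != nil)
}
```

Reading from disk on every handshake trades some per-connection cost for simplicity; a cache keyed on file modification time would be one alternative.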
52613b262b raft: Set the RecentActive flag for newly added nodes
I found that enabling the CheckQuorum flag led to spurious leader
elections when new nodes joined. It looks like in the time between a new
node joining the cluster, and that node first communicating with the
leader, the quorum check could fail because the new node looks inactive.
To solve this, set the RecentActive flag when nodes are first added.
This gives a grace period for the node to communicate before it causes
the quorum check to fail.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
2017-04-27 11:19:29 -07:00
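The grace period described in the raft commit above can be illustrated with a toy model of the quorum check. This is a simplified sketch, not the raft package: `cluster`, `peer`, `addNode`, and `checkQuorumActive` here are hypothetical, and the model assumes each check clears the activity flags so that a peer must be heard from again before the next check.

```go
package main

import "fmt"

// peer tracks whether the leader has heard from a node since the last check.
type peer struct{ recentActive bool }

// cluster is a toy view of the leader's bookkeeping; self is the leader's ID.
type cluster struct {
	self  uint64
	peers map[uint64]*peer
}

// addNode registers a new peer; startActive models the RecentActive fix.
func (c *cluster) addNode(id uint64, startActive bool) {
	c.peers[id] = &peer{recentActive: startActive}
}

// checkQuorumActive counts the leader plus recently active peers, then
// clears the flags so each peer must prove liveness again.
func (c *cluster) checkQuorumActive() bool {
	active := 0
	for id, p := range c.peers {
		if id == c.self {
			active++
			continue
		}
		if p.recentActive {
			active++
		}
		p.recentActive = false
	}
	return active >= len(c.peers)/2+1
}

func main() {
	c := &cluster{self: 1, peers: map[uint64]*peer{1: {}}}
	c.addNode(2, true) // new node starts as recently active

	fmt.Println("first check passes:", c.checkQuorumActive())
	fmt.Println("second check passes:", c.checkQuorumActive())
}
```

In this model a silent new node survives exactly one check, which corresponds to the grace period the commit message describes.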
c309d745a6 Merge pull request #7819 from heyitsanthony/fix-elect-compact
concurrency: use current revisions for election
2017-04-27 11:01:44 -07:00
2a3229c00a Merge pull request #7808 from heyitsanthony/auto-bom
CI BOM checking
2017-04-27 09:24:59 -07:00
3e7bd47cd5 travis: add bill-of-materials checking
Fixes #7780
2017-04-26 16:29:48 -07:00
2059c8e9e7 vendor: revendor speakeasy to include unix license file
updates BOM
2017-04-26 16:29:48 -07:00
b77de97136 test: bill of materials check pass 2017-04-26 16:29:47 -07:00
633a0a847b Merge pull request #7824 from gyuho/certs
*: test expired certs in client
2017-04-26 13:31:17 -07:00
f674a1b583 clientv3/integration: test client dial with expired certs
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-26 12:32:46 -07:00
7cb860a31b integration/fixtures: add expired certs
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-26 12:22:54 -07:00
d2e69b339f Merge pull request #7816 from heyitsanthony/v3client-blankctx
v3client: wrap watch ctxs with blank ctx
2017-04-25 21:53:14 -07:00
41e77c9db6 Merge pull request #7818 from gyuho/doc
Documentation: require Go 1.8+ for build
2017-04-25 21:46:07 -07:00
50f29bd661 concurrency: use current revisions for election
Watching from the leader's ModRevision could cause live-locking on
observe retry loops when the ModRevision is less than the compacted
revision. Instead, start watching the leader from at least the store
revision of the linearized read used to detect the current leader.

Fixes #7815
2017-04-25 20:15:50 -07:00
6486be673b integration: test Observe can read leaders set prior to compaction 2017-04-25 20:03:49 -07:00
4959663f90 Documentation: require Go 1.8+ for build
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-25 17:04:54 -07:00
c49a87bd04 Merge pull request #7672 from fanminshi/integrate_runner_to_tester
etcd-tester: integrate etcd runner into etcd tester
2017-04-25 15:22:29 -07:00
60b9adc267 Merge pull request #7812 from fanminshi/refactor_runner
etcd-runner: fix runner and minor refactoring.
2017-04-25 15:21:57 -07:00
3ce31acda4 v3client: wrap watch ctxs with blank ctx
Printing the values in ctx.String() will data race if the value
is mutable and doesn't implement String(), which seems to be common.
Instead, just return a fixed string instead of computing it; v3client
watches don't need as much flexibility for creating separate strings,
so separate ctx strings probably aren't necessary at this point.

Fixes #7811
2017-04-25 15:03:06 -07:00
96aaeee4f5 Merge pull request #7814 from gyuho/aaa
etcdserver: do not block on raft stopping
2017-04-25 15:00:06 -07:00
a9e04061b1 etcd-runner: integrate etcd runner into etcd tester
etcd tester runs etcd runner as a separate binary.
It sends SIGSTOP to the runner when the tester wants to stop stressing.
It sends SIGCONT to the runner when the tester wants to start stressing.
When the tester needs to clean up, it sends SIGINT to the runner.

Fixes #7026
2017-04-25 14:53:23 -07:00
77fbe10dfc etcd-runner: add --prefix flag, allows inf round, and minor vars refactoring in watch runner. 2017-04-25 14:18:42 -07:00
debc69e1f2 etcd-runner: pass in lock name as a command arg for lock_racer. 2017-04-25 14:18:42 -07:00
72fb756af3 etcd-runner: add lease ttl as a flag and fatal when err in lease-runner. 2017-04-25 14:18:42 -07:00
d57ad8ec8d etcd-runner: add barrier, observe !ok handling, and election name arg to election-runner. 2017-04-25 14:17:59 -07:00
fa85445ef8 etcd-runner: add rate limiting in doRounds() 2017-04-25 14:00:52 -07:00
327f09fcb4 etcdserver: do not block on raft stopping
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-25 13:35:43 -07:00
2af1605db3 Merge pull request #7810 from gyuho/sync-with-apply
etcdserver: ensure waitForApply sync with applyAll
2017-04-25 13:21:30 -07:00
91f6aee4f2 etcdserver: ensure waitForApply sync with applyAll
Problem is:

`Step1`: `etcdserver/raft.go`'s `Ready` process routine sends config-change entries via `r.applyc <- ap` (https://github.com/coreos/etcd/blob/master/etcdserver/raft.go#L193-L203)

`Step2`: `etcdserver/server.go`'s `*EtcdServer.run` routine receives this via `ap := <-s.r.apply()` (https://github.com/coreos/etcd/blob/master/etcdserver/server.go#L735-L738)

`StepA`: `Step1` proceeds without sync, right after sending `r.applyc <- ap`.

`StepB`: `Step2` proceeds without sync, right after `sched.Schedule(s.applyAll(&ep,&ap))`.

`StepC`: `etcdserver` tries to sync with `s.applyAll(&ep,&ap)` by calling `rh.waitForApply()`.

`rh.waitForApply()` waits for all pending jobs in `pkg/schedule` to
finish. However, the order of `StepA`, `StepB`, `StepC` is not guaranteed. It
is possible that `StepC` happens first and proceeds without waiting on
apply. The restarting member then comes back as the leader of a single-node
cluster, because nothing synchronizes the apply layer with the application
of the config-change Raft entry. Confirmed with the extra debugging lines
below; only reproducible on a slow-CPU VM (~2 vCPU).

```
~:24.005397 I | etcdserver: starting server... [version: 3.2.0+git, cluster version: to_be_decided]
~:24.011136 I | etcdserver: [DEBUG] 29b2d24047a277df waitForApply before
~:24.011194 I | etcdserver: [DEBUG] 29b2d24047a277df starts wait for 0 pending jobs
~:24.011234 I | etcdserver: [DEBUG] 29b2d24047a277df finished wait for 0 pending jobs (current pending 0)
~:24.011268 I | etcdserver: [DEBUG] 29b2d24047a277df waitForApply after
~:24.011348 I | etcdserver: [DEBUG] [0] 29b2d24047a277df is scheduling conf change on 29b2d24047a277df
~:24.011396 I | etcdserver: [DEBUG] [1] 29b2d24047a277df is scheduling conf change on 5edf80e32a334cf0
~:24.011437 I | etcdserver: [DEBUG] [2] 29b2d24047a277df is scheduling conf change on e32e31e76c8d2678
~:24.011477 I | etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on 29b2d24047a277df
~:24.011509 I | etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on 5edf80e32a334cf0
~:24.011545 I | etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on e32e31e76c8d2678
~:24.012500 I | etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df before
~:24.013014 I | etcdserver/membership: added member 29b2d24047a277df [unix://127.0.0.1:2100515039] to cluster 9250d4ae34216949
~:24.013066 I | etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df after
~:24.013113 I | etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df after trigger
~:24.013158 I | etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 5edf80e32a334cf0 before
~:24.013666 W | etcdserver: failed to send out heartbeat on time (exceeded the 10ms timeout for 11.964739ms)
~:24.013709 W | etcdserver: server is likely overloaded
~:24.013750 W | etcdserver: failed to send out heartbeat on time (exceeded the 10ms timeout for 12.057265ms)
~:24.013775 W | etcdserver: server is likely overloaded
~:24.013950 I | raft: 29b2d24047a277df is starting a new election at term 4
~:24.014012 I | raft: 29b2d24047a277df became candidate at term 5
~:24.014051 I | raft: 29b2d24047a277df received MsgVoteResp from 29b2d24047a277df at term 5
~:24.014107 I | raft: 29b2d24047a277df became leader at term 5
~:24.014146 I | raft: raft.node: 29b2d24047a277df elected leader 29b2d24047a277df at term 5
```

I am printing out the number of pending jobs before we call
`sched.WaitFinish(0)`, and there were no pending jobs, so it returned
immediately (before we schedule `applyAll`).

This is the root cause of:

- https://github.com/coreos/etcd/issues/7595
- https://github.com/coreos/etcd/issues/7739
- https://github.com/coreos/etcd/issues/7802

`sched.WaitFinish(0)` doesn't work when `len(f.pendings)==0` and
`f.finished==0`. Config-change is the first job to apply, so
`f.finished` is 0 in this case.

`f.finished` monotonically increases, so we need `WaitFinish(finished+1)`,
where `finished` is the value read before calling `Schedule`. This is safe
because `Schedule(applyAll)` is the only place adding jobs to `sched`.
The scheduler then waits on the single `applyAll` job, by getting the
current number of finished jobs before sending `Schedule`.

Or just block until the `applyAll` routine triggers on the
config-change job. This patch just removes `waitForApply` and
signals `raftDone` to wait until `applyAll` finishes applying entries.

Confirmed that it fixes the issue, as below:

```
~:43.198354 I | rafthttp: started streaming with peer 36cda5222aba364b (stream MsgApp v2 reader)
~:43.198740 I | etcdserver: [DEBUG] 3988bc20c2b2e40c waitForApply before
~:43.198836 I | etcdserver: [DEBUG] 3988bc20c2b2e40c starts wait for 0 pending jobs, 1 finished jobs
~:43.200696 I | integration: launched 3169361310155633349 ()
~:43.201784 I | etcdserver: [DEBUG] [0] 3988bc20c2b2e40c is scheduling conf change on 36cda5222aba364b
~:43.201884 I | etcdserver: [DEBUG] [1] 3988bc20c2b2e40c is scheduling conf change on 3988bc20c2b2e40c
~:43.201965 I | etcdserver: [DEBUG] [2] 3988bc20c2b2e40c is scheduling conf change on cf5d6cbc2a121727
~:43.202070 I | etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on 36cda5222aba364b
~:43.202139 I | etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on 3988bc20c2b2e40c
~:43.202204 I | etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on cf5d6cbc2a121727
~:43.202444 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) before
~:43.204486 I | etcdserver/membership: added member 36cda5222aba364b [unix://127.0.0.1:2100913646] to cluster 425d73f1b7b01674
~:43.204588 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) after
~:43.204703 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) after trigger
~:43.204791 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) before
~:43.205689 I | etcdserver/membership: added member 3988bc20c2b2e40c [unix://127.0.0.1:2101113646] to cluster 425d73f1b7b01674
~:43.205783 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) after
~:43.205929 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) after trigger
~:43.206056 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on cf5d6cbc2a121727 (request ID: 0) before
~:43.207353 I | etcdserver/membership: added member cf5d6cbc2a121727 [unix://127.0.0.1:2100713646] to cluster 425d73f1b7b01674
~:43.207516 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on cf5d6cbc2a121727 (request ID: 0) after
~:43.207619 I | etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on cf5d6cbc2a121727 (request ID: 0) after trigger
~:43.207710 I | etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on 36cda5222aba364b
~:43.207781 I | etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on 3988bc20c2b2e40c
~:43.207843 I | etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on cf5d6cbc2a121727
~:43.207951 I | etcdserver: [DEBUG] 3988bc20c2b2e40c finished wait for 0 pending jobs (current pending 0, finished 1)
~:43.208029 I | rafthttp: started HTTP pipelining with peer cf5d6cbc2a121727
~:43.210339 I | rafthttp: peer 3988bc20c2b2e40c became active
~:43.210435 I | rafthttp: established a TCP streaming connection with peer 3988bc20c2b2e40c (stream MsgApp v2 reader)
~:43.210861 I | rafthttp: started streaming with peer 3988bc20c2b2e40c (writer)
~:43.211732 I | etcdserver: [DEBUG] 3988bc20c2b2e40c waitForApply after
```

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-25 10:22:27 -07:00
b94b8b5707 etcd-runner: move root cmd into command package
This allows easier sharing of global variables among subcommands.
2017-04-25 10:19:20 -07:00
fbbc4a4979 Merge pull request #7732 from heyitsanthony/lease-err-ka
clientv3: don't halt lease client if there is a lease error
2017-04-25 07:06:31 -07:00
2fd6df922a integration: close proxy's lease client 2017-04-24 23:49:45 -07:00
cb8524fbec benchmark: use new lease interface 2017-04-24 23:49:45 -07:00
78afc853f4 etcd-runner: update to use new lease interface 2017-04-24 23:49:45 -07:00
b5384ac1c0 grpcproxy: use new lease interface 2017-04-24 23:49:44 -07:00
70f0bbe38c etcdctl: use new lease interface 2017-04-24 23:49:44 -07:00
f3053265ae clientv3/integration: use new interfaces in lease tests 2017-04-24 23:49:44 -07:00
f224d74ed7 concurrency: use new lease interface in session 2017-04-24 23:49:44 -07:00
d5f414f69b clientv3: don't halt lease client if there is a lease error
Fixes #7488
2017-04-24 23:49:44 -07:00
f254e38385 Merge pull request #7806 from heyitsanthony/testutil-assert
testutil: assert functions
2017-04-23 01:30:39 -07:00
2ef3eac5ca vendor: remove testify
Fixes #7805
2017-04-22 20:29:58 -07:00
76fb6ebcbb scripts: remove testify hack in updatedep 2017-04-22 20:29:58 -07:00
978cf804ca store: replace testify asserts with testutil asserts 2017-04-22 20:29:58 -07:00
6f06e1cb47 testutil: add assert functions 2017-04-22 20:29:58 -07:00
c5d4f3e7db Merge pull request #7804 from heyitsanthony/current-watch-fix
clientv3: set current revision to create rev regardless of CreateNotify
2017-04-22 14:09:17 -07:00
7f159b6a8d Merge pull request #7803 from heyitsanthony/snip-deprecated-machines
v2http: remove deprecated /v2/machines path
2017-04-22 14:08:55 -07:00
ca4acceb1e clientv3: set current revision to create rev regardless of CreateNotify
Turns out the optimization to ignore setting the init rev for
current revision watches breaks some ordering assumptions. Since
Watch only returns a channel once it gets a response, it should
bind the revision at the time of the first create response.

Was causing TestWatchReconnInit to fail.
2017-04-22 13:04:38 -07:00
94f6a11bbf Merge pull request #7756 from heyitsanthony/weaken-v3elect-test
integration: permit dropping intermediate leader values on observe
2017-04-22 12:13:51 -07:00
c1300c81b3 concurrency: clarify Observe semantics; only fetches subsequence 2017-04-22 11:26:11 -07:00
e6a789d541 integration: permit dropping intermediate leader values on observe
Weaken TestV3ElectionObserve so it only checks that it observes a strictly
monotonically ascending leader transition sequence following the first
observed leader. First, the Observe will issue the leader channel before
getting a response for its first get; the election revision is only bound
after returning the channel. So, Observe can't be expected to always
return the leader at the time it was started.  Second, Observe fetches
the current leader based on its create revision, but begins watching on its
ModRevision; this is important so that elections still work in case the
leader issues proclamations following a compaction that exceeds its
creation revision. So, Observe can't be expected to return the entire
proclamation sequence for a single leader.

Fixes #7749
2017-04-22 11:26:11 -07:00
2bb33181b6 v2http: remove deprecated /v2/machines path 2017-04-22 03:11:21 -07:00
7da451640f Merge pull request #7795 from heyitsanthony/dont-force-initrev
clientv3: only update initReq.rev == 0 with watch revision
2017-04-22 02:50:55 -07:00
4ab818a856 clientv3: only update initReq.rev == 0 with creation watch revision
Always updating the initReq.rev on watch create will resume from the wrong
revision if initReq is ever nonzero.
2017-04-21 20:22:51 -07:00
ec470944f8 clientv3/integration: test watch resume with disconnect before first event 2017-04-21 20:22:51 -07:00
fe1ce3a2f0 integration: add pause/unpause to client bridge
Resetting connections sometimes isn't enough; need to stop/resume
accepting connections for some tests while keeping the member up.
2017-04-21 20:22:51 -07:00
91039bef7c Merge pull request #7799 from heyitsanthony/ctxize-resolve
netutil: use "context" and ctx-ize TCP addr resolution
2017-04-21 16:30:32 -07:00
a73950545a Merge pull request #7801 from heyitsanthony/s1027
*: clear redundant return statement warnings (S1027)
2017-04-21 15:18:40 -07:00
14d6ed9e5f *: clear redundant return statement warnings (S1027) 2017-04-21 14:01:00 -07:00
a9087ee659 Merge pull request #7714 from glevand/for-merge-cross
Add multi arch release support
2017-04-21 10:56:01 -07:00
bf987185a9 release.md: Update for multi arch release
Signed-off-by: Geoff Levand <geoff@infradead.org>
2017-04-21 10:04:41 -07:00
07c07cea25 release: Add multi arch support
Signed-off-by: Geoff Levand <geoff@infradead.org>
2017-04-21 10:04:41 -07:00
0c8988aa07 build-docker: Updates for multi-arch release
 o Set -e to abort script if a command fails.
 o Allow custom docker 'TAG' from the environment.
 o Move arch suffix to version to allow all images to
   be put into a single repository.
 o Enable cross builds.  When doing cross builds where the
   host and target architectures are different 'RUN mkdir'
   will fail since the target container cannot be run on
   the host.  To work around this, create the directories
   in build-docker, then use ADD in the Dockerfile.
 o Add Dockerfile-release.arm64

Signed-off-by: Geoff Levand <geoff@infradead.org>
2017-04-21 10:04:41 -07:00
8309ca92d7 build-aci: Add multi arch support
Uses GOARCH to build for a targeted arch.

Usage: GOARCH=... BINARYDIR=... BUILDDIR=... ./scripts/build-aci version

Signed-off-by: Geoff Levand <geoff@infradead.org>
2017-04-21 10:04:41 -07:00
fb6287240f build-binary: Add arm64
Signed-off-by: Geoff Levand <geoff@infradead.org>
2017-04-21 10:04:41 -07:00
85e87e8f6b netutil: use "context" and ctx-ize TCP addr resolution 2017-04-21 10:01:53 -07:00
8bad78cb98 Merge pull request #7788 from gyuho/trace
vendor: use 'x/net/trace' with std 'context'
2017-04-20 18:18:33 -07:00
bfd5f38af3 vendor: use 'x/net/trace' with std 'context'
For https://github.com/coreos/etcd/issues/6174.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-20 17:28:59 -07:00
3a93928b07 Merge pull request #7779 from heyitsanthony/pkgize-gw
*: put gateway stubs in packages separate from pb stubs
2017-04-20 14:53:56 -07:00
82b7e4fd3b Merge pull request #7786 from gyuho/rate
vendor: update 'golang.org/x/time/rate' with context
2017-04-20 13:51:43 -07:00
da1bba8f39 vendor: update 'golang.org/x/time/rate' with context
Go just updated its import path c06e80d930

For https://github.com/coreos/etcd/issues/6174.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-20 11:18:34 -07:00
633a4e6b52 Merge pull request #7785 from heyitsanthony/printerize-lease
ctlv3: use printer for lease command results
2017-04-20 10:36:58 -07:00
cf8c66c9f0 ctlv3: use printer for lease command results
Fixes #7783
2017-04-20 09:41:04 -07:00
85c9ea92bb Merge pull request #7745 from heyitsanthony/bom
*: add bill of materials
2017-04-19 15:29:20 -07:00
a2b5444a26 test: ensure clientv3 has no grpc-gateway dependency 2017-04-19 13:09:23 -07:00
393e4335b7 *: put gateway stubs into their own packages
Fixes #7773
2017-04-19 13:09:06 -07:00
fd11523af9 scripts: move gateway stubs into gw/ packages 2017-04-19 12:50:04 -07:00
04fc57ac1d Merge pull request #7775 from heyitsanthony/fix-lease-print
ctlv3: keep lease as integer in fields printer
2017-04-19 09:08:17 -07:00
385e18bc6c Merge pull request #7768 from gyuho/close-serverc
embed: signal 'grpcServerC' before cmux serve
2017-04-19 08:24:22 -07:00
35dff4cbc3 Merge pull request #7769 from heyitsanthony/more-time-lease-test
clientv3/integration: sleep less in TestLeaseRenewLostQuorum
2017-04-19 00:57:49 -07:00
d24a763a12 Merge pull request #7771 from heyitsanthony/remove-2.0-version
etcdserver: remove 2.0 StatusNotFound version check
2017-04-19 00:57:19 -07:00
fcd4871e2a ctlv3: keep lease as integer in fields printer
Output was giving %!d(string=) instead of the expected lease ID
value.
2017-04-19 00:48:13 -07:00
d3456b5ecd Merge pull request #7759 from mitake/fix-7724
*: simply ignore ErrAuthNotEnabled in clientv3 if auth is not enabled
2017-04-19 16:07:18 +09:00
3d8e2e1171 etcdserver: remove 2.0 StatusNotFound version check 2017-04-18 20:22:56 -07:00
c654370d6d clientv3/integration: sleep less in TestLeaseRenewLostQuorum
Server Stop+Restart sometimes takes more than 500ms, so with a
one second window the lease client may not get a chance to issue
a keepalive and get a lease extension before the lease client
timer elapses. Instead, sleep for a shorter period of time (while
still guaranteeing a keepalive resend during quorum loss) and
skip the test if server restart takes longer than the lease TTL.

Fixes #7346
2017-04-18 19:35:20 -07:00
e1306bff8f *: simply ignore ErrAuthNotEnabled in clientv3 if auth is not enabled
Fix https://github.com/coreos/etcd/issues/7724
2017-04-19 11:27:14 +09:00
ba299bcaaf embed: signal 'grpcServerC' before cmux serve
CMux.Serve blocks, so grpcServerC was never closed.

Fix https://github.com/coreos/etcdlabs/issues/216.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-18 17:49:50 -07:00
8fa4b8da6e Merge pull request #7767 from heyitsanthony/transport-resolve-dnsnames
transport: resolve DNSNames when SAN checking
2017-04-18 17:28:01 -07:00
cb408ace21 Merge pull request #7757 from heyitsanthony/fix-speedy-close
etcdserver: initialize raftNode with constructor
2017-04-18 15:06:45 -07:00
05582ad5b2 transport: resolve DNSNames when SAN checking
The current transport client TLS checking will pass an IP address into
VerifyHostnames if there is a DNSNames SAN. However, the go runtime will
not resolve the DNS names to match the client IP. Instead, resolve the
names when checking.
2017-04-18 13:21:26 -07:00
30552e28ed Merge pull request #7766 from gyuho/url
embed: use '*url.URL.Hostname(),Port()' for Go 1.8
2017-04-18 13:16:13 -07:00
f10a70401b embed: use '*url.URL.Hostname(),Port()' for Go 1.8
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-18 12:08:59 -07:00
94044cee4f Merge pull request #7765 from gyuho/mutex-profile
pkg/debugutil: add 'mutex' profiler (Go 1.8+)
2017-04-18 11:34:23 -07:00
5161b74799 pkg/debugutil: add 'mutex' profiler (Go 1.8+)
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-18 10:56:06 -07:00
dd0d590217 Merge pull request #7764 from gyuho/NEWS
NEWS: update v3.1.6
2017-04-18 10:31:13 -07:00
2511535ea0 NEWS: update v3.1.6
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-18 10:09:53 -07:00
714b48a4b4 etcdserver: initialize raftNode with constructor
raftNode was being initialized in start(), which was causing
hangs when trying to stop the etcd server since the stop channel
would not be initialized in time for the stop call. Instead,
set up non-configurable bits in a constructor.

Fixes #7668
2017-04-18 09:33:59 -07:00
8fdf8f752b Merge pull request #7752 from heyitsanthony/clientv3-fetch-keyspace-pfx
clientv3: translate WithPrefix() into WithFromKey() for empty key
2017-04-18 09:24:53 -07:00
6dd807481c Merge pull request #7758 from a-robinson/leak
raft: Avoid holding unneeded memory in unstable log's entries array
2017-04-18 08:27:40 -07:00
45406d8486 raft: Avoid holding unneeded memory in unstable log's entries array
Accumulation of old entries in the underlying array backing the
entries slice has been found to cause massive memory growth in
CockroachDB for workloads that do large (1MB) writes
(https://github.com/cockroachdb/cockroach/issues/14776)

This doesn't appear to have much consistent effect on the raft
benchmarks, although it's worth noting that they vary quite a bit
between runs so it's kind of tough to draw strong conclusions from them.
Let me know if there are any different benchmarks you'd like me to run!

Fixes #7746

benchmark              old ns/op     new ns/op     delta
BenchmarkOneNode-8     3283          3125          -4.81%

benchmark              old allocs     new allocs     delta
BenchmarkOneNode-8     6              6              +0.00%

benchmark              old bytes     new bytes     delta
BenchmarkOneNode-8     796           727           -8.67%

benchmark                     old ns/op     new ns/op     delta
BenchmarkProposal3Nodes-8     4269          4337          +1.59%

benchmark                     old allocs     new allocs     delta
BenchmarkProposal3Nodes-8     15             13             -13.33%

benchmark                     old bytes     new bytes     delta
BenchmarkProposal3Nodes-8     5839          4544          -22.18%
2017-04-18 10:55:16 -04:00
4fcea334ad Merge pull request #7737 from gyuho/aaa
*: clean up for Go 1.8+
2017-04-18 03:37:43 -07:00
8aaa1ed911 *: use '*tls.Config.Clone' in Go 1.8
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 20:08:27 -07:00
99a2d6c4b1 integration: use 'time.Until' in Go 1.8
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 20:08:27 -07:00
cbe37e5213 travis: bump up to Go 1.8.1
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 20:08:27 -07:00
e7e7451213 Merge pull request #7689 from mitake/bench-leader
benchmark: a new flag --target-leader for targeting a leader endpoint
2017-04-18 10:24:24 +09:00
e771c6042b Merge pull request #7743 from gyuho/shutdown-grpc-server
*: use gRPC server GracefulStop
2017-04-17 17:12:52 -07:00
c011e2ddd5 Merge pull request #7755 from gyuho/auth-test
clientv3/integration: add 'TestUserErrorAuth'
2017-04-17 17:12:24 -07:00
81291b23b1 clientv3/integration: add 'TestUserErrorAuth'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 17:11:37 -07:00
c798f81398 Merge pull request #7753 from gyuho/helper
etcdserver: fill-in Auth API Header in apply layer
2017-04-17 15:18:46 -07:00
8a5f085a65 *: add bill of materials 2017-04-17 14:50:55 -07:00
cb979bc2cc vendor: update gopkg.in/yaml.v2 to reflect current license 2017-04-17 14:34:59 -07:00
253e5a90bb integration: test auth API response header revision
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:26:30 -07:00
ac69e63fa8 etcdserver: fill-in Auth API Header in apply layer
Replaces "etcdserver: fill a response header in auth RPCs".
The revision should be set at the time of "apply",
not in the later RPC layer.

Fix https://github.com/coreos/etcd/issues/7691

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:26:26 -07:00
5000d29b4a mvcc: remove stopc select case in Hash
Revert change in 33acbb694b.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:19:48 -07:00
8ffd58fb3b mvcc/backend: remove t.tx.DB()==nil checks with GracefulStop
Revert https://github.com/coreos/etcd/pull/6662.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:17:00 -07:00
cd470f9ccd Revert "mvcc: test inflight Hash to trigger Size on nil db"
This reverts commit 994e8e4f40.

Since etcdserver now gracefully shuts down the gRPC server.
2017-04-17 14:15:43 -07:00
472a536052 integration: test 'inflight' range requests
- Test https://github.com/coreos/etcd/issues/7322.
- Remove test case added in https://github.com/coreos/etcd/pull/6662.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:15:36 -07:00
c407e097e2 embed: gracefully shut down gRPC server
Fix https://github.com/coreos/etcd/issues/7322.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:12:40 -07:00
ea5f6dab6b etcdmain: trigger embed.Etcd.Close for OS interrupt
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:07:16 -07:00
0d52598fc1 Merge pull request #7754 from heyitsanthony/doc-check-v3-data
op-guide: add command for checking if there's any v3 data
2017-04-17 14:05:57 -07:00
cf8ab8c7a6 op-guide: add command for checking if there's any v3 data
Fixes #7681
2017-04-17 12:31:21 -07:00
6b030ed7db benchmark: a new flag --target-leader for targeting a leader endpoint
Currently, benchmark picks the destinations of RPCs at random.
However, this produces divergent benchmarking results because
RPCs other than serializable range must be forwarded to the leader node
when a follower node receives them. This commit adds a new flag
--target-leader to avoid the problem. If the flag is passed,
benchmark always picks an endpoint of a leader node.
2017-04-17 14:24:35 +09:00
6ad9d1609a Merge pull request #7717 from mitake/auth-output-fields
etcdctl: show responses of auth RPCs if --write-output=fields is passed
2017-04-17 14:12:59 +09:00
f92c11e1f2 clientv3: translate WithPrefix() into WithFromKey() for empty key 2017-04-16 20:47:18 -07:00
f0143916de clientv3/integration: test fetching entire keyspace 2017-04-16 20:47:18 -07:00
7e3dd74314 Merge pull request #7748 from darasion/master
clientv3/namespace: fix incorrect watching prefix-end
2017-04-15 15:17:35 -07:00
0e7fd4a37c clientv3/namespace: fix incorrect watching prefix-end
Using "abc" will watch the wrong range when WithPrefix() is specified.
2017-04-15 22:31:50 +08:00
e2d0db95eb Merge pull request #7744 from heyitsanthony/fix-auth-stop-race
auth: fix race on stopping simple token keeper
2017-04-14 12:38:47 -07:00
2951e7f6e4 Merge pull request #7733 from heyitsanthony/fix-client-foreign-dial
clientv3: let client dial endpoints not in the balancer
2017-04-14 10:45:17 -07:00
fdf7798137 auth: fix race on stopping simple token keeper
The run goroutine was resetting a field for no reason and without holding a lock.
This patch cleans up the run goroutine management to make the start/stop path
less racy in general.
2017-04-14 09:50:33 -07:00
8efc42e25f etcdctl: show responses of auth RPCs if --write-output=fields is passed 2017-04-14 11:48:42 +09:00
cfbc5e5c3b Merge pull request #7706 from gyuho/wait-apply-conf-change
etcdserver: wait apply on conf change Raft entry
2017-04-13 16:54:06 -07:00
04354f32ab etcdserver: wait apply on conf change Raft entry
When the apply layer sees a configuration change entry in
raft.Ready.CommittedEntries, the server should not proceed
until that entry is applied. Otherwise, the follower's raft
layer advances, possibly hits an election timeout, and becomes
the leader of a single-node cluster before the add-node conf
changes for the other nodes are applied.

Fix https://github.com/coreos/etcd/issues/7595.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 15:59:24 -07:00
957c9cd1df Merge pull request #7734 from mitake/status-auth
etcdserver: let Status() not require authentication
2017-04-13 15:53:33 -07:00
8fdfac2843 Merge pull request #7730 from heyitsanthony/return-member-list
*: return updated member list in v3 rpcs
2017-04-13 15:39:38 -07:00
1153e1e7d9 Merge pull request #7687 from heyitsanthony/deny-tls-ipsan
transport: deny incoming peer certs with wrong IP SAN
2017-04-13 15:03:25 -07:00
7607ace95a Merge pull request #7735 from gyuho/grpc-shutdown
pkg/transport: add 'IsClosedConnError'
2017-04-13 13:16:57 -07:00
6c2fb5105d clientv3/integration: use 'transport.IsClosedConnError'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 11:55:23 -07:00
56b111df0c rafthttp: use 'transport.IsClosedConnError'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 11:55:22 -07:00
8ce579aac9 pkg/transport: add 'IsClosedConnError'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 11:55:18 -07:00
9eb3e2c6b4 Merge pull request #7736 from gyuho/todo
embed: remove ReadTimeout TODO
2017-04-13 11:40:53 -07:00
0b19921ec0 Merge pull request #7729 from heyitsanthony/fix-auth-token-crash
auth: protect simpleToken with single mutex and check if enabled
2017-04-13 11:23:15 -07:00
537c7100b0 embed: remove ReadTimeout TODO
ref. https://github.com/golang/go/issues/9524#issuecomment-271937649

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 10:55:45 -07:00
2dd361aba5 Merge pull request #7694 from heyitsanthony/report-test
report: add test for Report interface
2017-04-13 10:39:55 -07:00
8077be93b8 Merge pull request #7728 from nokia/clients
Adding C++ bindings
2017-04-13 10:05:18 -07:00
b9f9d2e786 Documentation: Adding a separate v2 and a v3 API binding for C++
To draw the attention of the community to these.
2017-04-13 13:15:06 +02:00
67f2e41f20 etcdserver: let Status() not require authentication
The information that can be obtained with the RPC doesn't need to be
protected.

Fix https://github.com/coreos/etcd/issues/7721
2017-04-13 17:39:09 +09:00
4582a7e900 Merge pull request #7731 from heyitsanthony/remove-dead-srv-arg
discovery: remove dead token argument from SRVGetCluster
2017-04-12 20:09:11 -07:00
46971fa1db integration: test client can dial endpoints not in balancer 2017-04-12 20:07:04 -07:00
9b8e39e7ca clientv3: let client.Dial() dial endpoints not in the balancer 2017-04-12 20:07:03 -07:00
e58d39611a Merge pull request #7725 from heyitsanthony/platform-subsection
Documentation: reshuffle op-guide to include platforms and upgrading
2017-04-12 17:05:14 -07:00
780a7d359c discovery: remove dead token argument from SRVGetCluster
Can add the argument back when it's actually used for something.
2017-04-12 16:49:44 -07:00
33a0496b5e report: add test for Report interface 2017-04-12 16:41:32 -07:00
d9ec6b4d22 *: return updated member list in v3 rpcs
Now it's possible to atomically know the new member configuration from
issuing a membership change RPC.
2017-04-12 16:24:51 -07:00
68837b9693 Documentation: reshuffle op-guide to include platforms and upgrading 2017-04-12 15:40:53 -07:00
2046d66927 Merge pull request #7715 from gyuho/fmt
tools/benchmark: fix misc gofmt warnings
2017-04-12 14:27:37 -07:00
2d97500e64 test: do not ignore 'tools/benchmark/cmd'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-12 14:13:30 -07:00
373a04a181 tools/benchmark: fix misc gofmt warnings
ref. https://golang.org/cmd/gofmt/#hdr-The_simplify_command

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-12 14:12:05 -07:00
70a9929b5d transport: use actual certs for listener tests 2017-04-12 13:41:33 -07:00
cad1215b18 *: deny incoming peer certs with wrong IP SAN 2017-04-12 13:41:33 -07:00
18bccb4285 auth: protect simpleToken with single mutex and check if enabled
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes #7722
2017-04-12 13:40:09 -07:00
712f6cb0e1 integration: test requests with valid auth token but disabled auth
etcd was crashing since auth was assuming a token implies auth is enabled.
2017-04-12 13:17:33 -07:00
817825d549 Merge pull request #7726 from smeruelo/fix-doc
Documentation: add missing link
2017-04-12 10:57:02 -07:00
79d27328e3 Documentation: add missing link 2017-04-12 19:50:27 +02:00
95c6c4b713 Merge pull request #7712 from heyitsanthony/stm-sersnap
*: rename Snapshot STM isolation to SerializableSnapshot
2017-04-12 09:03:13 -07:00
4f9aa276bd *: rename Snapshot STM isolation to SerializableSnapshot
Pure Snapshot isolation would permit read conflicts. Change the name
from Snapshot to SerializableSnapshot to reflect that it will also
reject read conflicts.
2017-04-11 17:17:50 -07:00
6ebadda395 Merge pull request #7711 from FranGM/master
Documentation: Add Hosted Graphite to prod users
2017-04-11 13:53:13 -07:00
e521a9116f Merge pull request #7693 from heyitsanthony/why-etcd-doc
Documentation/learning: finish why.md
2017-04-11 13:33:17 -07:00
7684bfdf65 Merge pull request #7704 from heyitsanthony/txn-bench
benchmark: add txn-put benchmark
2017-04-11 12:44:20 -07:00
ce2f65508d Documentation: Add Hosted Graphite to prod users 2017-04-11 20:13:57 +01:00
b4869cb03e Documentation/learning: finish why.md 2017-04-11 12:04:46 -07:00
216a6347b2 Merge pull request #7707 from gyuho/net
vendor: update 'golang.org/x/net'
2017-04-11 09:53:03 -07:00
fd5766bdf6 Merge pull request #7708 from gyuho/rkt
*: coreos/rkt -> rkt/rkt
2017-04-11 09:06:00 -07:00
7fb1f68ff8 *: coreos/rkt -> rkt/rkt
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-11 08:48:48 -07:00
a0dc471520 vendor: update 'golang.org/x/net'
There have been a few bug fixes in upstream.
Mainly for our grpc-go sub-dependencies

'idna' package introduces a new dependency 'golang.org/x/text'

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-11 08:46:07 -07:00
4d1b8b1e47 benchmark: add txn-put benchmark
Submits multiple put ops in a single txn.
2017-04-10 17:01:49 -07:00
7da79de74b Merge pull request #7703 from gyuho/rafthttp
rafthttp: move test-only functions to '_test.go'
2017-04-10 16:59:47 -07:00
b694cfc69f Merge pull request #7702 from heyitsanthony/rpc-swagger
v3lock, v3election: generate and serve grpc-gateway endpoints
2017-04-10 16:48:11 -07:00
d26bdbaf81 Merge pull request #7701 from heyitsanthony/cov-strip-generated
test: remove generated files from coverage statistics
2017-04-10 16:22:15 -07:00
8db8d01712 rafthttp: move test-only functions to '_test.go'
Not used in actual code base, only used in tests

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-10 16:07:31 -07:00
2030c85071 test: ignore v3electionpb and v3lockpb for static checks 2017-04-10 15:21:07 -07:00
93594006df embed: register grpc-gateway endpoints for v3lock and v3election 2017-04-10 15:21:07 -07:00
78a5eb79b5 *: add swagger and grpc-gateway assets for v3lock and v3election 2017-04-10 15:21:07 -07:00
b5dd41e625 test: remove generated files from coverage statistics
client/keys.generated.go has poor coverage but it's generated; other
generated files (e.g., pb stuff) are ignored, so this should be ignored too.
2017-04-10 14:30:15 -07:00
a1a72202ff Merge pull request #7666 from calebamiles/aws-platform-guide
Adds AWS platform guide
2017-04-10 14:14:52 -07:00
2a074523a4 Documentation: Adds AWS platform guide
Add a guide for deploying etcd on AWS, discussing resource planning and cluster design
2017-04-10 13:09:33 -07:00
25acdbf41b Merge pull request #7634 from heyitsanthony/election-rpc
Election RPC service
2017-04-07 20:03:09 -07:00
55e2355326 Merge pull request #7695 from heyitsanthony/upgrade-grpc-gateway
vendor: upgrade grpc-gateway to v1.2.0
2017-04-07 19:04:00 -07:00
5f366db7d1 etcd-runner: update election command to use new Leader() interface 2017-04-07 16:36:38 -07:00
78422eaa17 embed: add Election service 2017-04-07 16:36:38 -07:00
bf047ed9d5 integration: v3 election rpc tests 2017-04-07 16:36:38 -07:00
dc8115a534 v3election: Election RPC service
Fixes #7589
2017-04-07 16:36:38 -07:00
9ba69ff317 scripts: update genproto.sh to include v3election 2017-04-07 16:36:38 -07:00
135a40751e v3rpc: force RangeEnd=nil if length is 0
gRPC will replace empty strings with nil, but for the embedded case it's
possible for []byte{} to slip in and confuse the single key / >= key
watch logic.
2017-04-07 16:36:38 -07:00
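The normalization described in 135a40751e can be sketched as follows. This is only an illustration, not the v3rpc code: the helper name `normalizeRangeEnd` is hypothetical.

```go
package main

import "fmt"

// normalizeRangeEnd is a hypothetical helper: it maps an empty but non-nil
// byte slice to nil so that later checks see a single "no range end" form.
func normalizeRangeEnd(end []byte) []byte {
	if len(end) == 0 {
		return nil
	}
	return end
}

func main() {
	fmt.Println(normalizeRangeEnd([]byte{}) == nil)     // empty slice collapses to nil
	fmt.Println(string(normalizeRangeEnd([]byte("b")))) // non-empty value is kept
}
```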
4b4f5be74a concurrency: don't skip leader updates in Observe()
The Get for the leader key will fetch based on the latest revision
instead of the deletion revision, missing leader updates between
the delete and the Get.

Although it's usually safe to skip these updates since they're
stale, it makes testing more difficult and in some cases the
full leader update history is desirable.
2017-04-07 16:36:38 -07:00
80c1b9c13a concurrency: support resuming elections if leadership already held
If a client already knows it holds leadership, let it create an
election object with its leadership information.
2017-04-07 16:36:38 -07:00
d1ae4cd5bd concurrency: only delete on election resignation if create revision matches
Addresses a case where two clients share the same lease. A client resigns but
disconnects / crashes and doesn't realize it. Another client reuses the
lease and gets leadership with a new key. The old client comes back and
tries to resign again, revoking the new leadership of the new client.
2017-04-07 16:36:37 -07:00
4b5bb7f212 concurrency: return v3.GetResponse for Election.Leader()
The full information about the leader's key is necessary to
safely use elections with transactions. Instead of returning
only the value on Leader(), return the entire GetResponse.
2017-04-07 16:36:37 -07:00
a6cab69c88 concurrency: expose leader revision and proclaim headers for election 2017-04-07 16:36:37 -07:00
2769cae6bd vendor: upgrade grpc-gateway to v1.2.0 2017-04-07 16:36:14 -07:00
c0560be98a Merge pull request #7692 from heyitsanthony/upgrade-grpc
vendor: upgrade grpc to 1.2.1
2017-04-07 16:04:50 -07:00
9ba435d902 vendor: upgrade grpc to 1.2.1 2017-04-07 14:32:00 -07:00
63bb560820 Merge pull request #7688 from heyitsanthony/short-mask
test: fix fmt pass, shorten warnings, clear SA1016
2017-04-07 12:33:57 -07:00
88d4e7ebeb netutil: fix unused err staticcheck failure
Clears SA4006
2017-04-07 10:52:54 -07:00
7e05b33aa0 *: remove os.Kill from signal.Notify
Clears SA1016 in staticcheck
2017-04-07 10:52:54 -07:00
d31701bab5 test: fix fmt pass and shorten suppression warnings
If gosimple or staticcheck had no output, no other passes would be
applied because they were using `continue`. Similarly, the suppression
check never worked at all since it wasn't piping the result data into egrep.

Fixes #7685
2017-04-06 21:33:03 -07:00
25ed908c18 Merge pull request #7684 from gyuho/a
clientv3/integration: fix minor typo in Fatalf
2017-04-06 19:15:51 -07:00
369d561350 clientv3/integration: fix minor typo in Fatalf
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-06 18:26:42 -07:00
7c5991c2e6 Merge pull request #7676 from fanminshi/add_dns_srv
etcdmain: support SRV discovery for gRPC proxy
2017-04-06 12:32:40 -07:00
bea4c62965 Merge pull request #7677 from heyitsanthony/fix-waitsubstream
clientv3: register waitCancelSubstreams closingc goroutine with waitgroup
2017-04-06 11:10:06 -07:00
2bc1dfd921 etcdmain: support SRV discovery for gRPC proxy
FIX #7562
2017-04-06 10:45:19 -07:00
e1cf766695 Merge pull request #7674 from gyuho/debug
ctlv3: add '--debug' flag (to enable grpclog)
2017-04-05 17:39:37 -07:00
7388911e0c ctlv3: add '--debug' flag (to enable grpclog)
By default, grpclog is disabled. It should be configurable
for debugging purposes, as we did in v2.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-05 17:11:31 -07:00
aab2eda7df clientv3: register waitCancelSubstreams closingc goroutine with waitgroup
Fixes #7598
2017-04-05 16:06:53 -07:00
408de4124b Merge pull request #7675 from gyuho/tls-min-version
clientv3/yaml: use TLS 1.2 in min version
2017-04-05 12:58:16 -07:00
dee467dc24 clientv3/yaml: use TLS 1.2 in min version
To be consistent with 'pkg/transport'

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-05 11:50:35 -07:00
83577a5d08 Merge pull request #7670 from heyitsanthony/fix-lease-be-race
lease: acquire BatchTx lock in fakeDeleter
2017-04-05 11:08:15 -07:00
d42c1f5131 Merge pull request #7646 from andelf/fix-unix-socket-url
*: fix a bug in handling unix socket urls
2017-04-05 09:24:38 -07:00
43f795a485 Merge pull request #7659 from gyuho/aaa
pkg/transport: remove port in Certificate.IPAddresses
2017-04-05 04:29:44 -07:00
4f27981c46 *: fix a bug in handling unix socket urls
Now use url.Host + url.Path as unix socket path

Fixes #7644
2017-04-05 14:33:13 +08:00
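A short sketch of why 4f27981c46 joins `url.Host` and `url.Path`: depending on how a unix URL is written, the Go parser places the socket name in either field. The function name `unixSocketPath` is hypothetical and not taken from the etcd source.

```go
package main

import (
	"fmt"
	"net/url"
)

// unixSocketPath (hypothetical name) joins Host and Path, because a unix
// URL may carry the socket name in either component.
func unixSocketPath(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return u.Host + u.Path, nil
}

func main() {
	for _, raw := range []string{"unix:///tmp/etcd.sock", "unix://etcd.sock"} {
		p, err := unixSocketPath(raw)
		fmt.Println(p, err)
	}
}
```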
c7bdd7e2c5 Merge pull request #7669 from mitake/byte-affine
auth, adt: introduce a new type BytesAffineComparable
2017-04-05 15:19:08 +09:00
c4a45c5713 auth, adt: introduce a new type BytesAffineComparable
It will be useful for avoiding a cost of casting from string to
[]byte. The permission checker is the first user of the type.
2017-04-05 13:17:24 +09:00
42d56d5ef7 lease: acquire BatchTx lock in fakeDeleter
Revoke expects the BatchTx lock to be held when holding the TxnDeleter
because it updates the lease bucket. The tests don't hold the lock so
it may race with the backend commit loop.

Fixes #7662
2017-04-04 20:52:23 -07:00
d51d381eca Merge pull request #7656 from gyuho/more-adapter
*: add cluster API adapter
2017-04-04 20:10:24 -07:00
63355062dc Merge pull request #7649 from mitake/range-open-ended
etcdctl: add a new option --open-ended for unlimited range permission
2017-04-05 11:03:52 +09:00
f7c99208b5 Merge pull request #7667 from ElijahCaine/relative-links-1
Docs: replace absolute links with relative ones.
2017-04-04 18:33:02 -07:00
c0fc389c98 Merge pull request #7661 from heyitsanthony/cov-fail-report
test: generate coverage report even if some tests fail
2017-04-04 16:46:46 -07:00
31c1931b7b Docs: replace absolute links with relative ones. 2017-04-04 15:21:42 -07:00
6978471712 Merge pull request #7664 from gyuho/safe-revision-access
auth: use atomic access to 'authStore.revision'
2017-04-04 13:56:20 -07:00
3edd36315d auth: use atomic access to 'authStore.revision'
Fix https://github.com/coreos/etcd/issues/7660.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-04 13:16:02 -07:00
23e952ccfd test: generate coverage report even if some tests fail
The coverage data is still useful even if some tests fail. Instead of
terminating the coverage pass on any test failure, collect and pass
the failed tests, generate the coverage report, then report the failed
packages and exit with an error.
2017-04-04 11:12:18 -07:00
1e3274dfa2 integration: use cluster adapter in tests
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-04 10:08:51 -07:00
8a7a548a6d pkg/transport: remove port in Certificate.IPAddresses
etcd passes 'url.URL.Host' to 'SelfCert' which contains
client, peer port. 'net.ParseIP("127.0.0.1:2379")' returns
'nil', and the client on this self-cert will see errors
of '127.0.0.1 because it doesn't contain any IP SANs'

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-04 09:44:59 -07:00
d9069120bb Merge pull request #7657 from gyuho/auth-cleanup
clientv3: remove unused fields from 'auth'
2017-04-04 09:42:17 -07:00
972d8c55ab Merge pull request #7653 from xiang90/pprof
*: add pprof flag to grpc proxy
2017-04-04 09:22:50 -07:00
9bc3c0bd05 clientv3: remove unused fields from 'auth'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-04 08:17:36 -07:00
7f2d6b3ef6 clientv3,v3client: add cluster embedded client
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-04 08:14:18 -07:00
7adf4d7c94 grpcproxy/adapter: add Cluster API support
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-04 08:13:30 -07:00
a204b14503 e2e: add a test case for the --open-ended option 2017-04-04 17:28:59 +09:00
0a7fc7cd34 etcdctl: add a new option --from-key for unlimited range permission
This commit adds a new option --from-key to the command etcdctl role
grant-permission. If the option is passed, an open ended permission
will be granted to a role, e.g. from start-key to any keys that are
larger than start-key.

Example:
$ ETCDCTL_API=3 bin/etcdctl --user root:p role grant r1 readwrite a b
$ ETCDCTL_API=3 bin/etcdctl --user root:p role grant --from-key r1 readwrite c
$ ETCDCTL_API=3 bin/etcdctl --user root:p role get r1
Role r1
KV Read:
        [a, b) (prefix a)
        [c, <open ended>
KV Write:
        [a, b) (prefix a)
        [c, <open ended>

Note that a closed parenthesis doesn't follow the above <open ended>
for indicating that the role has an open ended permission ("<open
ended>" is a valid range end).

Fixes https://github.com/coreos/etcd/issues/7468
2017-04-04 17:28:59 +09:00
fd5984af56 *: add pprof flag to grpc proxy 2017-04-03 22:07:17 -07:00
d6efc0b22b Merge pull request #7651 from heyitsanthony/ivt-contains-intersects
*: support checking that an interval tree's keys cover an entire interval
2017-04-03 20:20:56 -07:00
f67bdc2eed *: support checking that an interval tree's keys cover an entire interval 2017-04-03 15:38:07 -07:00
63c6824905 Merge pull request #7650 from philips/add-dims-v3
Documentation: add dims v3 gateway API for python
2017-04-03 15:08:03 -07:00
7dbc4549d9 Merge pull request #7652 from heyitsanthony/fix-gofmt-clientv3
clientv3: fix go1.8 go fmt warning in test
2017-04-03 15:05:41 -07:00
a0149106b8 clientv3: fix go1.8 go fmt warning in test 2017-04-03 14:00:06 -07:00
8963cf2f8b Documentation: add dims v3 gateway API for python 2017-04-03 12:55:24 -07:00
e56e43064f Merge pull request #7637 from lumjjb/patch-2
Documentation: add encryption wrapper to integrations
2017-04-03 12:34:08 -07:00
f13bea0bb0 Merge pull request #7639 from heyitsanthony/fix-userflag-timeout
clientv3: respect dial timeout in auth
2017-04-03 09:30:48 -07:00
ea06ea41e5 Merge pull request #7641 from ggaaooppeenngg/fix-id-doc
idgen: correct comments for id generator
2017-04-03 09:17:55 -07:00
24e4c94d98 Merge pull request #7640 from heyitsanthony/etcdserver-ctx
etcdserver: ctx-ize server initiated requests
2017-04-03 09:07:28 -07:00
8dafaf390a Merge pull request #7642 from davissp14/integration-doc-update
Documentation: Adding new Ruby v3 client entry to integrations.md
2017-04-02 21:55:38 -07:00
8d07200bbf Documentation: Adding new Ruby v3 client entry to integrations.md 2017-04-02 23:54:07 -05:00
38a9149735 Merge pull request #7569 from mitake/interval
auth: store cached permission information in a form of interval tree
2017-04-03 02:41:31 +02:00
d204b6c3b7 idgen: correct comments for id generator
The comments describing the id generator format are out of
date; correct them.

Fixes #7636

Signed-off-by: Peng Gao <peng.gao.dut@gmail.com>
2017-04-02 20:56:10 +08:00
f5f4791023 integration: test cluster terminates quickly 2017-03-31 19:19:33 -07:00
8ad935ef2c etcdserver: use cancelable context for server initiated requests 2017-03-31 19:19:33 -07:00
5aebe1a52d clientv3: test dial timeout is respected when using auth 2017-03-31 15:14:46 -07:00
62d7bae496 clientv3: respect dial timeout when authenticating
Fixes #7627
2017-03-31 15:14:46 -07:00
e6b685b1ed Documentation: add encryption wrapper to integrations 2017-03-31 13:02:53 -04:00
512bac0ee9 Merge pull request #7630 from heyitsanthony/fix-lease-req-leader
clientv3: support WithRequireLeader in lease client
2017-03-31 09:52:17 -07:00
8024a0d15f clientv3: support WithRequireLeader in lease client
Unconditionally opens a WithRequireLeader stream in the lease client. Any
keep alive channels opened using WithRequireLeader will be closed when
the leader is lost.

Fixes #7275
2017-03-30 21:39:36 -07:00
7db7744737 clientv3/integration: test lease WithRequireLeader 2017-03-30 20:18:33 -07:00
833769f59f v3rpc: return leader loss error if lease stream is canceled
Canceling the stream won't cancel the receive since it's using the internal
grpc context, not the one assigned by etcd.
2017-03-30 20:18:33 -07:00
b55ea6a70b integration: test require leader for a lease stream 2017-03-30 20:18:33 -07:00
9ca7f22e84 Merge pull request #7614 from jsok/7516-default-initial-cluster
embed: Delay setting initial cluster
2017-03-30 18:01:51 -07:00
0472b2dc9f etcdmain: test config file clustering flags
A test to ensure that when clustering flags are correctly and
independently specified no errors are raised.
2017-03-31 10:01:46 +11:00
d0d4b1378b embed: Delay setting initial cluster for YAML
NewConfig() sets an initial cluster (potentially using a default name)
but we should clear it in the event another discovery option has been
specified.

PR #7517 attempted to address this however it only worked if the name
was left as "default".

(Completely) Fixes #7516
2017-03-31 10:01:42 +11:00
ca22c4c384 Merge pull request #7632 from xiang90/fix_periodic
compactor: fix TestPeriodic
2017-03-30 15:13:46 -07:00
ef3bd4ecc5 Merge pull request #7633 from heyitsanthony/protoc-3.2.0
*: use protoc 3.2.0
2017-03-30 15:10:14 -07:00
809e6110a0 compactor: fix TestPeriodic
Previously, we advanced checkCompactionInterval more than we should.
The compaction might happen nondeterministically since there is no
synchronization before we call clock.Advance().

The number of rg.Wait() should be equal to the number of Advance() if
compactor routine and test routine run at the same pace. However, in our current
test, we call Advance() more than rg.Wait().

It works OK when the compactor routine runs "slower" than the test routine, which
is the common case. However, when the speed changes, the compactor routine might
block rg.Rev() since there are not enough calls of rg.Wait().

This commit forces the compactor and test routine to run at the same pace. And we supply
the exact number of Advance() and rg.Wait() that compactor needs.
2017-03-30 15:00:49 -07:00
1ff0b71b30 *: use protoc 3.2.0
Fixes #7631
2017-03-30 13:43:10 -07:00
a0c97282c3 Merge pull request #7626 from akauppi/pr-doc-typos
Fixing small typos in documentation
2017-03-30 13:36:38 -07:00
dae2755253 Documentation: fix typos 2017-03-30 11:41:50 +03:00
36735d52a4 Merge pull request #7622 from heyitsanthony/faq-disk-leader
Documentation: add disk latency leader loss question to FAQ
2017-03-28 19:18:50 -07:00
eafab47f05 Merge pull request #7612 from gyuho/adapter-maintenance-API
*: adapter maintenance api
2017-03-28 16:38:20 -07:00
faad828c51 Documentation: add disk latency leader loss question to FAQ 2017-03-28 15:49:21 -07:00
6b784908ad Merge pull request #7621 from xiang90/c_d
compactor: make TestPeriodic die early
2017-03-28 15:38:59 -07:00
c90a4b96d1 integration: use maintenance API adapter in tests
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-28 14:12:47 -07:00
0bf110e27f clientv3,v3client: maintenance to embedded client
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-28 14:12:43 -07:00
a915ff8419 compactor: make TestPeriodic die early 2017-03-28 13:50:16 -07:00
5c642ae314 grpcproxy/adapter: add maintenance API support
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-28 09:09:06 -07:00
123b25845c Merge pull request #7610 from gyuho/news
NEWS: add v3.1.4, v3.1.5
2017-03-28 05:45:10 -07:00
65ad91b14d Merge pull request #7591 from xiang90/validate
etcdctl: add initial check perf command
2017-03-27 17:23:58 -07:00
60d3375599 etcdctl: add initial check perf command 2017-03-27 17:01:15 -07:00
a4ab5e55f9 Merge pull request #7611 from xiang90/auth_design
doc: link auth design in doc
2017-03-27 13:08:40 -07:00
4c7ffe4442 Merge pull request #7605 from gyuho/wrap-adapter
proxy/grpcproxy: add chanStream helper
2017-03-27 13:01:19 -07:00
fded83f111 doc: link auth design in doc 2017-03-27 11:58:32 -07:00
e70c8ac4a2 Merge pull request #7508 from mitake/auth-v3-design
auth: import design doc
2017-03-27 11:35:55 -07:00
caa73c176f proxy/grpcproxy: add chanStream helper
Preliminary work for maintenance API in adapter

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-27 11:24:02 -07:00
5dea73860f NEWS: add v3.1.4, v3.1.5
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-27 10:54:51 -07:00
e6f72b4f42 Merge pull request #7603 from heyitsanthony/leak-check-grpc
testutil: check for grpc resources in AfterTest
2017-03-27 10:33:25 -07:00
9381b103bb Merge pull request #7601 from heyitsanthony/fix-proxy-compact
grpcproxy/cache: only check compaction revision for historical revisions
2017-03-27 09:23:17 -07:00
2e1e1c95bd auth: import design doc
This commit imports and refines the design doc of v3 auth:
https://goo.gl/fwBxz6
2017-03-27 07:53:32 -07:00
da6a035afb Merge pull request #7600 from raoofm/patch-10
op-guide: Remove guest role from v3 auth doc
2017-03-25 10:08:36 +09:00
997e83f8ea testutil: check for grpc resources in AfterTest
gRPC leaks only show up at the final leak check, making it difficult to
determine which test is causing the leak.
2017-03-24 16:09:38 -07:00
631f790689 Merge pull request #7574 from fanminshi/fix_mem_leak
raft: use rs.req.Entries[0].Data as the key for deletion in advance()
2017-03-24 15:50:17 -07:00
b2a465e354 grpcproxy/cache: only check compaction revision for historical revisions
Since the current revision is 0, it'll always be less than the compaction
revision. If the proxy sees a compaction, it would always reject the
current revision requests since it's less than the compaction revision.
Instead, check if the revision is historical before trying to reject on
compaction revision.

Fixes #7599
2017-03-24 13:20:46 -07:00
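The check described in b2a465e354 can be sketched as below, under the assumption that a request revision of 0 means "current revision". The names `checkRevision` and `errCompacted` are hypothetical and do not come from the grpcproxy cache code.

```go
package main

import (
	"errors"
	"fmt"
)

// errCompacted is a hypothetical error value for this sketch.
var errCompacted = errors.New("requested revision has been compacted")

// checkRevision rejects only historical revisions (reqRev > 0) that fall
// below the compaction revision. A reqRev of 0 is assumed to mean "current"
// and is never rejected.
func checkRevision(reqRev, compactRev int64) error {
	if reqRev > 0 && reqRev < compactRev {
		return errCompacted
	}
	return nil
}

func main() {
	fmt.Println(checkRevision(0, 5)) // current revision: allowed
	fmt.Println(checkRevision(3, 5)) // historical and compacted: rejected
	fmt.Println(checkRevision(7, 5)) // historical but not compacted: allowed
}
```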
b9cfa4cef9 integration: add serialized range to TestV3CompactCurrentRev
To catch compaction bugs in the proxy key cache.
2017-03-24 13:13:38 -07:00
a26964c855 op-guide: Remove guest role from v3 auth doc 2017-03-24 16:09:58 -04:00
f18ae033a7 raft: use rs.req.Entries[0].Data as the key for deletion in advance()
advance() should use rs.req.Entries[0].Data as the context instead of
req.Context for deletion. Since req.Context is never set, there won't be
any context being deleted from pendingReadIndex, which results in a memory leak.

FIXES #7571
2017-03-24 12:31:21 -07:00
608a2be9c5 Merge pull request #7596 from andelf/fix-typo-bucked
etcdserver: fix a typo in bucket name var
2017-03-24 09:51:05 -07:00
f763048156 Merge pull request #7592 from heyitsanthony/proxy-cov
test: add proxy to coverage tests
2017-03-24 09:42:57 -07:00
54efb460af etcdserver: fix a typo in bucket name var 2017-03-24 13:11:01 +08:00
ab1cf751a3 test: add proxy to coverage tests 2017-03-23 18:27:09 -07:00
ad2111a6f4 auth: store cached permission information in a form of interval tree
This commit changes the type of cached permission information from the
home-made structure to an interval tree. It improves computational complexity
of permission checking from O(n) to O(lg n).
2017-03-24 09:36:14 +09:00
e9bfcc02ce Merge pull request #7590 from gyuho/test
integration: retry TestNetworkPartition5MembersLeaderInMajority
2017-03-23 17:02:32 -07:00
b81cb999fb integration: retry TestNetworkPartition5MembersLeaderInMajority
Fix https://github.com/coreos/etcd/issues/7587.

Retry for possible leader election in majority.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-23 16:30:38 -07:00
204335d304 Merge pull request #7560 from artem-panchenko/fix_container_resolving
Dockerfile-release: add nsswitch.conf into image
2017-03-23 15:20:14 -07:00
54928f5deb Merge pull request #7524 from mitake/del-and-revoke-role
auth: changes of managing roles and users
2017-03-23 15:10:10 -07:00
3f8eab8439 Merge pull request #7581 from heyitsanthony/ivt-sorted-visit
adt: Visit() interval trees in sorted order
2017-03-23 14:11:08 -07:00
21217c30f9 Merge pull request #7583 from krmayankk/prod-users
add Salesforce to prod users
2017-03-23 12:33:30 -07:00
37bdc94860 Documentation: add salesforce to prod users 2017-03-23 12:29:37 -07:00
36ece32a61 Merge pull request #7582 from heyitsanthony/fix-watch-stream-leak
clientv3: use waitgroup to wait for substream goroutine teardown
2017-03-23 12:24:06 -07:00
0256953b28 Merge pull request #7586 from gyuho/timeout
tools/etcd-tester: add timeout for 'defrag'
2017-03-23 10:23:42 -07:00
8afc468b64 tools/etcd-tester: add timeout for 'defrag'
etcd panic-ed, so defrag response just blocked for "days"
when the actual 'v3rpc' path never returned.

We should catch this earlier.

ref. https://github.com/coreos/etcd/issues/7526

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-23 10:22:20 -07:00
161c7f6bdf Merge pull request #7579 from gyuho/fix-defrage
*: fix panic during defrag operation
2017-03-23 10:08:33 -07:00
23719f99c6 Merge pull request #7563 from heyitsanthony/fix-testdialcancel-leak
clientv3: wait for Get goroutine in TestDialCancel
2017-03-23 10:07:23 -07:00
7ef75e373a Merge pull request #7525 from heyitsanthony/big-backend
etcdserver, backend: configure mmap size based on quota
2017-03-23 10:06:00 -07:00
9dcb975724 Merge pull request #7556 from brancz/prom-rules
Documentation: add Prometheus alerting rules
2017-03-23 09:58:57 -07:00
ed68bf89ff integration: test inflight range requests while defragmenting
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-23 09:48:04 -07:00
26abd25cd3 mvcc/backend: hold 'readTx.Lock' until completing bolt.Tx reset
Fix https://github.com/coreos/etcd/issues/7526.

When resetting `bolt.Tx` in `defrag` and `batchTxBuffered.commit`
operation, we do not hold `readTx` lock, so the inflight range
requests can trigger panic in `mvcc.Range` paths. This fixes it by
moving the mutexes out and holding them while resetting the `readTx`.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-23 09:47:43 -07:00
e7a0c9128a Documentation: add Prometheus alerting rules 2017-03-23 09:43:38 +01:00
8d0d942c47 e2e: add a test case for invalid management of root user and role 2017-03-23 16:47:58 +09:00
c40b86bcde auth, etcdserver: forbid invalid auth management
If auth is enabled,
1. deleting the user root
2. revoking the role root from the user root
must not be allowed. This commit forbids them.
2017-03-23 16:47:58 +09:00
0c87467f69 e2e: add a test case for role delete and revoke 2017-03-23 16:47:44 +09:00
068d806bde *: revoke a deleted role
This commit resolves a TODO of auth store:
Current scheme of role deletion allows existing users to have the
deleted roles. Assume a case like below:
create a role r1
create a user u1 and grant r1 to u1
delete r1

After this sequence, u1 is still granted the role r1. So if an admin
creates a new role with the name r1, the new r1 is automatically
granted to u1. In some cases, that would be confusing. So we need to
revoke the deleted role from all users.
2017-03-23 16:44:19 +09:00
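The sequence in 068d806bde can be walked through with a toy in-memory model. This is only a sketch of the idea (deleting a role also revokes it from every user); the `authStore` type and its methods here are hypothetical and unrelated to etcd's real auth store.

```go
package main

import "fmt"

// authStore is a hypothetical in-memory model used only for illustration.
type authStore struct {
	roles map[string]bool
	users map[string]map[string]bool // user name -> set of granted role names
}

func newAuthStore() *authStore {
	return &authStore{roles: map[string]bool{}, users: map[string]map[string]bool{}}
}

func (s *authStore) addRole(role string) { s.roles[role] = true }

func (s *authStore) grant(user, role string) {
	if s.users[user] == nil {
		s.users[user] = map[string]bool{}
	}
	s.users[user][role] = true
}

// deleteRole removes the role and revokes it from every user, so a later
// role with the same name is not silently inherited.
func (s *authStore) deleteRole(role string) {
	delete(s.roles, role)
	for _, granted := range s.users {
		delete(granted, role)
	}
}

func main() {
	s := newAuthStore()
	s.addRole("r1")
	s.grant("u1", "r1")
	s.deleteRole("r1")
	s.addRole("r1")                 // a new role reusing the old name
	fmt.Println(s.users["u1"]["r1"]) // u1 no longer holds r1
}
```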
25e3ce1feb adt: Visit() interval trees in sorted order and terminate early
For all intervals [x, y), Visit will visit intervals in ascending order
sorted by x. Also fixes a bug where Visit would not terminate the search
when requested by the visitor function.
2017-03-23 00:02:29 -07:00
a39107a3b8 clientv3: use waitgroup to wait for substream goroutine teardown
When a grpc watch stream is torn down, it will join on its logical substream
goroutines by waiting for each to close a channel. This doesn't guarantee
the substream is fully exited, though, but only about to exit and can be
waiting to resume even after Watch.Close finishes. Instead, use a
waitgroup.Done at the very end of the substream defer.

Fixes #7573
2017-03-22 23:27:26 -07:00
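The teardown pattern from a39107a3b8 can be sketched with a plain `sync.WaitGroup`. This is a simplified illustration, not the clientv3 watcher: `runSubstreams` and the counter standing in for cleanup work are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runSubstreams (hypothetical) starts n goroutines and returns only after
// each one has fully exited, including its deferred cleanup.
func runSubstreams(n int) int64 {
	var wg sync.WaitGroup
	var cleaned int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			// Deferred calls run last-in, first-out, so registering Done first
			// makes it the final step of the goroutine's teardown.
			defer wg.Done()
			defer atomic.AddInt64(&cleaned, 1) // stands in for substream cleanup
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&cleaned)
}

func main() {
	fmt.Println(runSubstreams(8))
}
```

Because `Done` is the last deferred call to run, `Wait` cannot return while any goroutine is still partway through its cleanup.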
049ca8746a Merge pull request #7549 from heyitsanthony/namespace-proxy
namespace proxy
2017-03-22 23:26:52 -07:00
85f989ab3d Documentation, op-guide, clientv3: add documentation for namespacing 2017-03-22 16:45:38 -07:00
397a42efbe etcdmain: add prefixing support to grpc proxy
Fixes #6577
2017-03-22 16:45:38 -07:00
f35d7d9608 integration: test namespacing on proxy layer
Hardcode a namespace over the testing grpcproxy.
2017-03-22 16:45:38 -07:00
66d147766f clientv3/integration: simple namespace wrapper tests 2017-03-22 16:45:38 -07:00
facbb64090 Merge pull request #7578 from joshix/patch-1
etcd-2-1-0-bench: Fix an absolute bare link to resource outside of Doc dir
2017-03-22 15:45:58 -07:00
e0de6536c8 etcd-2-1-0-bench: Fix an absolute bare link to resource outside of Documentation dir 2017-03-22 15:27:21 -07:00
1f8c7b33e7 namespace: a wrapper for clientv3 to namespace requests 2017-03-22 14:09:09 -07:00
f9b6066dd6 clientv3: make ops and compares non-opaque and mutable
Fixes #7250
2017-03-22 14:08:59 -07:00
da10d5d057 Merge pull request #7572 from heyitsanthony/fix-restart-member
integration: wait on leader before progress check in TestRestartMember
2017-03-22 14:08:07 -07:00
9f34d3493d integration: wait on leader before progress check in TestRestartMember
In rare cases, the last member may not have the leader by the time the
final cluster progress check tries to open a watch, causing a timeout.
2017-03-22 12:48:31 -07:00
1a75165ed8 Merge pull request #7568 from heyitsanthony/clientv3-redundant-err
clientv3: remove redundant error handling code
2017-03-22 08:55:54 -07:00
dd465d0e40 clientv3: remove redundant error handling code 2017-03-22 01:08:23 -07:00
ff6d6867b0 Merge pull request #7523 from mitake/auth-v3-doc
Documentation: add a doc of v3 auth
2017-03-21 22:46:37 -07:00
5cda22a17d Documentation: add a doc of v3 auth
It is almost the same as Documentation/v2/authentication.md because a
major part of its user interface is shared with the v2 auth. The newly
added doc includes some refinements for the v3 auth.
2017-03-22 11:26:54 +09:00
9e034f4b4b Merge pull request #7564 from gyuho/test
client/integration: use only digits in unix port
2017-03-21 17:40:47 -07:00
22c52b6d2e client/integration: use only digits in unix port
Fix https://github.com/coreos/etcd/issues/7558.

Same as https://github.com/coreos/etcd/issues/6959.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-21 17:10:59 -07:00
d1a9ccb2b9 clientv3: wait for Get goroutine in TestDialCancel 2017-03-21 16:43:39 -07:00
6511171725 Merge pull request #7561 from gyuho/travis
travis: always 'go get -u' in 'before_install'
2017-03-21 14:16:55 -07:00
e127214c6c travis: always 'go get -u' in 'before_install'
See https://github.com/dominikh/go-tools/issues/76#issuecomment-288189194.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-21 12:21:55 -07:00
327e255695 Merge pull request #7546 from gyuho/fix-blocking-etcd-process
*: fix blocking etcd process
2017-03-21 12:04:53 -07:00
7698a2a546 Merge pull request #7553 from xiang90/fix_defrag
backend: add FillPercent option
2017-03-21 11:16:17 -07:00
2d5f890091 integration: ensure 'StopNotify' on publish error
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-21 10:29:00 -07:00
17e2e762b1 etcdmain: handle StopNotify when ErrStopped aborted publish
Fix https://github.com/coreos/etcd/issues/7512.

If a server starts and aborts due to config error,
it is possible to get stuck in ReadyNotify waits.
This adds select case to get notified on stop channel.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-21 10:22:39 -07:00
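The extra select case described in 17e2e762b1 can be sketched as follows. The channel parameters stand in for the server's ready and stop notifications; `waitReady` and `errStopped` are hypothetical names for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errStopped is a hypothetical error for this sketch.
var errStopped = errors.New("server stopped before becoming ready")

// waitReady blocks until either channel is closed. Without the stopped case,
// a server that aborts during startup would leave the caller waiting forever.
func waitReady(ready, stopped <-chan struct{}) error {
	select {
	case <-ready:
		return nil
	case <-stopped:
		return errStopped
	}
}

func main() {
	ready := make(chan struct{})
	stopped := make(chan struct{})
	close(stopped) // simulate a server that aborted on a config error
	fmt.Println(waitReady(ready, stopped))
}
```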
cd70ea33ce Merge pull request #7552 from mitake/ordinary
e2e, etcdserver: fix wrong usages of ordinal
2017-03-21 09:42:27 -07:00
95870a21eb backend: add FillPercent option 2017-03-21 08:06:03 -07:00
5594f695bc e2e, etcdserver: fix wrong usages of ordinal
They must be "ordinary".
2017-03-21 23:50:16 +09:00
b9d91483d0 Dockerfile-release: add nsswitch.conf into image
The file '/etc/nsswitch.conf' is created in order to
take into account '/etc/hosts' entries while resolving
domain names.
2017-03-21 13:08:42 +02:00
004c1388fb Merge pull request #7541 from heyitsanthony/remove-legacy-range
etcdserver: remove legacy range/txn
2017-03-20 19:22:39 -07:00
27550b229a Merge pull request #7545 from gyuho/go1.7-go1.8
*: use 'io.Seek*' for go1.7+
2017-03-20 16:31:21 -07:00
effa6e0767 etcdserver: remove legacy range/txn
Needed for 3.0->3.1. Not needed for 3.1->3.2
2017-03-20 15:17:17 -07:00
aca2abd8fe *: use 'io.Seek*' for go1.7+
For https://github.com/coreos/etcd/issues/6174.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-20 15:15:24 -07:00
3a1368d4d2 Merge pull request #7543 from heyitsanthony/fix-timeseries
*: fix gosimple warning for fmt.Sprintf("%s")
2017-03-20 15:02:13 -07:00
ae7b4ee8ed *: fix gosimple warning for fmt.Sprintf("%s") 2017-03-20 13:26:39 -07:00
53ca03b655 Merge pull request #7539 from heyitsanthony/fix-protobuf-help
ctlv3: have "protobuf" in output help string instead of "proto"
2017-03-20 11:26:13 -07:00
cfdad38f4e Merge pull request #7531 from heyitsanthony/fix-mem-remove-again
e2e: force endpoint for member removal
2017-03-20 09:52:30 -07:00
432c19de61 ctlv3: have "protobuf" in output help string instead of "proto"
Fixes #7538
2017-03-20 09:40:21 -07:00
fba87558a6 Merge pull request #7529 from fanminshi/fix_closing_embedded_error
embed: don't return error when closing on embed etcd
2017-03-17 17:02:22 -07:00
21ac657e67 e2e: force endpoint for member removal
e2e tests use different invocations of etcdctl, so the endpoint used to get
the member list will not necessarily be the same one used to make the remove call.
Instead, select an endpoint that is not being removed, and connect with that.
2017-03-17 16:24:54 -07:00
8a3fee15a3 etcdserver, backend: only warn if exceeding max quota 2017-03-17 15:38:57 -07:00
5e4b008106 *: base initial mmap size on quota size 2017-03-17 15:38:49 -07:00
f292a4c953 embed: don't return error when closing on embed etcd
FIXES #7019
2017-03-17 13:41:05 -07:00
5015480e0c Merge pull request #7517 from jsok/7516-discovery-flags
embed: Delay setting initial cluster
2017-03-16 09:17:42 -07:00
79f4c196b8 Merge pull request #7518 from heyitsanthony/filepath
*: replace path.Join on files with filepath.Join
2017-03-16 08:59:49 -07:00
2f1542c06d *: use filepath.Join for files 2017-03-16 07:46:06 -07:00
1a91ed0e99 embed: Clear default initial cluster
NewConfig() should set initial cluster from name but we should clear it
in the event that another discovery option has been specified.

Fixes #7516
2017-03-16 13:59:06 +11:00
d78b03fb27 Merge pull request #7515 from tessr/master
wal: use path/filepath instead of path
2017-03-15 19:25:08 -07:00
39c733ebe7 wal: use path/filepath instead of path
Use the path/filepath package instead of the path package. The
path package assumes slash-separated paths, which doesn't work
on Windows. But path/filepath manipulates filename paths in a way
that's compatible across OSes.
2017-03-15 17:30:23 -07:00
5856c8bce9 Merge pull request #7513 from gyuho/raft-applied-term
etcdserver: remove possibly compacted entry look-up
2017-03-15 13:35:36 -07:00
80c10e150f etcdserver: remove possibly compacted entry look-up
Fix https://github.com/coreos/etcd/issues/7470.

This patch removes unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger panic
if raft entry at etcdProgress.appliedi got compacted
by subsequent 'MsgSnap' messages--if a follower is
being slow (in this case, network latency spikes), it
could receive subsequent 'MsgSnap' requests from leader.

etcd server-side 'applyAll' routine and raft's Ready
processing routine become asynchronous after raft
entries are persisted. And given that raft Ready routine
takes less time to finish, it is possible that second
'MsgSnap' is being handled, while the slow 'applyAll'
is still processing the first (old) 'MsgSnap'. Then raft
Ready routine can compact the log entries at future
index to 'applyAll'. That is how 'createMergedSnapshotMessage'
tried to look up raft term with outdated etcdProgress.appliedi.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-15 12:46:56 -07:00
902c676cdb Merge pull request #7397 from fanminshi/fix_SetEndpoints
clientv3/balancer: update eps if pinAddr is not included in updateAddrs
2017-03-15 12:16:15 -07:00
a23609efe6 clientv3: update eps if pinAddr is not included in updateAddrs
FIXES #7392
2017-03-15 11:03:25 -07:00
a2a6b693f1 Merge pull request #7511 from heyitsanthony/fix-v3client-embed
v3client: fix doc to use e.Server
2017-03-15 10:28:41 -07:00
dea2516177 v3client: fix doc to use e.Server
Was passing embed.Etcd instead of etcdserver.EtcdServer.
2017-03-15 09:17:17 -07:00
8f83d11724 Merge pull request #7499 from heyitsanthony/fix-etcdctl-add-member-env
ctlv3: ensure synced member list before printing env vars on member add
2017-03-15 08:59:04 -07:00
27960911af Merge pull request #7500 from heyitsanthony/fix-balancer-test-leak
clientv3: synchronize on goroutines in TestBalancerDoNotBlockOnClose
2017-03-15 08:58:03 -07:00
7a6b61cd6f Merge pull request #7504 from heyitsanthony/fix-watch-wait
clientv3: close open watch channel if substream is closing on reconnect
2017-03-15 08:57:14 -07:00
df839f3b7f Merge pull request #7497 from xiang90/fix_candidate
etcdserver: candidate should wait for applying all configuration changes
2017-03-14 20:10:02 -07:00
3e86779ad5 ctlv3: ensure synced member list before printing env vars on member add
In cases of multiple endpoints, it's possible member add would get its
member list from a member that has not yet recognized the membership
update. Instead, confirm that the member list response is from the
member that acked the member add or from a member that has synced
with the cluster following the member add.

Fixes #7498
2017-03-14 20:01:44 -07:00
b36734f1d3 clientv3: synchronize on goroutines in TestBalancerDoNotBlockOnClose
Was leaking dialers.
2017-03-14 19:53:33 -07:00
18a813a9fe Merge pull request #7496 from heyitsanthony/v3client-doc
v3client: add example and godoc New
2017-03-14 19:50:01 -07:00
a087325452 clientv3: close open watch channel if substream is closing on reconnect
If substream is closing but outc is still open while reconnecting, then outc
would only be closed once the watch client would connect or once the watch
client is closed. This was leading to deadlocks in the proxy tests. Instead,
close immediately if the context is canceled.

Fixes #7503
2017-03-14 17:25:18 -07:00
7f0733cf46 etcdserver: candidate should wait for applying all configuration changes 2017-03-14 17:20:20 -07:00
eed4a3f035 Merge pull request #7502 from gyuho/scripts
test: mask go1.8 gosimple warnings
2017-03-14 17:00:01 -07:00
a9588952a0 test: mask go1.8 gosimple warnings 2017-03-14 15:10:32 -07:00
ace3a217b0 Merge pull request #7483 from fanminshi/add_tests_to_mutex
integration: add TestMutexWaitsOnCurrentHolder test
2017-03-14 13:01:47 -07:00
276039e835 integration: add TestMutexWaitsOnCurrentHolder test
TestMutexWaitsOnCurrentHolder ensures a series of waiters
obtain lock only after the previous lock requests are gone.
2017-03-14 11:00:07 -07:00
01d1a579bc v3client: add example and godoc New 2017-03-14 10:50:41 -07:00
781196fa87 Merge pull request #7495 from heyitsanthony/more-cov
test: add coverage for more packages
2017-03-14 09:31:01 -07:00
e3218e2dd1 test: add coverage for more packages
Was only getting coverage for packages with test files. Instead, include
packages that don't have test files as well.
2017-03-14 01:08:07 -07:00
1a6be700d8 Merge pull request #7444 from heyitsanthony/lock-service
grpc lock service
2017-03-14 00:01:34 -07:00
148c923c72 Merge pull request #7492 from heyitsanthony/simpletokenttl-deadlock
auth: get rid of deadlocking channel passing scheme in simpleTokenTTL
2017-03-14 14:01:23 +09:00
4409932132 auth: test concurrent authentication 2017-03-13 21:11:35 -07:00
1b1fabef8f auth: get rid of deadlocking channel passing scheme in simpleTokenTTL
Just use the mutex instead.

Fixes #7471
2017-03-13 21:11:35 -07:00
3a61fe596b Merge pull request #7423 from purpleidea/feat/clientv3util-examples
clientv3util: Add KeyExists and KeyMissing examples
2017-03-13 17:26:57 -07:00
94d5936180 Update example_key_test.go 2017-03-13 16:54:26 -07:00
7b541f9003 Merge pull request #7491 from heyitsanthony/learning-api
doc/learning: complete the api guide
2017-03-13 15:39:04 -07:00
300323fa50 integration: test grpc lock service 2017-03-13 15:23:26 -07:00
ad1a790116 embed: serve lock api 2017-03-13 15:23:26 -07:00
c737bf3d2a scripts: generate lock service rpc stubs 2017-03-13 15:23:26 -07:00
47cd9d0277 v3lock: server-side api for locking 2017-03-13 15:23:26 -07:00
763a37d3f1 v3client: a bridge between an etcdserver and a clientv3 2017-03-13 15:23:26 -07:00
d51c8bb640 concurrency: support returning response header for mutex 2017-03-13 15:23:26 -07:00
a2cdd908dc clientv3: permit creating client without grpc connection
For creating client from etcdserver.
2017-03-13 15:23:26 -07:00
b025cdd097 adapter, integration: split out grpc adapters from grpcproxy package
Break cyclic dependency:
clientv3/naming <-> integration <-> v3client <-> grpcproxy <-> clientv3/naming
2017-03-13 15:23:26 -07:00
90b5f3587d doc/learning: complete the api guide
Fixes #7378
2017-03-13 14:34:12 -07:00
5193965005 Merge pull request #7481 from heyitsanthony/testafter-clientv3
clientv3: use CheckAfterTest after terminating cluster
2017-03-13 13:25:52 -07:00
34fca0caa9 Merge pull request #7476 from gyuho/NEWS
NEWS: update v3.1.3
2017-03-13 13:17:52 -07:00
312ac5824f Merge pull request #7486 from oberstet/doc-integr-add-txaio-etcd
add txaio-etcd to integrations.md
2017-03-13 11:42:08 -07:00
d051b3b4e4 Documentation: add txaio-etcd to integrations 2017-03-13 18:24:46 +01:00
76aa7f6935 Merge pull request #7479 from heyitsanthony/auth-admin-nilcheck
auth: nil check AuthInfo when checking admin permissions
2017-03-11 23:30:33 -08:00
bf0aa68f89 Merge pull request #7480 from raoofm/patch-10
op-guide: update gateway routing policy
2017-03-11 23:29:59 -08:00
fbcc6db64c Merge pull request #7482 from gyuho/lll
discovery: fix print format
2017-03-10 17:05:40 -08:00
60bdc47fa0 discovery: fix print format
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-10 15:04:46 -08:00
38f27599b9 op-guide: update gateway routing policy
Update from single available endpoint to round robin.
2017-03-10 17:43:10 -05:00
593489d454 clientv3: use CheckAfterTest after terminating cluster
AfterTest() has a delay that waits for runtime goroutines to exit;
CheckLeakedGoroutine does not. Since the test runner manages the
test cluster for examples, there is no delay between terminating
the cluster and checking for leaked goroutines. Instead, apply
AfterTest checking before running CheckLeakedGoroutine to let runtime
http goroutines finish.
2017-03-10 12:23:46 -08:00
eb6a47f87e testutil: add CheckAfterTest for calling AfterTest without a testing.T 2017-03-10 12:18:24 -08:00
52bc997e0b auth: nil check AuthInfo when checking admin permissions
If the context does not include auth information, get authinfo will
return a nil auth info and a nil error. This is then passed to
IsAdminPermitted, which would dereference the nil auth info.
2017-03-10 11:07:11 -08:00
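A rough sketch of the guard described in 52bc997e0b above. The AuthInfo type, the error values, and the "root" check are all hypothetical stand-ins for illustration; the real auth package has its own types and rules.

```go
package main

import (
	"errors"
	"fmt"
)

// AuthInfo is a hypothetical stand-in for the auth information that may be
// extracted from a request context.
type AuthInfo struct {
	Username string
}

var (
	errUserEmpty        = errors.New("auth: user name is empty")
	errPermissionDenied = errors.New("auth: permission denied")
)

// isAdminPermitted rejects a nil AuthInfo up front, because a lookup that
// finds no auth information can return (nil, nil).
func isAdminPermitted(ai *AuthInfo) error {
	if ai == nil {
		return errUserEmpty
	}
	if ai.Username != "root" {
		return errPermissionDenied
	}
	return nil
}

func main() {
	fmt.Println(isAdminPermitted(nil))
	fmt.Println(isAdminPermitted(&AuthInfo{Username: "root"}))
}
```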
d0d3c768d9 Merge pull request #7478 from hubt/patch-2
doc: add branch.io use case into production users
2017-03-10 10:37:59 -08:00
9c9156b478 doc: add branch.io use case into production users 2017-03-10 10:01:05 -08:00
b744cecd20 NEWS: update v3.1.3
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-10 09:19:12 -08:00
0d851e49e3 Merge pull request #7475 from xiang90/baidu
doc: fix title size
2017-03-10 09:08:21 -08:00
debeccd605 doc: fix title size 2017-03-10 09:06:25 -08:00
c848ee9d86 Merge pull request #7473 from xiang90/baidu
doc: add Baidu Waimai
2017-03-10 09:05:21 -08:00
0a692b0524 Merge pull request #7443 from fanminshi/fix_balancer_deadlock
clientv3: serialize updating notifych in balancer
2017-03-10 07:48:47 -08:00
911ae60edf doc: add Baidu Waimai 2017-03-10 07:29:21 -08:00
0c38f1ff8d Merge pull request #7469 from gyuho/manual
Documentation: add huawei product user
2017-03-09 13:09:04 -08:00
0a9e2fe1f2 Documentation: add huawei product user 2017-03-09 13:06:20 -08:00
9afe4e87fd Merge pull request #7453 from allencloud/use-case-daocloud-io
add production user daocloud
2017-03-09 12:24:15 -08:00
310641630e clientv3: send first down() func after receiving first notified addr
This ensures the ordering of down and up calls.
2017-03-09 12:20:36 -08:00
8baaa06cce clientv3: serialize updating notifych in balancer
FIXES #7283
2017-03-09 12:20:28 -08:00
5351953425 Merge pull request #7467 from gyuho/sd-notify
etcdmain: SdNotify when gateway, grpc-proxy are ready
2017-03-09 11:23:58 -08:00
01dd60c0f7 etcdmain: SdNotify when gateway, grpc-proxy are ready
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-09 10:10:21 -08:00
ad1d48b73d Merge pull request #7014 from gyuho/auto-sync-grpc-proxy
*: register grpc-proxy members
2017-03-09 09:35:59 -08:00
c8ea343a76 Merge pull request #7463 from heyitsanthony/cov-buildi
test: install packages when building coverage tests
2017-03-09 09:15:22 -08:00
4d69d9663b Documentation/op-guide: document grpcproxy sync
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-09 02:47:24 -08:00
095407df58 etcdmain: add register,resolver flags
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-09 02:47:12 -08:00
f862b47e92 grpcproxy: configure register to Cluster API
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-09 02:40:13 -08:00
5f4412996d clientv3: define error type for closed watcher 2017-03-09 02:29:54 -08:00
ddcf14102e Merge pull request #7105 from heyitsanthony/mvcc-txn
mvcc: txns and r/w views
2017-03-09 00:25:16 -08:00
889dd1b22f clientv3util: Add KeyExists and KeyMissing examples 2017-03-09 02:34:30 -05:00
dbf654cf77 test: install packages when building coverage tests
Lots of repeated compilation. Cache results with go build -i.
2017-03-08 22:24:16 -08:00
8bc6cea90c doc: Add daocloud.io to production users
Signed-off-by: allencloud <allen.sun@daocloud.io>
2017-03-09 13:30:53 +08:00
d1dcc828c8 etcdctl: support mvcc txn 2017-03-08 20:54:15 -08:00
0ed3c83e49 benchmark: support mvcc txn 2017-03-08 20:54:15 -08:00
58da8b17ee etcdserver: support mvcc txn 2017-03-08 20:54:15 -08:00
f0c184b3a2 lease: support mvcc txn 2017-03-08 20:54:15 -08:00
33acbb694b mvcc: txns and r/w views
Clean-up of the mvcc interfaces to use txn interfaces instead of an id.

Adds support for concurrent read-only mvcc transactions.

Fixes #7083
2017-03-08 20:52:59 -08:00
8d438c2939 backend: readtx
ReadTxs are designed for read-only accesses to the backend using a
read-only boltDB transaction. Since BatchTx's are long-running
transactions, all writes to BatchTx will writeback to ReadTx, overlaying
the base read-only transaction.
2017-03-08 20:52:59 -08:00
39dc5315ed Merge pull request #7461 from heyitsanthony/fix-member-remove
e2e: don't remove member used to connect to etcd cluster
2017-03-08 20:21:57 -08:00
cd7d68fed0 Merge pull request #7458 from reterVision/patch-1
Documentation: add Grab etcd use case
2017-03-08 19:59:31 -08:00
e4f40f6554 Merge pull request #7462 from gyuho/typo
*: fix minor typos
2017-03-08 16:41:18 -08:00
beb58c434c *: fix minor typos
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-08 16:21:52 -08:00
7f94afdb8c Documentation: add Grab etcd use case 2017-03-08 16:12:32 -08:00
13e36f963d e2e: don't remove member used to connect to etcd cluster
Fixes #7204
2017-03-08 15:58:45 -08:00
e016015196 Merge pull request #7455 from gyuho/release-doc
Documentation: sign source zip files
2017-03-08 15:57:30 -08:00
1bcbd82c8b Merge pull request #7457 from gyuho/lease-guard
lease: guard 'Lease.itemSet' from concurrent writes
2017-03-08 14:48:00 -08:00
7a25257fb2 clientv3: close balancer to avoid goroutine leak in balancer_test.go 2017-03-08 13:37:18 -08:00
9713b1f3ef Merge pull request #7454 from bdudelsack/gateway-dns-discovery
gateway: fix the dns discovery method
2017-03-08 13:07:59 -08:00
234c4b1685 Documentation: sign source zip files
For https://github.com/coreos/etcd/issues/7449

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-08 11:53:52 -08:00
6f0723f23f lease: guard 'Lease.itemSet' from concurrent writes
Fix https://github.com/coreos/etcd/issues/7448.

Affected if etcd builds with Go 1.8+.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-08 11:01:42 -08:00
0d48fc5511 gateway: fix the dns discovery method
strip the scheme from the endpoints to have a clean hostname for TCP proxy

Fixes #7452
2017-03-08 19:11:55 +01:00
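As an illustration of what 0d48fc5511 above describes, a scheme can be stripped with the standard net/url package. This is only a sketch under the assumption that every endpoint carries a scheme; the stripSchemes helper and the example host names are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// stripSchemes turns endpoints such as "http://host:2379" into plain
// "host:2379" addresses suitable for a TCP dial.
func stripSchemes(endpoints []string) ([]string, error) {
	hosts := make([]string, 0, len(endpoints))
	for _, ep := range endpoints {
		u, err := url.Parse(ep)
		if err != nil {
			return nil, err
		}
		if u.Host == "" {
			return nil, fmt.Errorf("endpoint %q has no scheme://host part", ep)
		}
		hosts = append(hosts, u.Host)
	}
	return hosts, nil
}

func main() {
	hosts, err := stripSchemes([]string{
		"http://etcd1.example.com:2379",
		"https://10.0.0.2:2379",
	})
	fmt.Println(hosts, err)
}
```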
7f43fdde74 Merge pull request #7438 from meitu/master
Add use case in Meitu Inc.
2017-03-08 06:59:30 -08:00
3fa3d7dac6 doc: Add use case in Meitu Inc. 2017-03-08 14:27:53 +08:00
320768b2e9 Merge pull request #7435 from gnawux/use_case_hyper_sh
Add hyper.sh to production users
2017-03-07 19:14:30 -08:00
4a7b27921d doc: Add hyper.sh to production users 2017-03-08 10:45:56 +08:00
3f515e1849 Merge pull request #7441 from gyuho/warning
Documentation: warn membership change while migration
2017-03-07 11:38:28 -08:00
43eca30a08 Documentation: warn membership change while migration
Fix https://github.com/coreos/etcd/issues/7429.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-07 11:27:36 -08:00
1a3a345468 Merge pull request #7437 from hustcat/tx-case
Add Tencent Games user case
2017-03-07 09:21:19 -08:00
3eb2fdbd99 Merge pull request #6082 from mitake/auth-v3-jwt
*: support jwt token in v3 auth API
2017-03-07 16:21:57 +09:00
df657d4690 Documentation: Add Tencent Games to production users 2017-03-07 15:17:10 +08:00
7907936066 Merge pull request #7433 from nekto0n/add_production_user
Documentation: add production user
2017-03-06 22:38:01 -08:00
1fc0803840 doc: update use case of qiniu 2017-03-06 22:11:17 -08:00
382ffe679d Documentation: add production user 2017-03-07 11:10:21 +05:00
831abf82b1 doc: add usecase of qiniu 2017-03-06 21:58:02 -08:00
ed90481510 Documentation: add qingcloud to production user 2017-03-06 19:53:47 -08:00
f8a290e7ca *: support jwt token in v3 auth API
This commit adds jwt token support in v3 auth API.

Remaining major ToDos:
- Currently token type isn't hidden from etcdserver. In the near
  future the information should be completely invisible from
  etcdserver package.
- Configurable expiration of token. Currently tokens can be valid
  until keys are changed.

How to use:
1. generate keys for signing and verifying jwt tokens:
 $ openssl genrsa -out app.rsa 1024
 $ openssl rsa -in app.rsa -pubout > app.rsa.pub
2. add command line options to etcd like below:
--auth-token-type jwt \
--auth-jwt-pub-key app.rsa.pub --auth-jwt-priv-key app.rsa \
--auth-jwt-sign-method RS512
3. launch etcd cluster

Below is a performance comparison of serializable read w/ and w/o jwt
token. All three etcd nodes run on a single machine. Signing
method is RS512 and key length is 1024 bit. As the results show, jwt
based token introduces a performance overhead but it would be
acceptable for a case that requires authentication.

w/o jwt token auth (no auth):

Summary:
  Total:        1.6172 secs.
  Slowest:      0.0125 secs.
  Fastest:      0.0001 secs.
  Average:      0.0002 secs.
  Stddev:       0.0004 secs.
  Requests/sec: 6183.5877

Response time histogram:
  0.000 [1]     |
  0.001 [9982]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.003 [1]     |
  0.004 [1]     |
  0.005 [0]     |
  0.006 [0]     |
  0.008 [6]     |
  0.009 [0]     |
  0.010 [1]     |
  0.011 [5]     |
  0.013 [3]     |

Latency distribution:
  10% in 0.0001 secs.
  25% in 0.0001 secs.
  50% in 0.0001 secs.
  75% in 0.0001 secs.
  90% in 0.0002 secs.
  95% in 0.0002 secs.
  99% in 0.0003 secs.

w/ jwt token auth:

Summary:
  Total:        2.5364 secs.
  Slowest:      0.0182 secs.
  Fastest:      0.0002 secs.
  Average:      0.0003 secs.
  Stddev:       0.0005 secs.
  Requests/sec: 3942.5185

Response time histogram:
  0.000 [1]     |
  0.002 [9975]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.004 [0]     |
  0.006 [1]     |
  0.007 [11]    |
  0.009 [2]     |
  0.011 [4]     |
  0.013 [5]     |
  0.015 [0]     |
  0.016 [0]     |
  0.018 [1]     |

Latency distribution:
  10% in 0.0002 secs.
  25% in 0.0002 secs.
  50% in 0.0002 secs.
  75% in 0.0002 secs.
  90% in 0.0003 secs.
  95% in 0.0003 secs.
  99% in 0.0004 secs.
2017-03-06 19:46:03 -08:00
a7a93f54a4 vendor: import jwt-go for auth v3 2017-03-06 19:46:03 -08:00
7b1ccca373 Merge pull request #7428 from siddontang/patch-1
Documentation: add PD to production users
2017-03-06 19:31:17 -08:00
a4a84184e8 Documentation: add PD to production users 2017-03-07 09:04:52 +08:00
e5d94a296f Merge pull request #7347 from gyuho/static-check
*: add 'staticcheck' to 'test'
2017-03-06 16:20:25 -08:00
3d75395875 *: remove never-used vars, minor lint fix
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:59:12 -08:00
bd6e6c11f8 test: run 'staticcheck'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:59:12 -08:00
79de3be6a7 Merge pull request #7430 from heyitsanthony/lock-more-deps
vendor: lock down some soft dependencies
2017-03-06 14:54:45 -08:00
db560574dd Merge pull request #7416 from heyitsanthony/test-eschew-you
test: eschew you
2017-03-06 13:30:57 -08:00
317f3571ff Merge pull request #7420 from heyitsanthony/dial-timeout-report
clientv3: pass back dial error on dial timeout
2017-03-06 12:58:18 -08:00
3f187a103b vendor: lock down some soft dependencies
Locks down:
* go-runewidth (via tablewriter)
* golang.org/x/sys
* prometheus/{common,procfs} (via prometheus-client)
2017-03-06 12:03:45 -08:00
c8a2c7f64f *: eschew you from documentation
Removed line wrapping in affected files as well.
2017-03-06 11:40:46 -08:00
270dc9427b clientv3: pass back dial error on dial timeout
Fixes #7419
2017-03-06 09:33:10 -08:00
4e1ce81e17 test: eschew you
Per https://github.com/coreos/docs/blob/master/STYLE.md#eschew-you
2017-03-06 09:16:03 -08:00
4e2fe050f5 Merge pull request #7425 from mitake/gosimple
contrib: suppress gosimple errors of raftexample
2017-03-06 09:09:49 -08:00
b6eedbacf9 contrib: suppress gosimple errors of raftexample
Travis claimed errors of gosimple like below
(https://travis-ci.org/coreos/etcd/jobs/208098545):
gosimple checking failed:
contrib/raftexample/raftexample_test.go:78:6: should write erri := <-clus.errorC[i] instead of erri, _ := <-clus.errorC[i]
contrib/raftexample/raftexample_test.go:114:10: should write err := <-eC instead of err, _ := <-eC

This commit fixes the errors.
2017-03-06 16:17:22 +09:00
5039c7b4ab Merge pull request #7417 from purpleidea/feat/key-exists
clientv3: Add KeyExists and KeyNotExists Cmp helpers
2017-03-05 17:50:34 -08:00
8a57b90e7f Merge pull request #7422 from tmjd/docs_fix_migrate_example
etcdctl: Fix migrate example in README.md
2017-03-04 18:17:01 -08:00
9ba658f59b etcdctl: Fix migrate example in README.md 2017-03-04 19:42:27 -06:00
b68416f735 Merge pull request #7394 from gyuho/fix-advertise-client-url-host
*: use machine default host only for default value, 0.0.0.0
2017-03-03 16:35:31 -08:00
71937151d0 clientv3: Add KeyExists and KeyNotExists Cmp helpers
This is quite useful for transactions.
2017-03-03 18:45:10 -05:00
4aa68e0231 etcdmain: log machine default host after update check
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-03 14:25:39 -08:00
b7ee8f4967 embed: use machine default host only for default value, 0.0.0.0
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-03 14:25:34 -08:00
2831b9dcfd Merge pull request #7415 from gyuho/etcd-tester-lease-check-with-ttl
etcd-tester: check expired lease with -1 TTL
2017-03-03 12:49:58 -08:00
fb81fb44fa etcd-tester: check expired lease with -1 TTL
Following the change at 2ca1823a96

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-03 11:41:53 -08:00
e16db3347a Merge pull request #7413 from philips/update-etcd-integrations-and-users
production-users: add Kubernetes
2017-03-03 14:12:39 -05:00
e52f41a6d1 production-users: add Kubernetes 2017-03-03 13:09:36 -05:00
bd6f1c9e48 libraries-and-tools: rename to integrations
I want to create a more consistent naming system across the repos. Some
of our projects won't have libraries or tools (like Clair) but others
have integrated their software with Clair in various ways.

So, use a generic term: integrations.
2017-03-03 13:09:36 -05:00
85c22f4562 Merge pull request #7408 from heyitsanthony/v3-capable
api: default to V3 capability
2017-03-02 16:53:01 -08:00
42c98123b3 Merge pull request #7411 from heyitsanthony/mirror-batch
etcdctl: correctly batch revisions in make-mirror
2017-03-02 16:09:50 -08:00
ad45958841 etcdctl: correctly batch revisions in make-mirror
Fixes #7410
2017-03-02 14:30:24 -08:00
1753623f87 integration: don't set v3 capability since now default 2017-03-02 14:02:09 -08:00
5da5b834e5 api: default to V3 capability
Fixes #7154
2017-03-02 14:02:09 -08:00
9cc013fec0 Merge pull request #7409 from heyitsanthony/doc-ionice
Documentation: suggest ionice for disk tuning
2017-03-02 14:00:05 -08:00
1e252f1feb Documentation: suggest ionice for disk tuning
Also cleaned up tuning.md newlines to conform with style.
2017-03-02 13:58:07 -08:00
763aef87b9 Merge pull request #7405 from heyitsanthony/fast-gosimple
test: run unused and gosimple over all packages at once
2017-03-02 10:40:26 -08:00
6092e1ad24 Merge pull request #7403 from gyuho/do
Documentation/op-guide: use exact certs dir for Container Linux
2017-03-02 10:33:42 -08:00
ae0c4b4c87 Documentation/op-guide: use exact certs dir for Container Linux
Use the one that works in Container Linux

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-02 10:20:59 -08:00
3296c15a32 test: run unused and gosimple over all packages at once
fmt pass went from ~20 CPU minutes to ~1 CPU minute.

Fixes #7377
2017-03-02 10:17:46 -08:00
5cdb557560 Merge pull request #7390 from fanminshi/put_ctl_warning
etcdctl: show warning if ETCDCTL_API is not set
2017-03-02 10:17:19 -08:00
db91277216 Merge pull request #7400 from heyitsanthony/fix-example-ctx
clientv3: bump example requestTimeout for slow CI
2017-03-01 21:57:34 -08:00
2eb8243d94 Merge pull request #7402 from heyitsanthony/fix-watchconnerr
grpcproxy: return closing error when stream is canceled from conn close
2017-03-01 21:56:36 -08:00
134d1cb4e0 Merge pull request #7404 from xiang90/nt
raft: make TestNodeTick reliable
2017-03-01 20:02:25 -08:00
931cf3454a raft: make TestNodeTick reliable
TestNodeTick relies on an unreliable func `waitForSchedule` when running
with GOMAXPROCS > 1. This commit changes the test to make sure we stop
the node after it drains the tick chan. The test should be reliable now.
2017-03-01 17:35:58 -08:00
010cc287bb Merge pull request #7401 from gyuho/docker-guide
op-guide: add notes on mounting certs directory
2017-03-01 16:50:24 -08:00
28e9ba365a grpcproxy: return closing error when stream is canceled from conn close
Fixes #6630
2017-03-01 16:46:13 -08:00
d111c8fe3b op-guide: add notes on mounting certs directory
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-01 16:45:03 -08:00
cf547aa403 clientv3: bump example requestTimeout for slow CI
Fixes #7398
2017-03-01 14:37:40 -08:00
f76ca01aed etcdctl: show warning if ETCDCTL_API is not set in v2 --help
FIXES #7385
2017-03-01 11:29:59 -08:00
d3aebbf0ce Merge pull request #7387 from fanminshi/rework_coverage_ctl
e2e: rework coverage ctl
2017-03-01 10:01:13 -08:00
edd298f85a test: build test binary for etcdctl 2017-02-28 17:08:24 -08:00
1f413cff64 e2e: add etcdctl to e2e test 2017-02-28 17:08:17 -08:00
aca4ea2a29 etcdctl: modify etcdctl v2 and v3 for code coverage 2017-02-28 14:50:27 -08:00
17ae440991 Merge pull request #7379 from fanminshi/fix_TestRestartMember
integration: ensure leader is up in waitLeader() and clusterMustProgress()
2017-02-28 10:57:49 -08:00
324d2383b8 integration: ensure leader is up in waitLeader() and clusterMustProgress()
The issue is caused by leader loss even after waitLeader() returns
which can happen if the test machine is flaky which triggers a leader loss
or the killed node is the leader since waitLeader() only scans followers in
TestRestartMember() and they can have the same older leader.
In those cases, clusterMustProgress() proceeds with no leader which triggers
the no leader error.

To get around that, use linearizable get in waitLeader() to ensure leader is up
and retries on kapi.create() in clusterMustProgress() to ensure it proceeds with
a leader.

FIX #7258
2017-02-28 09:17:03 -08:00
1a9cd7bf36 Merge pull request #7294 from mkumatag/ppc64le_docker
Build docker image for ppc64le
2017-02-28 09:04:03 -08:00
8f744fe46b Merge pull request #7384 from heyitsanthony/debug-grpc-tracing
etcdmain: enable grpc tracing with --debug
2017-02-27 16:05:04 -08:00
633cfbe241 script: Build docker image for ppc64le 2017-02-27 19:04:32 -05:00
bbd8f4e6f6 Merge pull request #7386 from heyitsanthony/doc-lease-coalesce
Documentation: add documentation for grpc lease stream coalescing
2017-02-27 15:30:45 -08:00
22f0386683 Documentation: add documentation for grpc lease stream coalescing 2017-02-27 14:45:01 -08:00
c4f1e64de7 embed: enable debug endpoint if Debug is set and add net.trace events
/debug/ was only being enabled for Pprof.
2017-02-27 11:34:58 -08:00
298d58841e etcdmain: enable grpc tracing with --debug 2017-02-27 11:18:13 -08:00
01557ebc8f Merge pull request #7376 from heyitsanthony/fix-example-metrics-port
clientv3: use any port for metrics example
2017-02-24 16:24:52 -08:00
c231950cdb clientv3: use any port for metrics example
Was getting bind conflicts causing failures on semaphore.
2017-02-24 14:33:08 -08:00
15d8ca7726 Merge pull request #7375 from heyitsanthony/fix-e2e-cov
e2e: fix -tags cov builds
2017-02-24 13:01:11 -08:00
2ec8572a8c e2e: fix -tags cov builds
Wasn't compiling.
2017-02-24 09:47:31 -08:00
833aa518d8 Merge pull request #7372 from gyuho/updates
*: miscellaneous updates on release 3.2 cycle
2017-02-23 17:42:35 -08:00
9fbdd0a84a Merge pull request #7373 from gyuho/news
NEWS: add v3.1.2 release notes
2017-02-23 17:19:31 -08:00
3eaf2f6558 *: remove trailing space, upgrade test on v3.1 2017-02-23 16:19:24 -08:00
119d0520c6 NEWS: add v3.1.2 release notes 2017-02-23 15:02:36 -08:00
3f756d502b travis: use Go 1.8 in master branch 2017-02-23 14:38:47 -08:00
9d74eb5c60 MAINTAINERS: add Fanmin 2017-02-23 14:38:14 -08:00
86c9bf5c3f Merge pull request #7371 from gyuho/grpc-proxy-register
grpcproxy: add 'register' address
2017-02-22 17:30:38 -08:00
72a531e8b2 grpcproxy: add 'register' address
For https://github.com/coreos/etcd/issues/6902.
2017-02-22 16:47:48 -08:00
5df56fa615 Merge pull request #7366 from heyitsanthony/fix-watch-stream-counting
integration: permit background watch streams in TestWatchCancelOnServer
2017-02-22 10:36:36 -08:00
df3bb333ca Merge pull request #7368 from heyitsanthony/fix-netutil-ipv4
netutil: use ipv4 host by default
2017-02-22 09:57:56 -08:00
c3a678be75 integration: permit background watch streams in TestWatchCancelOnServer
Fixes #7272
2017-02-22 09:54:08 -08:00
c0c4c7cb76 Merge pull request #7364 from gyuho/auth-revi
auth: keep old revision in 'NewAuthStore'
2017-02-22 09:40:01 -08:00
f97a077257 netutil: use ipv4 host by default
Was non-deterministic.
2017-02-21 20:11:35 -08:00
f2e9936de5 integration: add 'TestV3HashRestart' 2017-02-21 16:20:56 -08:00
6431382a75 auth: keep old revision in 'NewAuthStore'
When there are no changes yet (right after auth
store initialization), we should commit old revision.

Fix https://github.com/coreos/etcd/issues/7359.
2017-02-21 16:18:47 -08:00
c90c757a56 Merge pull request #7355 from hhkbp2/fix-test-case-typo
raft: revise test case and fix typo
2017-02-21 16:15:22 -08:00
dd16463ad4 Merge pull request #7363 from heyitsanthony/fix-short-lease-ttl
clientv3: do not set next keepalive time <= now+TTL
2017-02-21 15:36:37 -08:00
25403970f5 Merge pull request #7361 from heyitsanthony/fix-gateway-goroutine
tcpproxy: don't use range variable in reactivate goroutine
2017-02-21 13:26:53 -08:00
12d3e4e473 integration: test keepalives for short TTLs 2017-02-21 13:15:45 -08:00
3c306cdb3e clientv3: do not set next keepalive time <= now+TTL 2017-02-21 13:15:45 -08:00
8b097f279d tcpproxy: don't use range variable in reactivate goroutine
Ends up trying to reactivate only the last endpoint.
2017-02-21 12:39:49 -08:00
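The range-variable problem described in 8b097f279d can be illustrated with a minimal sketch. This is not the tcpproxy code: `reactivateAll` and the endpoint strings are hypothetical names used only for illustration. The sketch shows the usual fix, which is to pass the loop variable to the goroutine as an argument so every goroutine works on its own copy.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// reactivateAll (hypothetical helper) launches one goroutine per endpoint
// and returns the endpoints the goroutines actually saw, sorted so the
// result is deterministic.
func reactivateAll(endpoints []string) []string {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen []string
	)
	for _, ep := range endpoints {
		wg.Add(1)
		// Pass ep as an argument so each goroutine gets its own copy.
		// Before Go 1.22, a closure that captured ep directly shared one
		// variable across all iterations and could observe only the
		// last endpoint.
		go func(ep string) {
			defer wg.Done()
			mu.Lock()
			seen = append(seen, ep)
			mu.Unlock()
		}(ep)
	}
	wg.Wait()
	sort.Strings(seen)
	return seen
}

func main() {
	fmt.Println(reactivateAll([]string{"a:2379", "b:2379", "c:2379"}))
}
```

Passing the value as a parameter behaves the same on every Go version, which is why it is a common choice over relying on per-iteration loop variables.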
0c0fbbd7c5 Merge pull request #7342 from heyitsanthony/client-version
clientv3: version checking
2017-02-21 11:46:48 -08:00
3c20bdd004 Merge pull request #7345 from heyitsanthony/fix-stream-err
grpcproxy: only return ctx error in chan stream if recvc is empty
2017-02-21 11:08:49 -08:00
86cb9f2490 Merge pull request #7358 from gyuho/manual
clientv3: fix typo in README
2017-02-21 11:03:49 -08:00
29a6fd65ad grpcproxy: only return ctx error in chan stream if recvc is empty
Since select{} won't prioritize, ctx.Done() can sometimes override
a pending message on recvc. Loop if recvc has messages instead.

Fixes #7340
2017-02-21 10:53:58 -08:00
56b4e6b71f clientv3: fix typo in README
Fix https://github.com/coreos/etcd/issues/7337
2017-02-21 10:33:17 -08:00
2ac44eab81 Merge pull request #7341 from gyuho/host
op-guide: use host volume in Docker command
2017-02-21 10:31:50 -08:00
6f193ea1df op-guide: use host volume in Docker command 2017-02-21 10:28:29 -08:00
4e114d3549 Merge pull request #7351 from davecheney/fixedbugs/7350
pkg/transport: remove dependency on pkg/fileutils
2017-02-21 09:21:53 -08:00
bc6bebe7b0 raft: revise test case and fix typo 2017-02-21 15:23:42 +08:00
9b84127739 pkg/transport: remove dependency on pkg/fileutils
4a0f922 changed SelfCert to use a helper from pkg/fileutils which
introduced a transitive dependency on coreos/pkg/capnslog. This means
anyone who imports pkg/transport to use TLS with the clientv3 library
has the default stdlib logger hijacked by capnslog.

This PR reverts 4a0f922. There are no tests because 4a0f922 contained no
test and was not attached to a PR.

Fixes #7350
2017-02-20 12:32:04 +11:00
2533c2a50c Merge pull request #7254 from fanminshi/rework_coverage_e2e
e2e: add code coverage to e2e
2017-02-17 15:51:47 -08:00
f203a61469 e2e: unshadow err and remove bogus err checking in spawnWithExpects() 2017-02-17 14:47:24 -08:00
07129a6370 *: add and expose StopSignal field in ExpectProcess
Adding and exposing StopSignal in ExpectProcess lets the user
define which signal to send on ExpectProcess.close().

coverage testing code sets StopSignal to SIGTERM allowing
the test binary to shutdown gracefully so that it can generate
a coverage report.
2017-02-17 14:47:06 -08:00
78fbe669ad Merge pull request #7332 from hhkbp2/fix-read-index
raft: fix read index request for #7331
2017-02-17 14:27:42 -08:00
b5be18a744 test: add e2e to coverage test 2017-02-17 14:15:26 -08:00
51435df179 integration: test RejectOldCluster 2017-02-16 21:33:14 -08:00
4d2aa80ecf clientv3: add cluster version checking 2017-02-16 18:14:14 -08:00
c9452c6ad4 clientv3: let user provide a client context through Config 2017-02-16 18:14:14 -08:00
9342647e0c raft: fix read index request for #7331 2017-02-17 09:45:41 +08:00
a5cf7fdc87 Merge pull request #7221 from fanminshi/grpcproxy_support_lease_coalescing
grpcproxy: support lease coalescing
2017-02-16 13:42:49 -08:00
507bd2ab4b Merge pull request #7339 from xiang90/fix_l
clientv3: fix lease keepalive duration
2017-02-16 13:35:27 -08:00
4fb8d30f0a clientv3: fix lease keepalive duration 2017-02-16 12:04:07 -08:00
5d3597a5f2 Merge pull request #7338 from xiang90/fix_l
clientv3: fix lease keepalive duration
2017-02-16 11:58:10 -08:00
65b59f4423 grpcproxy: incorporate lease proxy into existing proxy framework 2017-02-16 11:50:59 -08:00
ba52bd07ba grpcproxy: add lease coalescing support 2017-02-16 11:50:50 -08:00
05b82f2022 grpcproxy: refactor chan stream out of watch_client_adapter 2017-02-16 11:41:21 -08:00
4274db46f2 clientv3: fix lease keepalive duration 2017-02-16 11:25:26 -08:00
49a12371c1 Merge pull request #7335 from heyitsanthony/leadership-kick
grpcproxy: support forcing leader as available
2017-02-16 09:40:08 -08:00
4608210154 Documentation/libraries-and-tools: add vitess 2017-02-15 21:35:19 -08:00
80de75431e grpcproxy: support forcing leader as available
Leadership timeout can sometimes take too long, such as in test cases.
However, it is possible to infer a leader is available based on RPCs
that must go through consensus. Therefore, have a way to update the
leadership status off the watch path.
2017-02-15 16:49:41 -08:00
2510a1488c Merge pull request #7327 from heyitsanthony/fix-runtime-conf-doc
op-guide: fix remove instructions in runtime-configuration and conform to style
2017-02-15 10:22:47 -08:00
80ab321f9d etcdmain: whitelist etcd binary flags 2017-02-15 09:51:50 -08:00
1d521556ae e2e: modify e2e to run code coverage 2017-02-15 09:51:50 -08:00
2f8b9ce9aa Merge pull request #7314 from heyitsanthony/fix-leadership
grpcproxy: split out / tighten up leadership detection
2017-02-15 07:01:38 -08:00
a4a8393cb7 integration: wait five elections before creating watch for require leader test
Otherwise new watch will race with the leader watcher receiving the loss event.
2017-02-15 00:16:25 -08:00
36f5b713bf grpcproxy: don't wait for ctx.Done() to close kv donec
Causes a goroutine leak in ActiveConnection.Close() tests. Channel is
vestigial since removing ccache; revisit if kv ever needs goroutines.
2017-02-15 00:16:25 -08:00
49a0a63fc3 grpcproxy: split out leadership detection code
Move out of watch code since will be shared with lease code. Also assumes
leader does not exist unless watch can be successfully created.
2017-02-15 00:16:25 -08:00
ad1b754e02 Merge pull request #7330 from fanminshi/fix_keepAliveOnce
clientv3: KeepAliveOnce returns ErrLeaseNotFound if TTL <= 0
2017-02-14 15:42:18 -08:00
8cb5e05fc9 clientv3: KeepAliveOnce returns ErrLeaseNotFound if TTL <= 0 2017-02-14 15:19:29 -08:00
67e3fc55d7 op-guide: fix remove instructions in runtime-configuration and conform to style
Fixes #7326
2017-02-14 13:41:51 -08:00
78d153fc5a Merge pull request #7328 from heyitsanthony/travis-spam
travis: disable email notifications
2017-02-14 12:33:32 -08:00
2cc273291d travis: disable email notifications
Was spamming security@coreos.com
2017-02-14 12:08:49 -08:00
808ee4e57c Merge pull request #7313 from gyuho/simplify-auth
auth: simplify merging range perm
2017-02-14 14:18:06 +09:00
3d994f8653 Merge pull request #7317 from petermattis/pmattis/ready-must-sync
raft: add Ready.MustSync
2017-02-13 17:53:08 -08:00
c200be6432 Merge pull request #7319 from heyitsanthony/fix-compact-watch
grpcproxy: respect CompactRevision in watcher
2017-02-13 16:46:34 -08:00
e0ddded077 auth: simplify merging range perm
No need for a separate function to filter duplicates.
Just merge ranges in place.

```
go test -v -run=xxx -bench=BenchmarkMergeOld -benchmem
BenchmarkMergeOld-8   	  100000	     13524 ns/op	    1104 B/op	       8 allocs/op

go test -v -run=xxx -bench=BenchmarkMergeNew -benchmem
BenchmarkMergeNew-8   	  100000	     13432 ns/op	     936 B/op	       3 allocs/op
```

Not much performance boost, but less memory allocation
and simpler
2017-02-13 16:37:43 -08:00
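The idea of merging ranges in place, as described in e0ddded077, can be sketched like this. It is not the auth package code: the `span` type and `mergeInPlace` function are hypothetical, and the sketch assumes half-open `[begin, end)` ranges compared as plain strings. Sorting first means overlapping or duplicate ranges end up adjacent, so a single pass that reuses the input slice both merges them and drops duplicates.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical half-open key range [begin, end).
type span struct{ begin, end string }

// mergeInPlace sorts spans by begin key and folds overlapping, adjacent,
// or duplicate spans together, reusing the input slice's backing array.
func mergeInPlace(spans []span) []span {
	if len(spans) == 0 {
		return spans
	}
	sort.Slice(spans, func(i, j int) bool { return spans[i].begin < spans[j].begin })

	n := 0 // index of the last merged span
	for _, s := range spans[1:] {
		if s.begin <= spans[n].end {
			// s overlaps or touches the current span: extend it if needed.
			if s.end > spans[n].end {
				spans[n].end = s.end
			}
			continue
		}
		// s starts after the current span ends: keep it as a new span.
		n++
		spans[n] = s
	}
	return spans[:n+1]
}

func main() {
	in := []span{{"x", "z"}, {"b", "d"}, {"a", "c"}, {"a", "c"}}
	fmt.Println(mergeInPlace(in)) // [{a d} {x z}]
}
```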
853f68071b grpcproxy: respect CompactRevision in watcher
CompactRevision wasn't sent over watch stream, causing TestKVCompact to hang.
2017-02-13 15:43:41 -08:00
43740a8d3c Merge pull request #7318 from heyitsanthony/limit-doc
etcdserverpb, clientv3: clarify WithLimit documentation
2017-02-13 15:35:37 -08:00
e52a985a3a Merge pull request #7307 from heyitsanthony/proxy-countonly
grpcproxy: support CountOnly
2017-02-13 13:30:31 -08:00
fb7dd0f688 etcdserverpb, clientv3: clarify WithLimit documentation
Fixes #7316
2017-02-13 12:37:44 -08:00
ab03a42f06 raft: add Ready.MustSync
Add Ready.MustSync which indicates that the hard state and raft log
entries in a Ready message must be synchronously written to persistent
storage.
2017-02-13 15:13:21 -05:00
2925f02aac Merge pull request #7305 from fanminshi/return_header_for_timetolive
lease: LeaseTimeToLive returns TTL=-1 resp on lease not found
2017-02-13 11:24:36 -08:00
0d08ffa282 integration: don't expect lease not found error for TestV3GetNonExistLease 2017-02-10 17:35:43 -08:00
bcfbb096e2 clientv3/integration: test lease not found on TimeToLive() 2017-02-10 16:41:47 -08:00
2ca1823a96 v3rpc: LeaseTimeToLive returns TTL=-1 resp on lease not found 2017-02-10 16:33:31 -08:00
c22ba766d5 grpcproxy: support CountOnly
TestKVRange from client integration tests was failing.
2017-02-10 16:06:24 -08:00
9f8e82e1c0 Merge pull request #7304 from heyitsanthony/remove-ccache
Remove ccache
2017-02-10 16:02:31 -08:00
1fe2a9b124 Revert "Merge pull request #7139 from heyitsanthony/proxy-rlock"
This reverts commit 304606ab0b, reversing
changes made to 7dfe503f1c.
2017-02-10 14:37:48 -08:00
47cb8a012a Merge pull request #7301 from ghostplant/master
Fix a command error.
2017-02-10 09:31:22 -08:00
cc14f14216 Documentation: replace px typo with ps
Signed-off-by: CUI Wei <ghostplant@qq.com>
2017-02-11 00:23:37 +08:00
1a4a4fa7ac Merge pull request #7295 from mkumatag/fix_gosimple
test: Fix gosimple errors
2017-02-09 07:39:55 -08:00
98249bc950 Merge pull request #7297 from mkumatag/update_travis
travis: Update fmt check gotools
2017-02-09 07:26:00 -08:00
5afa4e4fdf travis: Update fmt check gotools 2017-02-09 10:17:36 -05:00
0914b8b707 test: Fix gosimple errors
Running the test script reports gosimple suggestions; this PR fixes the gosimple S1019 findings:
raft/node_test.go:456:40: should use make([]raftpb.Entry, 1) instead (S1019)
raft/node_test.go:457:49: should use make([]raftpb.Entry, 1) instead (S1019)
raft/node_test.go:458:43: should use make([]raftpb.Message, 1) instead (S1019)

Refer to https://github.com/dominikh/go-tools/blob/master/cmd/gosimple/README.md#checks for more information.
2017-02-09 08:01:28 -05:00
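The gosimple messages quoted in 0914b8b707 ask for a `make` call with a single length argument. A minimal sketch of what that simplification looks like, assuming the flagged code spelled out a capacity equal to the length (the original lines are not shown in the commit message, and `entry` is a stand-in type rather than `raftpb.Entry`):

```go
package main

import "fmt"

// entry is a stand-in for a protobuf message type.
type entry struct{ index uint64 }

// newEntries allocates a one-element slice. Writing make([]entry, 1, 1)
// would be equivalent, because the capacity defaults to the length when
// it is omitted.
func newEntries() []entry {
	return make([]entry, 1)
}

func main() {
	es := newEntries()
	fmt.Println(len(es), cap(es)) // 1 1
}
```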
c4fc8c0989 Merge pull request #7260 from mitake/auth-state
auth: correct initialization in NewAuthStore()
2017-02-08 18:11:13 -08:00
9b72c8ba1b Merge pull request #7285 from fanminshi/uses_direct_client_call_for_tests
clientv3: integration test uses direct client calls
2017-02-07 12:09:37 -08:00
366e689eae clientv3: uses direct client calls in integration tests
The clientv3 integration tests were using clientv3.NewKV, clientv3.NewWatcher, etc. to create specific clients.
Replace those with direct client calls so that the tests also exercise the grpc proxy.
2017-02-07 11:09:19 -08:00
0944a50d3f Merge pull request #7288 from fanminshi/fix_TestLeaseKeepAliveInitTimeout_test
clientv3/integration: stop member before keepalive in TestLeaseKeepAliveInitTimeout
2017-02-07 10:48:54 -08:00
c182428e52 clientv3/integration: stop member before keepalive in TestLeaseKeepAliveInitTimeout 2017-02-07 10:07:03 -08:00
bf5ecf6555 Merge pull request #7262 from mkumatag/ppc64le_binary
scripts: Add support to build ppc64le binary for release
2017-02-07 09:52:12 -08:00
cf5cc18f02 Merge pull request #7286 from heyitsanthony/lease-snip-cancel-stop
clientv3: remove cancelWhenStop from lease implementation
2017-02-07 09:12:34 -08:00
a213b3abf5 clientv3: remove cancelWhenStop from lease implementation
Only have Close() cancel out outstanding goroutines. Canceling out
single-shot RPCs will mask connection close on client.Close().
2017-02-06 17:21:46 -08:00
739accc242 Merge pull request #7281 from heyitsanthony/no-default-ka
clientv3: only start lease stream after first keepalive call
2017-02-06 13:51:43 -08:00
a9f10bdeee clientv3: only start lease stream after first keepalive call
Fixes #7274
2017-02-06 11:52:57 -08:00
9976d869c1 auth: correct initialization in NewAuthStore()
Because of my own silly mistake, current NewAuthStore() doesn't
initialize authStore in a correct manner. For example, after recovery
from snapshot, it cannot revive the flag of enabled/disabled. This
commit fixes the problem.

Fix https://github.com/coreos/etcd/issues/7165
2017-02-06 16:05:49 +09:00
280b65fe4d auth: add a test case for recovering from snapshot 2017-02-06 15:42:09 +09:00
6fb99a8585 Merge pull request #7276 from fanminshi/fix_lease_keep_alive_loop
clientv3: sends keepalive reqs immediately after lease keep alive stream reset
2017-02-04 21:28:56 -08:00
4d055ca73b Merge pull request #7277 from gyuho/second-point
pkg/report: add min/max latency per second
2017-02-04 12:47:12 -08:00
950a9da9d9 pkg/report: add min/max latency per second
For https://github.com/coreos/dbtester/issues/221.
2017-02-04 12:46:54 -08:00
720234d32b clientv3: sends keepalive reqs immediately after lease keep alive stream reset
When the lease client resets the lease keepalive stream, sendKeepAliveLoop() should send out keepalive requests immediately instead of waiting for 500ms.
2017-02-03 16:36:24 -08:00
23b5a29101 Merge pull request #7273 from heyitsanthony/snip-prom
clientv3: add DialOptions to config
2017-02-03 15:54:20 -08:00
8c43bd06a0 clientv3: add DialOptions to config
Removes strict prometheus dependency.

Fixes #7058
2017-02-03 12:00:20 -08:00
4203c766fb Merge pull request #7270 from gyuho/pkg
pkg/netutil: name GetDefaultInterfaces consistent
2017-02-03 08:06:15 -08:00
01a1dae7ae pkg/netutil: name GetDefaultInterfaces consistent 2017-02-03 00:37:31 -08:00
d159353d51 Merge pull request #7268 from heyitsanthony/proxy-test-clientv3
test: add proxy tests for clientv3 integration tests
2017-02-02 20:31:05 -08:00
ae5c89ff12 Merge pull request #7266 from heyitsanthony/snip-yaml
clientv3: remove strict yaml dependency
2017-02-02 16:07:12 -08:00
56c706ff91 Merge pull request #7269 from sinsharat/use_requestWithContext_for_cancel
*: Use http.Request.WithContext instead of Cancel
2017-02-02 09:53:09 -08:00
e42fa18ccf grpcproxy: don't use WithRequireLeader for watch event stream
Otherwise leader loss will reject all stream creation.
2017-02-02 09:32:25 -08:00
9def4cb9fe *: Use http.Request.WithContext instead of Cancel 2017-02-02 22:50:07 +05:30
e3f4b43614 test: clientv3 integration tests with proxy 2017-02-01 22:04:18 -08:00
b465b48476 clientv3: remove strict yaml dependency
Moved to clientv3/yaml
2017-02-01 21:02:45 -08:00
42e7d4d09d Merge pull request #7255 from sinsharat/use_requestWithContext_for_cancel
rafthttp: use http.Request.WithContext instead of Cancel
2017-02-01 15:49:43 -08:00
f74142187d Merge pull request #7263 from Rushit/test_isadminpermited
auth: test for AuthStore.IsAdminPermitted
2017-02-01 13:46:31 -08:00
2656b594bb rafthttp: use http.Request.WithContext instead of Cancel 2017-02-02 02:30:36 +05:30
5d41e7f09b scripts: Add support to build ppc64le binary for release 2017-02-02 00:45:56 +05:30
beef5eea37 auth: test for AuthStore.IsAdminPermitted
This will cover test for AuthStore.IsAdminPermitted in store.go
2017-02-01 08:39:09 -08:00
0df1822212 Merge pull request #7257 from Rushit/auth_test
auth: unit-test for authStore.AuthDisable()
2017-01-31 20:45:39 -08:00
46cac6f292 auth: unit-test for authStore.AuthDisable()
This will cover unit-test for AuthDisable in store.go
2017-01-31 18:18:56 -08:00
89bb9048dd Merge pull request #6881 from mitake/auth-v3-cn
authenticate clients based on certificate CommonName in v3 API
2017-01-31 17:21:53 -08:00
c6e9892af4 Merge pull request #7256 from Felixoid/issue_7219
netutil: add dualstack to linux_route
2017-01-31 16:49:40 -08:00
0f53ad0b84 netutil: add dualstack to linux_route
In v3.1.0, netutil couldn't get the default interface for IPv6-only hosts.

Fixes #7219
2017-01-31 22:19:47 +03:00
cd9f0a1721 e2e: add a case for CommonName auth of v3 API 2017-01-31 17:22:12 +09:00
0191509637 auth, etcdserver: authenticate clients based on certificate CommonName
This commit lets v3 auth mechanism authenticate clients based on
CommonName of certificate like v2 auth.
2017-01-31 17:22:12 +09:00
7d6280fa82 Merge pull request #7248 from ravigadde/session-w-lease
clientv3: start a session with existing lease
2017-01-30 20:12:23 -08:00
c586218ec6 clientv3: start a session with existing lease
This change is needed to handle process restarts with elections. When the
leader process is restarted, it should be able to hang on to the leadership
by using the existing lease.

Fixes #7166
2017-01-30 18:07:22 -08:00
d2716fc5ae Merge pull request #7238 from mkumatag/support_ppc64le
ppc64le platform support
2017-01-26 21:16:33 -08:00
9767098331 etcdmain: ppc64le platform support 2017-01-26 21:08:07 -08:00
f127f462c6 Merge pull request #7229 from Rushit/auth-tests
auth: Adding unit tests
2017-01-27 11:52:03 +09:00
75ae50a90f Merge pull request #7243 from gyuho/doc
contrib: add etcd cluster deploy on systemd docs
2017-01-26 17:04:21 -08:00
5dace5f6dc Merge pull request #7242 from xiang90/fix_test
e2e: do not remove the member we connect to
2017-01-26 16:59:03 -08:00
19d30fd4a7 contrib: add etcd cluster deploy on systemd docs
Fix https://github.com/coreos/etcd/issues/5971
2017-01-26 16:56:55 -08:00
78540c5e7b e2e: do not remove the member we connect to 2017-01-26 15:43:27 -08:00
3351a71e84 Merge pull request #7240 from fanminshi/balancer_fix
clientv3: fix balancer update address bug
2017-01-26 15:08:50 -08:00
ae2e8fa462 Merge pull request #7241 from nmiyake/fixTestMessage
test: fix failure message in TestEmbedEtcd
2017-01-26 14:42:02 -08:00
18af48a9dc integration: add test case in dial_test to ensure balancer.updateAddrs works properly 2017-01-26 14:21:29 -08:00
e3b325c196 test: fix failure message in TestEmbedEtcd 2017-01-26 14:00:32 -08:00
9dbde1cc52 Merge pull request #7236 from heyitsanthony/no-dns-bind
embed: reject domain names before binding (again)
2017-01-26 13:52:30 -08:00
0c4e67c1f4 clientv3: fix balancer update address bug 2017-01-26 13:33:10 -08:00
5a67b0aba6 embed: reject binding listeners to domain names
Fixes #6336
2017-01-26 12:37:34 -08:00
63572567b4 integration: test domain name URLs are rejected before binding 2017-01-26 12:37:34 -08:00
54cf0317c3 Merge pull request #7237 from heyitsanthony/bump-e2e
test: bump e2e timeout to 15 minutes
2017-01-26 12:32:40 -08:00
b1b78c537c auth: Adding unit tests
This covers tests for User and Role related operations.
These tests bring code coverage in store.go from 40.2% to 72.1%.
2017-01-26 09:03:52 -08:00
6838ac3ba5 Merge pull request #7234 from Rushit/store_test_refactoring
auth: refactor auth store test to use common setup
2017-01-26 14:52:17 +09:00
072eda508b test: bump e2e timeout to 15 minutes
PPC64 timing out; integration tests already at 15 minutes.
2017-01-25 20:56:31 -08:00
fa1cbd5890 auth: refactor test to use common setup
Refactored tests to pull common setup into a method.
2017-01-25 19:07:15 -08:00
094be295a1 Merge pull request #7227 from heyitsanthony/clientv3-dial-ctx
clientv3: use DialContext
2017-01-25 13:28:29 -08:00
56286ccd29 clientv3: use DialContext
Fixes #7216
2017-01-25 09:49:41 -08:00
a2c44a8b65 clientv3: test closing client cancels blocking dials 2017-01-25 09:49:41 -08:00
11619f8db2 Merge pull request #7233 from rlenferink/documentation
Documentation: Deleted non-existing project from libraries-and-tools.md
2017-01-25 09:06:48 -08:00
eb42a5cb2f Documentation: Deleted non-existing project from libraries-and-tools.md 2017-01-25 11:43:56 -05:00
55c98982d1 Merge pull request #7231 from rlenferink/documentation
Documentation: C library added
2017-01-25 06:56:49 -08:00
10a401c7c6 Documentation: C library added 2017-01-25 15:10:48 +01:00
fb7365ef3c Merge pull request #7230 from Rushit/gitignore
.gitignore: Adding .idea to .gitignore
2017-01-24 22:25:52 -08:00
af20ba21cb .gitignore: Adding .idea to .gitignore
This will keep all IntelliJ IDEA related files out of git.
This helps contributors who use IntelliJ IDEA for development.
2017-01-24 22:14:20 -08:00
6bef2bddca Merge pull request #7215 from disksing/grpc-service
embed: support user defined grpc services.
2017-01-24 15:02:43 -08:00
d7cc9be3fd Merge pull request #7214 from sinsharat/support_put_ignore_lease
*: 'ignore_lease' to detach value with PutRequest
2017-01-24 14:53:51 -08:00
2fce80e4c0 grpcproxy: handle 'IgnoreLease' field in PutRequest 2017-01-25 03:14:31 +05:30
37fb2c454f e2e: test put command with '--ignore-lease' flag 2017-01-25 03:12:48 +05:30
84a81d8caf ctlv3: add '--ignore-lease' flag to put command 2017-01-25 03:11:19 +05:30
d3191d1afb clientv3: add WithIgnoreLease option 2017-01-25 03:09:30 +05:30
95edd1bc58 integration: put,txn with 'ignore_lease' flag 2017-01-25 03:07:23 +05:30
8a87769a09 etcdserver: use prev-lease for 'ignore_lease' writes 2017-01-25 03:05:55 +05:30
5ac4e4255a v3rpc: error for non empty lease with 'ignore_lease' 2017-01-25 03:04:07 +05:30
508c9dfe5c *: regenerate proto files with 'ignore_lease' 2017-01-25 03:01:47 +05:30
a9bf593bdc *: 'ignore_lease' to detach value with PutRequest 2017-01-25 02:59:30 +05:30
90f6a4a28d Merge pull request #7226 from gyuho/vendor
ctlv3: right-align table output, fix typo in vendor
2017-01-24 12:03:58 -08:00
ce9f73a34c ctlv3: right-align the table output 2017-01-24 11:41:47 -08:00
a674116f07 vendor: update tablewriter 2017-01-24 11:41:47 -08:00
b9bbfda874 Merge pull request #7210 from jimmycuadra/rust-etcd
Add rust-etcd to the list of libraries.
2017-01-24 07:53:15 -08:00
13b70ed545 tools: add rust-etcd to the list of libraries. 2017-01-24 01:41:37 -08:00
0fea49f8fa Merge pull request #7222 from gyuho/manual
client: add GetVersion method
2017-01-23 19:14:49 -08:00
02f4a9a034 client: add GetVersion method
For retrieving etcdserver and etcdcluster version
2017-01-23 18:52:39 -08:00
ab532524b0 Merge pull request #7206 from heyitsanthony/redoc-rangerequest
etcdserverpb: rework documentation for range request
2017-01-23 13:08:04 -08:00
aace95a5bd Merge pull request #7209 from heyitsanthony/etcdmain-help
etcdmain: add gateway and grpc-proxy commands to etcd help
2017-01-23 13:07:30 -08:00
e75f52b97a Merge pull request #7220 from vimalk78/fix-recipes-newSequentialKV-comment
contrib/recipes/key.go : fixed comment in method newSequentialKV
2017-01-23 12:47:48 -08:00
6443c25422 contrib/recipes/key.go : fixed method comment 2017-01-23 23:42:50 +05:30
e0f4dd4cca Merge pull request #7079 from heyitsanthony/stm-prefetch
STM: prefetch and more
2017-01-23 09:45:36 -08:00
8b952fb8dc Merge pull request #7217 from xiang90/doc
doc: mention HTTP JSON in doc link
2017-01-23 09:41:57 -08:00
5aab92414f Merge pull request #7199 from heyitsanthony/netutil-test-arch
pkg/netutil: use native byte ordering
2017-01-23 09:41:20 -08:00
861cb5cfa2 embed: add example for ServiceRegister. 2017-01-23 10:47:01 +08:00
165c77f14e doc: mention HTTP JSON in doc link
It is not immediately clear to users what the gRPC
gateway is. Add more explanation to make it clear that
etcd3 supports an HTTP API through the gateway.
2017-01-22 10:55:21 -08:00
4374d944d4 embed: support user defined grpc services.
Fixes #7200
2017-01-22 18:21:19 +08:00
89f7cc51fa Merge pull request #7155 from andrewstuart/upgrade-etcd3-docs
Documentation: add upgrade gotchas/further info for better visibility on google, etc
2017-01-21 12:56:07 -08:00
deb11b3594 Documentation: Add upgrade gotchas/further info for better search visibility 2017-01-21 13:46:27 -07:00
555b8047e6 integration: fix STM tests to compile against new interface 2017-01-20 16:30:58 -08:00
13420b33a0 benchmark: update for new stm interface 2017-01-20 16:22:43 -08:00
8695511153 concurrency: STM snapshot isolation level 2017-01-20 16:22:43 -08:00
8604d1863b concurrency: STM WithPrefetch option
Fixes #6923
2017-01-20 16:22:42 -08:00
a81234a25b concurrency: extend STM interface to Get from any of a list of keys
Now possible to fetch multiple keys in a single txn.
2017-01-20 16:22:42 -08:00
59880a0ab8 concurrency: variadic stm options
Makes txn isolation and the context variadic options.
2017-01-20 16:22:42 -08:00
7e31ddd32a etcdserverpb: rework documentation for range request 2017-01-20 16:12:09 -08:00
dfb2ed07db etcdmain: add gateway and grpc-proxy commands to etcd help 2017-01-20 15:54:13 -08:00
92cec10103 Merge pull request #7208 from gyuho/REAMDE
README: remove ACI, update Go version
2017-01-20 14:19:58 -08:00
074af101f1 Merge pull request #7207 from xiang90/roadmap
roadmap: update roadmap
2017-01-20 13:58:20 -08:00
aa79523d33 roadmap: update roadmap 2017-01-20 13:50:36 -08:00
367b064bcd README: remove ACI, update Go version 2017-01-20 13:42:11 -08:00
3196d08c7a Merge pull request #7205 from gyuho/doc
op-guide: change grpc-proxy from 'pre' to 'alpha'
2017-01-20 13:23:01 -08:00
b788790e56 op-guide: change grpc-proxy from 'pre' to 'alpha' 2017-01-20 13:20:32 -08:00
71c45906ad Merge pull request #7196 from heyitsanthony/build-goget
documentation: update build documentation
2017-01-20 11:19:50 -08:00
a630735c29 Merge pull request #7170 from vimalk78/make-v2-endpoint-optional-#7100
embed/etcd.go: make v2 endpoint optional. fixes #7100
2017-01-20 11:14:20 -08:00
5d8cceb164 Merge pull request #7183 from vimalk78/fix-test-script-function-arguments
test: passed the test script arguments to the function parameters
2017-01-20 11:09:09 -08:00
06a27d8590 documentation: update build documentation 2017-01-20 11:04:40 -08:00
1ada4f939f pkg/netutil: use native byte ordering for route information
Fixes #7199
2017-01-20 10:44:05 -08:00
9d0d4be7d1 Merge pull request #7203 from xiang90/fix_snap
etcdctlv3: snapshot restore works with lease key
2017-01-20 10:05:22 -08:00
3902d5ab0a Merge pull request #7195 from gyuho/fix-stm-restart
concurrency: fix stm restart on concurrent key deletion
2017-01-20 09:49:16 -08:00
96e0f50673 etcdctlv3: snapshot restore works with lease key 2017-01-20 09:37:39 -08:00
82d56b6314 Merge pull request #7197 from vimalk78/fix-ETCD-prefix-check
pkg/flags: fixed prefix checking of the env variables
2017-01-20 08:17:02 -08:00
e446c2c2c7 pkg/cpuutil: add cpuutil
A package for unsafe cpu-ish things.
2017-01-20 01:47:56 -08:00
e4b8c874d2 pkg/flags: fixed prefix checking of the env variables 2017-01-20 13:13:40 +05:30
a94d20d1e4 integration: test STM apply on concurrent deletion 2017-01-19 22:59:01 -08:00
f80914fba2 embed/etcd.go: make v2 endpoint optional. fixes #7100 2017-01-20 11:49:52 +05:30
acec15ebc6 clientv3/concurrency: fix rev comparison on concurrent key deletion 2017-01-19 20:51:31 -08:00
94eec5d41a Merge pull request #7193 from gyuho/manual
Documentation: fix typo s/endpoint-health/endpoint health/
2017-01-19 20:47:17 -08:00
1cd3fd81c8 Merge pull request #7194 from gyuho/backport
NEWS: fix date for v3.1 release
2017-01-19 17:56:14 -08:00
8ad5e29447 NEWS: fix date for v3.1 release 2017-01-19 16:59:02 -08:00
be28981234 Documentation: fix typo s/endpoint-health/endpoint health/ 2017-01-19 16:52:26 -08:00
4fad94246d Merge pull request #7190 from gyuho/docs
Documentation: update experimental_apis for v3.1 release
2017-01-19 12:41:58 -08:00
8a779ce709 Merge pull request #7144 from tobilarscheid/enhancement/highlight-differences-in-docker-commands
Highlight differences between example run commands in docker_guide
2017-01-19 12:29:52 -08:00
81c1288d60 Documentation: update experimental_apis for v3.1 release 2017-01-19 11:32:06 -08:00
5cf0d6678b Merge pull request #7174 from vimalk78/support-v3-txn-without-condition
clientv3/txn.go : removed the TODO: add a Do for shortcut the txn without any condition
2017-01-19 08:45:34 -08:00
d03d7f0c0d Merge pull request #7188 from jl2005/dir-expire
store: set e.Node.Dir attribute, when node expired
2017-01-19 08:33:37 -08:00
99639186cd store: set Dir attribute, when node expired 2017-01-19 18:00:56 +08:00
eb88a5f288 Polish note about varying parameters for each member 2017-01-18 14:55:02 -08:00
b2d5c91f0e Merge pull request #7186 from gyuho/vendor
*: update 'golang.org/x/net/context' and its dependencies
2017-01-18 13:09:57 -08:00
90509008e9 Merge pull request #7159 from heyitsanthony/proxy-stop-cache
grpcproxy, etcdmain, integration: add close channel to kv proxy
2017-01-18 13:04:31 -08:00
8c0282ab24 grpcproxy, etcdmain, integration: add close channel to kv proxy
ccache launches goroutines that need to be explicitly stopped.

Fixes #7158
2017-01-18 11:51:16 -08:00
85e14a841a vendor: update 'golang.org/x/net' 2017-01-18 10:29:49 -08:00
933bcac6da glide: update 'golang.org/x/net' 2017-01-18 10:29:43 -08:00
293c75b133 test: passed the test script arguments as the test function parameters 2017-01-18 21:28:57 +05:30
fcaa509e4c clientv3/txn.go : removed the TODO: add a Do for shortcut the txn without any condition 2017-01-18 11:37:29 +05:30
1a962df596 Merge pull request #7176 from heyitsanthony/bump-lread-timeout
etcdserver: use ReqTimeout for linearized read
2017-01-17 16:08:50 -08:00
5c774ff571 etcdserver: use ReqTimeout for linearized read
Fixes #7136
2017-01-17 14:55:39 -08:00
307e14028c Merge pull request #7175 from gyuho/report
pkg/report: add nil checking for getTimeSeries
2017-01-17 13:22:20 -08:00
462dbfe10d Merge pull request #7172 from gyuho/upgrade-doc
Documentation: document upgrading to v3.1
2017-01-17 13:21:37 -08:00
abf7847fa5 Documentation: document upgrading to v3.1 2017-01-17 13:20:24 -08:00
69606bb95f pkg/report: add nil checking for getTimeSeries 2017-01-17 12:51:47 -08:00
3a40421aa5 Merge pull request #7157 from fanminshi/clientv3_balancer_uses_one_connection
clientv3: balancer uses one connection at a time
2017-01-17 12:12:35 -08:00
2db9d3b702 Merge pull request #6440 from lclarkmichalek/how-to-ssl-question-mark
Obey the usual rules of SSL server name verification when using a private PKI
2017-01-17 10:28:22 -08:00
bad2f03cd0 Merge pull request #7173 from gyuho/manual
ctlv3: print cluster info after adding new member
2017-01-17 10:15:39 -08:00
df55438a60 clientv3: balancer uses one connection at a time
FIX #7080
2017-01-17 10:09:41 -08:00
b8e9bd2b42 ctlv3: print cluster info after adding new member 2017-01-17 09:52:38 -08:00
eba41cd7b3 pkg/transport: Obey the usual laws of ssl when using a private PKI 2017-01-15 21:27:53 +00:00
017ea3df50 Merge pull request #7164 from elimisteve/patch-1
clientv3: Fixed []byte to string conversion syntax in KV comment
2017-01-15 13:12:01 -08:00
eb7a804ca8 kv.go: Fixed []byte to string conversion syntax in comment 2017-01-15 05:57:16 -08:00
b9d3bd8d42 Merge pull request #7163 from gyuho/snapshot-count
etcd-tester: use 10K for '--snapshot-count'
2017-01-14 17:52:09 -08:00
6f9a20803c etcd-tester: use 10K for '--snapshot-count'
Since we want to send snapshots more often in a failure-injected cluster.
2017-01-14 17:29:35 -08:00
699b1e5b3a Merge pull request #7160 from xiang90/snapshotcount
etcdserver: increase snapshot to 100,000
2017-01-14 16:53:44 -08:00
26d99269c0 Merge pull request #6898 from mitake/auth-maintain
RFC, WIP: etcdserver: let maintenance services require root role
2017-01-14 11:22:14 -08:00
783eaf9de6 e2e: add cases for defrag and snapshot with authentication 2017-01-14 19:36:24 +09:00
9886e9448e auth, etcdserver: let maintenance services require root role
This commit lets maintenance services require root privilege. It also
moves AuthInfoFromCtx() from etcdserver to auth pkg for cleaning purpose.
2017-01-14 19:36:24 +09:00
c5a9d54835 etcdserver: increase snapshot to 100,000
Keep more wal entries in memory for fast follower recovery.
10,000 was too small a number and triggered quite a few snapshots.
ZK proves that 100,000 is a reasonable number even for older, less powerful
machines.

Eventually we should provide both count and max memory (for large entries).
2017-01-13 18:05:25 -08:00
118fd18eb6 Merge pull request #6894 from gyuho/preserve-value
*: 'ignore_value' to detach lease with PutRequest
2017-01-13 16:02:19 -08:00
0f8060bede grpcproxy: handle 'IgnoreValue' field in PutRequest 2017-01-13 15:13:18 -08:00
5dffa38fb2 e2e: test put command with '--ignore-value' flag 2017-01-13 15:13:18 -08:00
e03850c4ac ctlv3: add '--ignore-value' flag to 'put' command 2017-01-13 15:13:18 -08:00
d94d22122b clientv3: add 'WithIgnoreValue' option 2017-01-13 15:13:18 -08:00
a66f133209 integration: test Put,Txn with ignore_value flag 2017-01-13 15:13:18 -08:00
8752ee52a5 etcdserver: use prev-value for ignore_value writes 2017-01-13 15:13:18 -08:00
e655420d33 v3rpc: error for non-empty value with ignore_value 2017-01-13 15:13:18 -08:00
7f8b5774a4 *: regenerate proto files with 'ignore_value' 2017-01-13 15:13:18 -08:00
8eea93942d *: 'ignore_value' to detach lease with PutRequest 2017-01-13 15:13:18 -08:00
4730bddea7 Merge pull request #7153 from gyuho/cap
etcdserver/api, rafthttp: add version v3.2
2017-01-13 14:59:14 -08:00
fa9a78450c rafthttp: add 3.2.0 stream type 2017-01-13 14:23:15 -08:00
ea94aea136 etcdserver/api: add 3.2 in capability 2017-01-13 14:00:03 -08:00
a8cc11375f version: bump to v3.2.0+git 2017-01-13 12:58:15 -08:00
0c88795a19 Merge pull request #7151 from gyuho/travis
travis: use Go 1.7.4, drop old env var
2017-01-13 12:55:32 -08:00
21e3418553 travis: use Go 1.7.4, drop old env var
We don't use Go 1.5.x anymore
2017-01-13 11:34:05 -08:00
bb797c1ee9 Merge pull request #7147 from gyuho/pkg/report
pkg/report: add 'Stats' to expose report raw data
2017-01-13 11:17:57 -08:00
304606ab0b Merge pull request #7139 from heyitsanthony/proxy-rlock
grpcproxy/cache: acquire read lock on Get instead of write lock
2017-01-13 11:15:13 -08:00
74bad576ed pkg/report: add 'Stats' to expose report raw data 2017-01-13 10:26:00 -08:00
7dfe503f1c Merge pull request #7148 from heyitsanthony/fix-lease-overlap
clientv3: don't reset stream on keepaliveonce or revoke failure
2017-01-13 10:05:02 -08:00
af51f87ad2 vendor: remove groupcache, add ccache 2017-01-13 10:02:04 -08:00
9fa6c95054 grpcproxy: use ccache for key cache
groupcache needs a write lock and has no way to expire keys; ccache can
do this, though.

Also removes the key count metric, since there's no way to efficiently
calculate it using ccache.
2017-01-13 10:00:57 -08:00
5e3b20e70c clientv3: don't reset stream on keepaliveonce or revoke failure
Would cause the keepalive loop to cancel out.

Fixes #7082
2017-01-13 09:05:23 -08:00
3d97da0672 improve documentation regarding docker cluster
Instead of trying to highlight stuff within markdown code blocks, this commit adds a descriptive sentence explaining the differences.
2017-01-13 09:20:37 +01:00
c89eae790d Merge pull request #7110 from mitake/reauth
etcdserver, clientv3: handle a case of expired auth token
2017-01-13 11:57:25 +09:00
432bda4dec Merge pull request #7146 from fanminshi/clientv3_balancer_uses_one_connection
clientv3: fix balancer test logic
2017-01-12 13:51:03 -08:00
6d443ba3f9 clientv3: fix balancer test logic 2017-01-12 13:07:44 -08:00
6ce03389c8 Merge pull request #7138 from gyuho/NEWS
NEWS: add v3.1.0, v3.0.16 + minor fixes
2017-01-12 11:33:13 -08:00
34136a69c8 Merge pull request #7145 from heyitsanthony/warn-ca-ignore
transport: warn on user-provided CA
2017-01-12 11:14:28 -08:00
c23d666328 NEWS: add v3.1.0, v3.0.16 + minor fixes 2017-01-12 11:07:27 -08:00
da8fd18d8e transport: warn on user-provided CA
ServerName is ignored for a user-provided CA for backwards compatibility. This
breaks PKI, so warn it is deprecated.
2017-01-12 09:10:05 -08:00
c624caabb1 improve example run commands in docker_guide
When bootstrapping a cluster, the docker run command is mostly the same for all cluster members. This commit highlights the small variations between the commands to make them stand out.
2017-01-12 09:16:21 +01:00
824277cb3a Merge pull request #7119 from sinsharat/add_load_test_tool
tools: Add etcd 3.0 load test tool reference
2017-01-11 22:17:57 -08:00
c512839382 tools: Add etcd 3.0 load test tool reference 2017-01-12 11:35:32 +05:30
d431b64d97 etcdserver, clientv3: handle a case of expired auth token
This commit adds a mechanism of handling a case of expired auth token
to clientv3. If a server returns an error code
grpc.codes.Unauthenticated, newRetryWrapper() tries to get a new token
and use it as an option of PerRPCCredential.

Fixes https://github.com/coreos/etcd/issues/7012
2017-01-12 11:49:02 +09:00
0df543dbb3 Merge pull request #7141 from heyitsanthony/rate-limit-range
benchmark: option to rate limit range benchmark
2017-01-11 15:44:33 -08:00
6e730af65a benchmark: option to rate limit range benchmark 2017-01-11 14:36:46 -08:00
43dd751c47 Merge pull request #7137 from heyitsanthony/display-docs
documentation: display docs.md in github browser
2017-01-11 11:29:29 -08:00
6f801d2ae8 documentation: display docs.md in github browser 2017-01-11 10:37:42 -08:00
925d1d74ce Merge pull request #7133 from gyuho/bench
pkg/report: support 99.9-percentile, change column name
2017-01-10 18:25:03 -08:00
e44d3abc77 pkg/report: support 99.9-percentile, change column name 2017-01-10 18:22:47 -08:00
88bdd8a5d9 Merge pull request #7120 from sttts/sttts-update-ugorji-2
Update ugorji/go with embedded interface support
2017-01-10 13:11:56 -08:00
f0fa5ec507 Merge pull request #7128 from heyitsanthony/etcdctl-make-rootrole
etcdctl: create root role on auth enable if it does not yet exist
2017-01-10 12:22:02 -08:00
b32a8010a7 Merge pull request #7121 from hhkbp2/add-test-case
raft: add RawNode test case for #6866
2017-01-09 23:37:23 -08:00
522232212d Merge pull request #7127 from heyitsanthony/fix-auth-spin
auth: reject empty user name when checking op permissions
2017-01-09 19:11:18 -08:00
16135165c2 raft: add RawNode test case for #6866 2017-01-10 10:55:57 +08:00
d20f23c795 etcdctl: create root role on auth enable if it does not yet exist
Kind of tedious to add the root role when enabling auth; can just add
it automatically.
2017-01-09 16:18:13 -08:00
c39a59c0be auth: reject empty user name when checking op permissions
Passing AuthInfo{} to permission checking was causing an infinite loop
because it would always return an old revision error.

Fixes #7124
2017-01-09 15:53:36 -08:00
5278ea5ed0 integration: add grpc auth testing 2017-01-09 15:53:36 -08:00
8adfc06084 Merge pull request #7118 from hhkbp2/fix-test-case
raft: fix test cases for #7042
2017-01-09 10:34:46 -08:00
4a245a632a vendor: update ugorji/go 2017-01-09 12:13:50 +01:00
7bb768ba34 raft: fix test case for #7042 2017-01-09 16:52:02 +08:00
f99c76cb47 Merge pull request #7113 from heyitsanthony/testutil-bufsize
testutil: increase size of buffer for stack dump
2017-01-06 18:16:42 -08:00
6ab8dcb679 testutil: increase size of buffer for stack dump
Too many goroutines to fit all stack traces in 8kb.
2017-01-06 17:14:42 -08:00
bc2d47118d Merge pull request #7016 from fanminshi/faq_add_meaning_of_etcd
why: add origin of the term etcd
2017-01-06 16:13:34 -06:00
953b0c6ba2 why: add origin of the term etcd
explain the meaning behind the term etcd.
2017-01-06 16:12:20 -06:00
628e83ecc7 Merge pull request #7106 from gyuho/go1.8
integration: use only digits in unix ports
2017-01-06 13:04:35 -08:00
998f8bf291 Merge pull request #7112 from heyitsanthony/expect-debug
expect: EXPECT_DEBUG environment variable
2017-01-06 11:52:26 -08:00
af5b8190d2 Merge pull request #7111 from heyitsanthony/e2e-ctl-trace
e2e: dump stacks on ctlTest timeout
2017-01-06 11:28:56 -08:00
cf382dbe60 expect: EXPECT_DEBUG environment variable
Dump process output to stdout when EXPECT_DEBUG != "".
2017-01-06 11:09:06 -08:00
acfa601075 e2e: dump stack on ctlTest timeout
Figure out which process is blocking for Elect/Lock test timeouts.
2017-01-06 02:03:55 -08:00
6825ffe1a4 integration: use only digits in unix ports
Fix https://github.com/coreos/etcd/issues/6959.
2017-01-05 12:34:54 -08:00
a42b399f4e Merge pull request #7094 from heyitsanthony/fix-duplicate-grant
auth: use quorum get for GetUser/GetRole for mutable operations
2017-01-05 11:28:33 -08:00
5feb4e1027 Merge pull request #7103 from heyitsanthony/proxy-watch-close
grpcproxy: tear down watch when client context is done
2017-01-04 19:04:08 -08:00
fd72ecfe92 Merge pull request #7087 from sinsharat/make_etcd-runner_command_compliant
etcd-runner: make command compliant
2017-01-04 16:33:19 -08:00
e179225f28 grpcproxy: tear down watch when client context is done
If client closes but all watch streams are not canceled, the outstanding
watch will wait until it is canceled, causing watch server to potentially
wait forever to close.

Fixes #7102
2017-01-04 16:23:27 -08:00
154f268031 Merge pull request #7001 from heyitsanthony/etcdctl-doc
etcdctl: tighten up output, reorganize README.md
2017-01-04 13:44:49 -08:00
10d3b81c39 Merge pull request #7093 from gyuho/member
etcdserver: expose ErrMemberNotEnoughStarted
2017-01-04 12:09:29 -08:00
f9f691ef1f auth: use quorum get for GetUser/GetRole for mutable operations
GetUser would not propagate to the minority node, causing TestCtlV2GetRoleUser to
run CreateUser instead of UpdateUser. Instead, use quorum get to fetch the
current state of auth.

Fixes #7069
2017-01-04 11:55:07 -08:00
729dcd51ce Merge pull request #7090 from vimalk78/fix-comactor-resume-leadr-change#7040
etcdserver: resume compactor only if leader
2017-01-04 10:47:44 -08:00
559a82f66e Merge pull request #7097 from heyitsanthony/benchmark-verbose
benchmark: enable grpc error logging on stderr
2017-01-04 10:32:07 -08:00
40ae83beab Merge pull request #7099 from overvenus/patch-1
docs: fix recovery example in recovery.md
2017-01-04 10:16:48 -08:00
37501e2a5d Merge pull request #7092 from xiang90/fix_raft
raft: use status to test node stop
2017-01-04 09:13:11 -08:00
7aeddf6cd7 docs: fix recovery example in recovery.md 2017-01-04 19:41:15 +08:00
d0f301adb7 etcd-runner: add flags in watcher for hardcoded values 2017-01-04 15:17:53 +05:30
b8444d4d35 benchmark: enable grpc error logging on stderr
Lets you see connection errors (e.g., if tls is misconfigured)
2017-01-04 00:26:43 -08:00
5fac6b8d15 etcdserver: resume compactor only if leader 2017-01-04 05:01:14 +05:30
2b5f9e1c6b etcdserver: expose ErrNotEnoughStartedMembers
Fix https://github.com/coreos/etcd/issues/7072.
2017-01-03 15:23:06 -08:00
fc8cd44c72 raft: use status to test node stop
n.Tick() is async. It can be racy when running with n.Stop().

n.Status() is sync and has a feedback mechanism internally. So there won't be
any race between the n.Status() and n.Stop() calls.
2017-01-03 15:18:48 -08:00
61064a7be3 Merge pull request #7085 from gyuho/raft-example-snapshot
raftexample: load snapshot when opening WAL
2017-01-03 10:34:13 -08:00
5cb6dd268b etcd-runner: make command compliant 2017-01-03 14:43:58 +05:30
0af1679b61 raftexample: load snapshot when opening WAL
Fix https://github.com/coreos/etcd/issues/7056.
Previously, we didn't load the snapshot when replaying the WAL.
2016-12-30 17:28:57 -08:00
24601ca24b Merge pull request #7084 from heyitsanthony/watch-proxy-leak
integration: wait for watch proxy to finish on client close
2016-12-30 12:51:31 -08:00
75441390b6 integration: defer clus.Terminate in watch tests
Common pattern was defer cancel(), but clus.Terminate() at the end of
the test. This appears to lead to a deadlock that is only released
once the context times out, causing inflated test times.
2016-12-30 12:34:04 -08:00
9b5eb1ae5a grpcproxy, etcdmain, integration: return done channel with WatchServer
Makes it possible to synchronously close the watch server.

Fixes #7078
2016-12-30 12:09:48 -08:00
29e14dde0c Merge pull request #7081 from gyuho/timeout-rafthttp
rafthttp: bump up timeout in pipeline test
2016-12-30 10:14:12 -08:00
cbb6ede69d Merge pull request #7067 from fanminshi/rework_coverage_unit_integration
coverage: rework coverage for unit and integration tests
2016-12-30 10:13:07 -08:00
d25f9feb19 rafthttp: bump up timeout in pipeline test
Fix https://github.com/coreos/etcd/issues/6283.

The timeout is too short. It could take more than 10ms
to send when the buffer gets full after 'pipelineBufSize' of
requests.
2016-12-30 09:46:16 -08:00
74e7614759 testutil: whitelist thread created by go cover 2016-12-29 17:19:27 -08:00
d9a3472894 coverage: rework code coverage for unit and integration tests 2016-12-29 17:19:03 -08:00
0dce29ae57 Merge pull request #7077 from fanminshi/consistent_naming
etcdserver: consistent naming in raftReadyHandler
2016-12-29 14:37:46 -08:00
8242049a33 Merge pull request #7076 from fanminshi/fix_e2e_test
e2e: unset ETCDCTL_API env var before running e2e tests
2016-12-29 14:37:25 -08:00
734dd75565 Merge pull request #7075 from gyuho/version-pull
e2e: poll '/version' in release upgrade tests
2016-12-29 11:29:45 -08:00
2a1bae0c2a etcdserver: consistent naming in raftReadyHandler 2016-12-29 11:27:16 -08:00
b741452d03 e2e: unset ETCDCTL_API env var before running e2e tests
An existing ETCDCTL_API env var causes e2e to fail some of its tests. ETCDCTL_API should not be set before e2e tests start.
The tests themselves should set ETCDCTL_API properly.
2016-12-29 11:21:15 -08:00
4e1010c1b9 e2e: poll '/version' in release upgrade tests
Fix https://github.com/coreos/etcd/issues/7065.
2016-12-29 10:52:40 -08:00
67c75606db Merge pull request #7070 from heyitsanthony/fix-lease-race
lease: use atomics for accessing lease expiry
2016-12-28 16:30:08 -08:00
b5cde6b321 lease: use atomics for accessing lease expiry
Demote was racing on expiry when LeaseTimeToLive called Remaining. Replace
with intrinsics since the ordering isn't important, but torn writes are
bad.
2016-12-28 15:44:14 -08:00
1643ed5667 Merge pull request #7071 from heyitsanthony/bump-integration-timeout
test: bump grpcproxy pass timeout to 15m
2016-12-28 15:41:00 -08:00
f876ccb055 test: bump grpcproxy pass timeout to 15m
integration tests have a 15m timeout elsewhere. The lease stress tests
seem to have pushed the running time over 10m on proxy CI, causing
failures from timeout.
2016-12-28 14:56:57 -08:00
12d930b40f Merge pull request #7068 from heyitsanthony/fix-v2-health
v2http: submit QGET in health endpoint if no progress
2016-12-28 14:30:31 -08:00
3519a9784e Merge pull request #7039 from mitake/benchmark-dialtimeout
benchmark: a new option for configuring dial timeout
2016-12-28 13:12:11 -08:00
9690220cd1 Merge pull request #7064 from heyitsanthony/fix-health-perms
etcdctl: treat permission denied as healthy endpoint
2016-12-28 13:04:55 -08:00
e2463569e7 v2http: submit QGET in health endpoint if no progress
Removing the periodic SYNC calls broke the health endpoint since the
raft index stops updating. Instead, don't bother monitoring the
raft index; issue a QGET directly to get a consensus response.

Fixes #6985
2016-12-28 12:20:56 -08:00
46062efa78 e2e: test cluster-health 2016-12-28 12:20:55 -08:00
e63059ec31 Merge pull request #7030 from crandles/grpc-histograms
etcdmain: add '--metrics' option
2016-12-28 12:03:53 -08:00
36b2d3f5eb etcdmain: add --metrics flag for exposing histogram metrics
This adds a new flag, --metrics, that can be used to enable extensive (histogram) metrics.

Fixes #7024
2016-12-28 13:04:52 -05:00
00e00f16bb ctlv3: consider permission denied error to be healthy for endpoints
Relaxes the permission expectations for endpoint health by noting:
* permission denial on linearized reads is always through consensus
* endpoint health means consensus with the cluster through the endpoint

So, there's no need to require permission on a health check key in order
to know whether the endpoint is healthy.

Fixes #7057
2016-12-28 09:13:27 -08:00
b940e0d514 Merge pull request #7042 from petermattis/pmattis/resume-after-heartbeat-resp
raft: resume paused followers on receipt of MsgHeartbeatResp
2016-12-27 21:15:53 -08:00
a662ddefbb benchmark: a new option for configuring dial timeout
Current benchmark doesn't have an option for configuring dial timeout
of gRPC. This commit adds --dial-timeout for the purpose. It is useful
for stopping long sticking benchmarks.
2016-12-28 14:07:43 +09:00
407afc69ed e2e: check etcdctl endpoint health is healthy if denied permission to key 2016-12-27 14:49:52 -08:00
c00084812c Merge pull request #7054 from gyuho/err
etcd-tester: remove unused err var from maxRev
2016-12-27 12:36:48 -08:00
db8b15bf8f etcd-tester: remove unused err var from maxRev 2016-12-27 12:16:43 -08:00
89b18ff1af Merge pull request #7015 from fanminshi/fix_lease_expired_too_soon
lease: force leader to apply its pending committed index for lease op…
2016-12-27 11:26:15 -08:00
2faf72f47c etcdserver: rework update committed index logic 2016-12-27 10:11:40 -08:00
17873f7be8 Merge pull request #7008 from heyitsanthony/fix-dns
retry on resolution failure for advertised peer DNS check
2016-12-27 10:03:01 -08:00
d9b9821551 Merge pull request #7060 from hhkbp2/fix-pre-vote-tests
raft: fix pre-vote tests
2016-12-26 17:42:36 -08:00
920b155f17 raft: fix pre-vote tests 2016-12-26 14:31:59 +08:00
7b7feb46fc leasehttp: buffer error channel to prevent goroutine leak 2016-12-22 14:25:01 -08:00
fef4a79528 lease: force leader to apply its pending committed index for lease operations
Suppose a lease grant request from a follower goes through and is followed by a lease lookup or renewal. The leader might not have applied the lease grant request locally yet, so it might not find the lease for the lookup or renewal request, which results in a lease not found error. To fix this issue, we force the leader to apply its pending committed index before looking up the lease.

FIX #6978
2016-12-22 14:24:38 -08:00
1a8e3cad9a Merge pull request #7053 from gyuho/typo
etcd-tester: fix typo, add endpoint in logs
2016-12-22 13:12:38 -08:00
591bb5e7f6 etcd-tester: fix typo, add endpoint in logs 2016-12-22 12:51:27 -08:00
acbf0fa452 Merge pull request #7041 from m1093782566/raft-safe
raft: make memory storage set method thread safe
2016-12-20 09:14:27 -08:00
e625400f1d raft: resume paused followers on receipt of MsgHeartbeatResp
Previously, paused followers were resumed upon sending a MsgHeartbeat.

Fixes #7037
2016-12-20 08:22:09 -05:00
8151d4d0bc raft: make memory storage set method thread safe 2016-12-20 18:48:52 +08:00
d62ce55584 Merge pull request #7027 from gyuho/default-host
embed: only override default advertised client URL if the client listen URL is 0.0.0.0
2016-12-16 18:53:11 -08:00
e58287f026 embed: only override default advertised client URL if the client listen URL is 0.0.0.0 2016-12-16 18:31:04 -08:00
af3451be26 Merge pull request #7018 from gyuho/why
Documentation: add 'why.md'
2016-12-16 15:54:49 -08:00
bef87cc953 Documentation: add 'why.md' 2016-12-16 15:54:03 -08:00
f95f7a3027 Merge pull request #7028 from gyuho/faq
Documentation: add FAQs on membership operation
2016-12-16 15:37:21 -08:00
2f0e82a31e Documentation: add FAQs on membership operation
Copy Anthony's answer from:
https://github.com/coreos/etcd/issues/6103
https://github.com/coreos/etcd/issues/6114
2016-12-16 15:13:40 -08:00
780d2f2a59 etcdctl: tighten up output, reorganize README.md
Documentation was far too repetitive, making it a chore to read and
make changes. All commands are now organized by functionality and all
repetitive bits about return values and output are in generalized
subsections.

etcdctl's output handling was missing a lot of commands. Similarly,
in many cases an output format could be given but fail to report
an error as expected.
2016-12-16 13:54:20 -08:00
531c3061c1 Merge pull request #7023 from heyitsanthony/lease-freeze
clientv3: fix lease "freezing" on unhealthy cluster
2016-12-16 11:38:22 -08:00
a375e91c66 clientv3: don't reset keepalive stream on grant failure
Was triggering cancelation errors on outstanding KeepAlives if Grant
had to retry.
2016-12-16 10:36:51 -08:00
46bd842db9 clientv3/integration: test lease grant/keepalive with/without failures 2016-12-16 10:36:51 -08:00
87b1d9571f v3api, rpctypes: add ErrTimeoutDueToConnectionLost
Lack of GRPC code was causing this to look like a halting error to the client.
2016-12-16 10:25:35 -08:00
d9e928de7a Merge pull request #7020 from heyitsanthony/etcdctl-migrate-warn
etcdctl: warn when backend takes too long to open on migrate
2016-12-16 09:51:34 -08:00
109577351b Merge pull request #7022 from hongchaodeng/master
docs: explicitly set ETCDCTL_API=3 in recovery.md
2016-12-15 20:39:19 -08:00
fa733e1e9c docs: explicitly set ETCDCTL_API=3 in recovery.md 2016-12-15 20:10:30 -08:00
e71ff361a4 etcdctl: warn when backend takes too long to open on migrate 2016-12-15 18:57:57 -08:00
52e3dc5eb9 Documentation: minor fix nodes -> node 2016-12-15 21:27:52 -05:00
93e303ec71 Merge pull request #7017 from gyuho/faq
dev-guide: add limit.md
2016-12-15 15:45:23 -08:00
a1e572b460 dev-guide: add limit.md 2016-12-15 15:44:21 -08:00
5aeee917a7 Merge pull request #7006 from heyitsanthony/clusterid-split
Documentation: FAQ entry for cluster ID mismatches
2016-12-15 12:43:17 -08:00
14c851c863 Documentation: FAQ entry for cluster ID mismatches 2016-12-15 11:27:24 -08:00
86a43849fb Merge pull request #7010 from dennwc/keepalive-exit-err
clientv3: ensure KeepAlive channel is closed or error is returned
2016-12-15 08:06:36 -08:00
35fd5dc9fc Merge pull request #6903 from mitake/auth-member
protect membership change RPCs with auth
2016-12-15 08:04:31 -08:00
b126e31132 clientv3: better error message for keep alive loop halt 2016-12-15 16:06:27 +02:00
d46b753186 e2e: test cases of protecting membership change with auth 2016-12-15 22:54:20 +09:00
86d7390804 auth, etcdserver: protect membership change operations with auth
This commit protects membership change operations with auth. Only
users that have root role can issue the operations.

Implements https://github.com/coreos/etcd/issues/6899
2016-12-15 22:54:20 +09:00
5183ce0118 clientv3: add test for keep alive loop exit case 2016-12-15 03:02:44 +02:00
e0bcd4d516 clientv3: return error from KeepAlive if corresponding loop exits
After recvKeepAliveLoop exits, a client might call KeepAlive, adding a request channel that will never be closed.
This fix makes sure that recvKeepAliveLoop is running before adding the request to the lessor's list, and returns an error otherwise.

Fixes #6922
2016-12-15 03:02:35 +02:00
d8513adf1d Merge pull request #7007 from heyitsanthony/lease-close
clientv3: close Lease on client Close
2016-12-14 16:06:32 -08:00
26a3e9a740 membership: retry for 30s on advertise url check 2016-12-14 15:56:22 -08:00
29c30b2387 etcdserver: retry for 30s on advertise url check 2016-12-14 15:56:22 -08:00
13b05aeff8 netutil: ctx-ize URLStringsEqual
Handles the case where the DNS entry will only be set up after etcd
starts.
2016-12-14 15:46:30 -08:00
246fb29d8a clientv3: close Lease on client Close
Fixes #6987
2016-12-14 12:11:17 -08:00
a9f72ee0d4 Merge pull request #7005 from heyitsanthony/fix-pprof
embed: deep copy user handlers
2016-12-14 12:05:37 -08:00
8f88632218 Merge pull request #6965 from gyuho/faq
Documentation: add more FAQs (follower, leader, sys-require)
2016-12-14 11:51:34 -08:00
626df4d77c Documentation: add more FAQs (follower, leader, sys-require) 2016-12-14 11:36:07 -08:00
cc931a2319 embed: deep copy user handlers
Shallow copy of user handlers leads to a nil map assignment when
enabling pprof. Since the map is being modified, it should probably
be deep copied into the server context, which fixes the crash.
2016-12-14 10:17:32 -08:00
4ca78aa89f Merge pull request #7004 from fbarbeira/patch-3
op-guide/clustering: fix typo
2016-12-14 09:52:37 -08:00
972ef3c92e op-guide/clustering: fix typo 2016-12-14 18:51:30 +01:00
1e60f88786 Merge pull request #6999 from leonliao/patch-1
Documentation: use port 2379 in local cluster guide
2016-12-14 09:29:20 -08:00
cb9277f339 Documentation: use port 2379 in local cluster guide
The port in endpoints should be 2379, instead of 12379.
2016-12-14 15:09:21 +08:00
cdde0368ad Merge pull request #6997 from gyuho/range
auth: improve 'removeSubsetRangePerms' to O(n)
2016-12-13 16:14:37 -08:00
a53175949e auth: improve 'removeSubsetRangePerms' to O(n) 2016-12-13 15:43:23 -08:00
454f1da2f2 Merge pull request #6996 from xiang90/hardware
doc: add hardware section
2016-12-13 12:53:52 -08:00
e3d8ef4cea doc: add hardware section 2016-12-13 12:42:47 -08:00
1a8e78cd55 Merge pull request #6994 from gyuho/etcd-tester-fix-leak
etcd-tester: cancel lease stream; fix OOM panic
2016-12-13 10:44:54 -08:00
301abddc72 etcd-tester: cancel lease stream; fix OOM panic
It was never closing lease keep-alive streams, leaking memory.
Fix OOM panics in etcd-tester (after 1K rounds).
2016-12-13 09:56:30 -08:00
cc37beff35 Merge pull request #6995 from gyuho/etcd-tester-pprof
etcd-tester: add 'enable-pprof' option
2016-12-13 08:02:47 -08:00
4e831810c9 Merge pull request #6993 from cloudaice/build-bug
build: remove dir use -r flag
2016-12-13 07:57:31 -08:00
7d16e7d27e etcd-tester: add 'enable-pprof' option 2016-12-13 05:03:27 -08:00
b294ab13a4 build: remove dir use -r flag 2016-12-13 16:08:50 +08:00
797d826117 Merge pull request #6979 from heyitsanthony/fields-fmt
etcdctl: "fields" output formats
2016-12-12 15:17:12 -08:00
5f3140987e etcdctl: "fields" output formats
Writes out fields from responses in the format "FieldName" : FieldValue. If
FieldValue is a string, it is formatted with %q.
2016-12-12 13:21:20 -08:00
be740dc436 Merge pull request #6975 from gyuho/gopath
*: fix 'gosimple', 'gounused' checks
2016-12-12 11:53:45 -08:00
5b7582365e Merge pull request #6990 from xiang90/faq_m
doc: add faq about missing heartbeat
2016-12-12 11:38:10 -08:00
468187de31 doc: add faq about missing heartbeat 2016-12-12 11:31:17 -08:00
7e74b3f846 grpcproxy: remove unused field 'wbs *watchBroadcasts' 2016-12-12 10:07:14 -08:00
eb8646a381 v3rpc: remove unused 'splitMethodName' function 2016-12-12 10:07:14 -08:00
3512f114e4 e2e: remove unused 'ctlV3GetFailPerm' 2016-12-12 10:07:14 -08:00
b8e09bf849 tools: simplify boolean comparison, remove unused 2016-12-12 10:07:14 -08:00
0c5d1d5641 raft: simplify boolean comparison, remove unused 2016-12-12 10:07:14 -08:00
f3cb93015c integration: simplify boolean comparison in resp.Created 2016-12-12 10:07:14 -08:00
55307d48ac auth: fix gosimple errors 2016-12-12 10:07:14 -08:00
6ec4b9c26a test: exclude '_home' for gosimple, unused 2016-12-12 10:07:14 -08:00
0a15c1b9c6 Merge pull request #6988 from xiang90/faq_m
doc: add faq about apply warning logging
2016-12-12 10:06:32 -08:00
6969369a32 doc: add faq about apply warning logging 2016-12-12 09:58:42 -08:00
20dca1eb80 Merge pull request #6977 from heyitsanthony/move-prof
etcdserver, embed, v2http: move pprof setup to embed
2016-12-09 13:30:19 -08:00
cf60588b27 Merge pull request #6974 from heyitsanthony/interacting-fixup
Documentation: update get examples to be clearer about ranges
2016-12-09 13:08:40 -08:00
2c06def8ca etcdserver, embed, v2http: move pprof setup to embed
Seems like a better place for prof setup since it's not specific to v2.
2016-12-09 12:37:35 -08:00
cb75c40a8b Merge pull request #6973 from sinsharat/make_contributing_url_based
github: make contribution link non-relative
2016-12-09 12:28:07 -08:00
46e63cc14a Merge pull request #6972 from heyitsanthony/bug-report-link
github: make bug reporting link non-relative
2016-12-09 11:15:29 -08:00
d2a6bbd9c6 Documentation: update get examples to be clearer about ranges
Fixes #6966
2016-12-09 10:54:38 -08:00
01c8b25284 github: make contribution link non-relative 2016-12-10 00:03:47 +05:30
f8b480cd6f github: make bug reporting link non-relative
Works when accessed through code browser, blank if accessed via issues/
2016-12-09 10:18:38 -08:00
1e92b7929c Merge pull request #6967 from heyitsanthony/glide-versions
vendor: use version tags if possible
2016-12-08 16:09:41 -08:00
de58a9c733 scripts: use glide update if repo exists in glide.lock 2016-12-08 14:26:29 -08:00
f095334788 vendor: use versions when possible in glide.yaml
Now using tags instead of SHAs
2016-12-08 14:26:08 -08:00
367f513674 Merge pull request #6961 from heyitsanthony/roadmap
ROADMAP: update for 3.2
2016-12-08 13:30:31 -08:00
b713113094 Merge pull request #6962 from gyuho/mispell
grpcproxy: fix minor typo
2016-12-07 18:55:09 -08:00
fcbfff6a00 Merge pull request #6958 from xiang90/reduce_sync
etcdserver: only send v2 sync if ttl keys exist
2016-12-07 18:38:02 -08:00
a98de7efa7 grpcproxy: fix minor typo 2016-12-07 17:08:46 -08:00
69cc9fdd17 Merge pull request #6956 from gyuho/faq
Documentation: add more FAQ questions
2016-12-07 16:25:47 -08:00
7c0ae91d78 Documentation: add more FAQ questions 2016-12-07 16:25:04 -08:00
09252c4e07 ROADMAP: update for 3.2 2016-12-07 16:12:59 -08:00
2f96a68a20 etcdserver: do not send v2 sync if ttl keys do not exist 2016-12-07 14:48:15 -08:00
da3b71b531 Merge pull request #6929 from heyitsanthony/ctx-lease-renew
etcdserver: use context for Renew
2016-12-07 00:05:14 -08:00
96626d0a23 Merge pull request #6957 from coreos/philips-patch-1
Documentation: add blox and chain as users
2016-12-06 20:23:27 -08:00
1bee237acf Documentation: add blox and chain as users 2016-12-06 20:20:40 -08:00
c4e5081562 Merge pull request #6943 from m1093782566/fix-store-test-comments
store: fix store_test.go comments
2016-12-06 16:54:36 -08:00
529806dba1 Merge pull request #6935 from bdarnell/election-test
raft: Fix election "logs converge" test
2016-12-06 16:45:39 -08:00
be1f36d97c v3rpc, etcdserver, leasehttp: ctxize Renew with request timeout
Would retry a few times before returning a not primary error that
the client should never see. Instead, use proper timeouts and
then return a request timeout error on failure.

Fixes #6922
2016-12-06 14:09:57 -08:00
f6042890b7 integration: use RequireLeader for TestV3LeaseFailover
Giving Renew() the default request timeout causes TestV3LeaseFailover
to miss its timing constraints. Since it only needs to wait until the
leader recognizes the leader is lost, use RequireLeader to cancel the
keepalive stream before the request times out.
2016-12-06 14:09:57 -08:00
fdd89df1eb clientv3/integration: test lease keepalive works following quorum loss 2016-12-06 14:09:57 -08:00
cfd10b4bbf Merge pull request #6949 from xiang90/faq
doc: initial faq
2016-12-06 10:08:09 -08:00
58150937c0 doc: initial faq 2016-12-06 08:48:57 -08:00
1b0ffdaff0 Merge pull request #6945 from sttts/sttts-update-ugorji
Update ugorji
2016-12-06 08:05:13 -08:00
9c364efef6 client: update generated ugorji codec 2016-12-06 07:53:47 +01:00
b21731c022 vendor: update ugorji/go 2016-12-06 07:53:47 +01:00
9603d5e31f store: fix store_test.go comments 2016-12-06 09:31:59 +08:00
994e0d2182 Merge pull request #6950 from gyuho/fix-readstatec-deadlock
etcdserver: time out when readStateC is blocking
2016-12-05 16:37:47 -08:00
cbee2b74a3 Merge pull request #6948 from heyitsanthony/fix-metric-deadlock
grpcproxy: fix deadlock in watchbroadcast
2016-12-05 16:17:26 -08:00
3fd1d951f8 etcdserver: time out when readStateC is blocking
Otherwise, it will block forever when the server is overloaded.

Fix https://github.com/coreos/etcd/issues/6891.
2016-12-05 15:34:46 -08:00
91ff6f30b5 grpcproxy: fix deadlock in watchbroadcast
Calling empty() in watchbroadcast methods was trying to
lock the rwmutex when it was already held.

Fixes #6937
2016-12-05 15:06:44 -08:00
2509e7ad2c Merge pull request #6947 from heyitsanthony/grpc-stat-race
grpcproxy: lock store when getting size
2016-12-05 14:30:00 -08:00
8fefd1f471 Merge pull request #6942 from eiipii/eiipiiVersion2ScalaClient
eiipii/etcdhttpclient library added to documentation on external clients
2016-12-05 14:12:47 -08:00
f62ed3d642 Documentation: link added to libraries-and-tools.md with a new v2 Scala
Client
2016-12-05 22:55:17 +01:00
b9b14b15d6 Merge pull request #6946 from heyitsanthony/fix-e2e-getrole
etcdctl: remove GetUser check before mutable commands
2016-12-05 13:34:52 -08:00
62398954e4 grpcproxy: lock store when getting size
Fixes data race in proxy integration tests.
2016-12-05 13:29:57 -08:00
5559a026d7 etcdctl: remove GetUser check before mutable commands
etcdctl was checking if the user exists before applying mutable calls;
if etcdctl contacts a minority member, the member may not know the user
exists on the cluster yet, causing command failure when it should succeed.

If the user does not exist, it will be picked up once the command goes
through raft.

Fixes #6932
2016-12-05 12:12:06 -08:00
2b6ad93036 Merge pull request #6936 from xiang90/put_rate
benchmark: add rate limit
2016-12-05 12:01:15 -08:00
e62e9ce193 benchmark: add rate limit 2016-12-05 09:54:30 -08:00
40f0193c4c Merge pull request #6938 from bdarnell/ispaused
raft: Export Progress.IsPaused
2016-12-03 21:51:09 -08:00
f60a5d6025 raft: Export Progress.IsPaused
CockroachDB would like to use this method for monitoring.
2016-12-04 13:14:08 +08:00
340ba8353c raft: Fix election "logs converge" test
The "logs converge" case in TestLeaderElectionPreVote was incorrectly
passing because some nodes were not actually using the preVoteConfig.
This test case was more complex than its siblings and it was not
verifying what it wanted to verify, so pull it out into a separate test
where everything can be tested more explicitly.

Fixes #6895
2016-12-03 17:29:15 +08:00
d844440ffb Merge pull request #6930 from xiang90/grpc_metrics
grpcproxy: add cache related metrics
2016-12-02 18:30:49 -08:00
0cb680800e grpcproxy: add cache related metrics 2016-12-02 15:29:42 -08:00
1f954dc9f4 Merge pull request #6926 from xiang90/metrics
grpcproxy: add richer metrics for watch
2016-12-02 14:13:43 -08:00
a686c994cd grpcproxy: add richer metrics for watch 2016-12-02 11:13:30 -08:00
f61b4ae5ad Merge pull request #6921 from heyitsanthony/fix-watch-prevkv-test-leak
integration: cancel Watch when TestV3WatchWithPrevKV exits
2016-12-01 15:25:00 -08:00
76bb33781f integration: cancel Watch when TestV3WatchWithPrevKV exits
Missing ctx cancel was causing goroutine leaks for the proxy tests.
2016-12-01 15:08:18 -08:00
9647012cb1 Merge pull request #6920 from endocode/dongsu/sdnotify-go-systemd
vendor: bump go-systemd to v14 to avoid build error
2016-12-01 10:39:40 -08:00
b9e9c9483b Merge pull request #6885 from fanminshi/refractor_lease_checker
etcd-tester: refactor lease checker
2016-12-01 10:11:15 -08:00
5e351956b9 vendor: bump go-systemd to v14 to avoid build error
Bump go-systemd to v14 (48702e0d, 2016-11-14).
Also adjust caller of daemon.SdNotify() to avoid build error, which can
be seen especially when running "go get github.com/coreos/etcd".
2016-12-01 13:26:46 +01:00
5d60482357 Merge pull request #6911 from m1093782566/fix-get-sorted
store: check sorted order in TestStoreGetSorted
2016-11-30 19:49:55 -08:00
4e52b80590 Merge pull request #6916 from heyitsanthony/fix-coalesce-bcast-race
grpcproxy: fix race between coalesce and bcast on nextrev
2016-11-30 19:49:10 -08:00
5f2b5e8b9d store: check sorted order in TestStoreGetSorted 2016-12-01 10:36:23 +08:00
394ab43587 etcd-tester: refactor lease checker
Move some checking logic from the lease stresser to the lease checker, and change the connection logic for both the lease stresser and checker.
2016-11-30 17:29:58 -08:00
60908c64a6 grpcproxy: fix race between coalesce and bcast on nextrev
coalesce was locking the target coalesce broadcast object but not the source
broadcast object resulting in a data race on the source's nextrev.
2016-11-30 16:50:29 -08:00
98cd3fddc9 Merge pull request #6907 from heyitsanthony/fix-quota-proxy-failfast
integration: use Range to wait for reboot in quota tests
2016-11-30 16:49:54 -08:00
f1e0525c81 integration: use Range to wait for reboot in quota tests
Proxy client layer ignores call options so Put is always FailFast;
this can lead to connection errors when trying to issue the Put
following restarting the client's target server.
2016-11-30 13:56:30 -08:00
7079bf9a75 Merge pull request #6574 from vimalk78/auth-simpletoken-not-removed#6554
auth/simple_token.go : token not removed when etcdctl session closes …
2016-11-30 11:33:23 -08:00
8eec86f7fb Merge pull request #6888 from fanminshi/use_monotonic_time_for_lease
Use monotonic time in lease
2016-11-29 13:39:01 -08:00
e7f4010cca lease: Use monotonic time in lease
lease uses monotonic time to calculate its expiration, so changing the system time won't affect lease expiration.

FIX #6700
2016-11-29 12:31:00 -08:00
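The idea behind the lease change above can be sketched as follows. This is a minimal illustration that assumes Go's standard `time` package (whose `time.Now()` values carry a monotonic reading) rather than whatever timer the commit itself uses; the `lease` type and its methods are hypothetical and are not etcd's lease implementation.

```go
package main

import (
	"fmt"
	"time"
)

// lease is a hypothetical, simplified lease used only for illustration.
type lease struct {
	// expiry derives from time.Now(), so it carries a monotonic clock reading.
	expiry time.Time
}

// newLease records the expiry as "now + ttl".
func newLease(ttl time.Duration) *lease {
	return &lease{expiry: time.Now().Add(ttl)}
}

// remaining subtracts two times that both carry monotonic readings, so a
// wall-clock adjustment between creation and the check does not change it.
func (l *lease) remaining() time.Duration {
	return time.Until(l.expiry)
}

// expired reports whether the lease has no time left.
func (l *lease) expired() bool {
	return l.remaining() <= 0
}

func main() {
	l := newLease(2 * time.Second)
	fmt.Println("expired:", l.expired())
}
```

Because the comparison uses monotonic readings only, setting the system clock forward or backward while the lease is alive should not shorten or extend it.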
cac30beed5 Merge pull request #6906 from heyitsanthony/fix-watchclose-race
grpcproxy: fix race between watch ranges delete() and broadcasts empty()
2016-11-28 16:26:03 -08:00
d680b8b5fb grpcproxy: fix race between watch ranges delete() and broadcasts empty()
Checking empty() wasn't grabbing the broadcasts lock so the race detector
flags it as a data race with coalesce(). Instead, just return the number
of remaining watches following delete() and get rid of empty().
2016-11-28 15:53:41 -08:00
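The fix described above (return the remaining count from the delete itself instead of asking a separate `empty()` afterwards) can be sketched as follows. The `watchSet` type is a hypothetical stand-in, not the grpcproxy data structure; it only shows why reading the count under the same lock avoids both the race and a second lock acquisition.

```go
package main

import (
	"fmt"
	"sync"
)

// watchSet is a hypothetical container of watcher IDs guarded by one mutex.
type watchSet struct {
	mu       sync.Mutex
	watchers map[int]struct{}
}

// add registers a watcher ID.
func (s *watchSet) add(id int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.watchers[id] = struct{}{}
}

// remove deletes a watcher and returns how many remain. The count is read
// while the lock is still held, so callers never need a separate empty()
// call that would either race or try to re-acquire the lock.
func (s *watchSet) remove(id int) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.watchers, id)
	return len(s.watchers)
}

func main() {
	s := &watchSet{watchers: make(map[int]struct{})}
	s.add(1)
	s.add(2)
	fmt.Println(s.remove(1)) // one watcher left
	fmt.Println(s.remove(2)) // none left; caller can tear down the set
}
```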
a076510cc1 Merge pull request #6905 from heyitsanthony/client-readme
client: update README about health monitoring
2016-11-28 13:10:57 -08:00
8aa03a5959 Merge pull request #6884 from gyuho/tls
etcdmain: handle TLS in grpc-proxy listener
2016-11-28 12:28:56 -08:00
ad16b63cce client: update README about health monitoring 2016-11-28 12:28:33 -08:00
dfe853ebff auth: add a timeout mechanism to simple token 2016-11-28 17:21:13 +05:30
c31b1ab8d1 Merge pull request #6896 from gyuho/endpoints
clientv3: return copy of endpoints, not pointer
2016-11-23 11:51:48 -08:00
a08103c088 clientv3: return copy of endpoints, not pointer
Fix https://github.com/coreos/etcd/issues/6892.
2016-11-23 11:33:54 -08:00
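A minimal sketch of the "return a copy" pattern named in the commit subject above. The `client` type here is hypothetical and much smaller than the real clientv3 client; it only shows why handing out the internal slice lets callers mutate client state.

```go
package main

import "fmt"

// client is a hypothetical type holding a list of endpoints.
type client struct {
	endpoints []string
}

// Endpoints returns a copy, so callers cannot modify the client's own slice.
func (c *client) Endpoints() []string {
	eps := make([]string, len(c.endpoints))
	copy(eps, c.endpoints)
	return eps
}

func main() {
	c := &client{endpoints: []string{"10.0.0.1:2379", "10.0.0.2:2379"}}
	eps := c.Endpoints()
	eps[0] = "changed-by-caller"
	// The client's view is unaffected by the caller's write.
	fmt.Println(c.Endpoints()[0])
}
```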
aea9c6668f Merge pull request #6890 from gyuho/doc
Documentation/op-guide: add notes about 'datasource' in Prometheus
2016-11-22 10:43:30 -08:00
ede51b10f8 op-guide: add notes about Prometheus data source in Grafana 2016-11-22 10:34:41 -08:00
ec5f9bce63 Merge pull request #6886 from fanminshi/fix_dial_grpc
functional-tester: add withBlock() to grpc dial
2016-11-21 11:33:31 -08:00
f7c721b746 Merge pull request #6867 from fanminshi/fix_checking_timeout
etcd-tester: limit max retry backoff delay
2016-11-21 11:20:32 -08:00
2ccba33dd1 functional-tester: add withBlock() to grpc dial
grpc dial withTimeout() only works if the withBlock() option is present.
2016-11-21 11:15:12 -08:00
2ac1c4c9ed etcd-tester: limit max retry backoff delay
grpc uses exponential retry if a connection is lost, sleeping based on an exponential delay.
If the delay is too large, it slows down the tester.
2016-11-21 10:58:55 -08:00
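Capping an exponential backoff, as the commit above describes, can be sketched like this. The function name, base delay, and cap are assumptions chosen for illustration; the commit does not state the values the tester uses, and this is not grpc's own backoff code.

```go
package main

import (
	"fmt"
	"time"
)

// backoff doubles the base delay once per attempt and never exceeds max.
// The early return also keeps the doubling from overflowing.
func backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		if d >= max/2 {
			return max
		}
		d *= 2
	}
	if d > max {
		return max
	}
	return d
}

func main() {
	for attempt := 0; attempt < 8; attempt++ {
		fmt.Println(attempt, backoff(attempt, 100*time.Millisecond, 5*time.Second))
	}
}
```

With a cap in place, a long outage costs at most `max` per retry instead of an ever-growing sleep.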
ff96769b55 etcdmain: handle TLS in grpc-proxy listener 2016-11-21 10:39:34 -08:00
0326d6fdd3 Merge pull request #6877 from coreos/fix_test
etcd-tester: do not resolve localhost
2016-11-21 09:52:31 -08:00
69470b5e5f Merge pull request #6878 from absolute8511/fix-raftexample-test
raftexample: confState should be saved after apply
2016-11-21 09:51:10 -08:00
d7c98a4695 Merge pull request #6879 from xiang90/raft_test
raft: fix TestNodeProposeAddDuplicateNode
2016-11-20 22:19:44 -08:00
f2eb8560ed raft: fix TestNodeProposeAddDuplicateNode
Only send the signal after applying the conf change.
Otherwise, a deadlock might happen if the raft node receives
a ready without the conf change when the test server
is slow.
2016-11-20 21:59:31 -08:00
859142033f Merge pull request #6866 from absolute8511/master
raft: add node should reset the pendingConf state
2016-11-20 21:34:37 -08:00
e6d1ebcc1d raft: use the channel instead of sleep to make test case reliable 2016-11-21 13:30:15 +08:00
bc6f5ad53e raft: fix test case for data race 2016-11-21 10:30:36 +08:00
62bd5477b9 raft: fix test case, should wait config propose applied 2016-11-21 10:10:34 +08:00
16e3ab0f11 raft: test case to check the duplicate add node propose 2016-11-20 16:58:11 +08:00
e8d06d8e4d raftexample: confState should be saved after apply 2016-11-20 16:51:33 +08:00
b1178469be etcd-tester: do not resolve localhost 2016-11-19 18:38:26 -08:00
7e7c7e157e Merge pull request #6873 from heyitsanthony/proxy-v3-watch-canceled-sync
grpcproxy: fix deadlock on watch broadcasts stop
2016-11-18 22:34:35 -08:00
bb4884e957 Merge pull request #6861 from gyuho/grpc-proxy-metrics
etcdmain: add '/metrics' HTTP/1 path to grpc-proxy
2016-11-18 20:03:52 -08:00
a39509ee5b etcdmain: add '/metrics' HTTP/1 path to grpc-proxy 2016-11-18 19:40:06 -08:00
7618fdd1d6 grpcproxy: fix deadlock on watch broadcasts stop
Holding the WatchBroadcasts lock and waiting on donec was
causing a deadlock with the coalesce loop. Was causing
TestV3WatchSyncCancel to hang.
2016-11-18 16:55:26 -08:00
2acf0806fb Merge pull request #6869 from sinsharat/mvcc_remove_unused_restore_method
mvcc: remove unused restore method
2016-11-18 15:52:45 -08:00
c1581732fd Merge pull request #6872 from heyitsanthony/srv-alert
discovery: warn on scheme mismatch
2016-11-18 13:41:34 -08:00
428cb21a3f Merge pull request #6864 from heyitsanthony/watch-doc
Documentation: add grpc gateway watch example
2016-11-18 13:30:16 -08:00
74ae67b835 discovery: warn on scheme mismatch 2016-11-18 13:12:14 -08:00
b7cc698444 version: bump up v3.1.0-rc.1+git 2016-11-18 11:41:29 -08:00
ccf154e706 Documentation: add grpc gateway watch example
Shows how to use watch via grpc gateway.
2016-11-18 11:37:35 -08:00
6d9168a2ec integration: don't expect recv to stop on CloseSend in waitResponse 2016-11-18 11:37:35 -08:00
3d5ba43211 version: bump up v3.1.0-rc.1 2016-11-18 11:16:01 -08:00
7da3019f42 Merge pull request #6862 from gyuho/network-interface
pkg/netutil: get default interface for tc commands
2016-11-18 10:11:59 -08:00
43078d3ced mvcc: remove unused restore method 2016-11-18 23:04:39 +05:30
097cdbd0e4 pkg/netutil: get default interface for tc commands
Fix https://github.com/coreos/etcd/issues/6841.
2016-11-17 22:49:17 -08:00
68b04b7067 Merge pull request #6846 from sinsharat/mvcc_store_restore_timeout_fix
mvcc: store.restore taking too long triggering snapshot cycle fix
2016-11-17 22:06:43 -08:00
456569f45d e2e: add test for v3 watch over grpc gateway 2016-11-17 15:49:58 -08:00
9a20743190 v3rpc: don't close watcher if client closes send
grpc-gateway will CloseSend but still want to receive updates.
2016-11-17 15:33:37 -08:00
4401d88546 raft: add node should reset the pendingConf state
After an add-node conf change is proposed twice with the same node ID, the pending state is
not reset: addNode returns the second time without resetting the pending state,
so it stays true until some other conf change is applied. During this time we
cannot add any new node, because the proposal will be ignored while the
pending state is true.
2016-11-17 15:50:13 +08:00
aa2b5aec1b mvcc: Added benchmark for store.restore 2016-11-17 04:01:15 +05:30
f014cca644 mvcc: TestStoreRestore fix 2016-11-16 16:58:42 +05:30
95fb41a923 mvcc: store.restore taking too long triggering snapshot cycle fix 2016-11-16 16:31:20 +05:30
377f19b003 Merge pull request #6857 from LK4D4/non_block_status
raft: return empty status if node is stopped
2016-11-15 16:44:50 -08:00
7afc490c95 raft: return empty status if node is stopped
If the node is stopped, then Status can hang forever because there is no
event loop to answer. So, just return empty status to avoid deadlocks.

Fix #6855

Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-11-15 15:45:23 -08:00
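The "return empty status if the event loop is gone" behaviour described above can be sketched with a `select` over a request channel and a done channel. All types and field names below are hypothetical; this is not the raft package's node implementation.

```go
package main

import "fmt"

// Status is a hypothetical, tiny status snapshot.
type Status struct{ Term uint64 }

// node answers status requests from a single event loop goroutine.
type node struct {
	term    uint64
	statusc chan chan Status
	stopc   chan struct{}
	done    chan struct{} // closed by run() when the loop has exited
}

func startNode(term uint64) *node {
	n := &node{
		term:    term,
		statusc: make(chan chan Status),
		stopc:   make(chan struct{}),
		done:    make(chan struct{}),
	}
	go n.run()
	return n
}

func (n *node) run() {
	defer close(n.done)
	for {
		select {
		case c := <-n.statusc:
			c <- Status{Term: n.term}
		case <-n.stopc:
			return
		}
	}
}

// Stop asks the loop to exit and waits until it has.
func (n *node) Stop() {
	close(n.stopc)
	<-n.done
}

// Status never blocks forever: if the loop has exited, it returns an empty Status.
func (n *node) Status() Status {
	c := make(chan Status, 1)
	select {
	case n.statusc <- c:
		return <-c
	case <-n.done:
		return Status{}
	}
}

func main() {
	n := startNode(7)
	fmt.Println(n.Status().Term)
	n.Stop()
	fmt.Println(n.Status().Term)
}
```

Without the `done` case, the second `Status` call would block on `statusc` forever because nothing is left to receive from it.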
e55b8485dd Merge pull request #6856 from heyitsanthony/proxy-lease-fix
grpcproxy: copy range request before storing in cache
2016-11-15 15:43:00 -08:00
1358a9d460 grpcproxy: copy range request before storing in cache
Reused Range requests would have Serialized overwritten with 'true'.

Was failing on TestV3LeaseSwitch.
2016-11-15 14:35:00 -08:00
7c8f13aed7 Merge pull request #6852 from heyitsanthony/fix-proxy-dbarrier
grpcproxy: watch next revision should be start revision when not 0
2016-11-15 09:21:18 -08:00
98a7c642d4 grpcproxy: watch next revision should be start revision when not 0
The create header revision is the current etcd revision. For watches with
rev=0, the next revision is hdr.rev+1. For watches with rev=n, the next
revision should be n.

Fixes TestDoubleBarrier timeouts.
2016-11-14 16:46:02 -08:00
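The rule stated in the commit body above fits in a few lines. The function and parameter names are hypothetical; only the two cases (rev=0 and rev=n) come from the commit message.

```go
package main

import "fmt"

// nextRevision returns the revision a watch should resume from.
// startRev is the revision the watch was created with (0 means "from now");
// headerRev is the etcd revision reported in the create response header.
func nextRevision(startRev, headerRev int64) int64 {
	if startRev == 0 {
		// Watch from "now": the first event can only be after the header revision.
		return headerRev + 1
	}
	// Watch from an explicit revision: events at that revision still count.
	return startRev
}

func main() {
	fmt.Println(nextRevision(0, 10))
	fmt.Println(nextRevision(5, 10))
}
```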
677606da7d Merge pull request #6851 from gyuho/metrics
v3rpc: replace grpc metrics w/ go-grpc-prometheus
2016-11-14 16:09:17 -08:00
7cac755df2 op-guide: update gRPC requests metrics 2016-11-14 15:20:16 -08:00
5e810e30cc v3rpc: replace grpc metrics w/ go-grpc-prometheus
And disable histogram
2016-11-14 15:20:09 -08:00
d073512def Merge pull request #6849 from heyitsanthony/proxy-fix-watch-create
grpcproxy: don't send extra watch create events
2016-11-14 12:13:40 -08:00
90ea3fbadc grpcproxy: do not resend create event after leader loss
Only set CreateNotify if no watch responses have been received.
2016-11-14 10:43:06 -08:00
e40da39143 grpcproxy: only coalesce watchers that have received create response
Current watchers may have nextrev=0; check response count instead.
2016-11-14 09:19:02 -08:00
a2e86c1371 Merge pull request #6842 from heyitsanthony/watch-prevkv
grpcproxy: support prevKV watcher
2016-11-11 16:32:51 -08:00
45bba11f12 Merge pull request #6844 from gyuho/grafana
op-guide: add screenshot to sample Grafana dashboard
2016-11-11 16:30:02 -08:00
625366875d op-guide: add screenshot to sample Grafana dashboard 2016-11-11 16:21:15 -08:00
70fd684843 Merge pull request #6843 from gyuho/docs
Documentation/op-guide: add 'monitoring' guide
2016-11-11 16:08:32 -08:00
6d83590434 Documentation/op-guide: add 'monitoring' guide 2016-11-11 15:22:07 -08:00
6604306398 grpcproxy: support prevKV watcher
Makes all server watchers PrevKV, discards if client watcher is not PrevKV.
2016-11-11 14:22:06 -08:00
3c97e7a475 Merge pull request #6800 from sinsharat/add_benchmark_watch_latency
benchmark: added watch-latency
2016-11-11 12:41:26 -08:00
e5b6324771 benchmark: added watch-latency 2016-11-12 01:08:35 +05:30
b4726a9501 Merge pull request #6822 from heyitsanthony/watch-bcast
grpcproxy: rework watcher organization
2016-11-11 10:50:27 -08:00
5af4de0930 Merge pull request #6840 from gyuho/vendor
*: clean up vendor
2016-11-11 10:15:18 -08:00
395cf7de51 grpcproxy: reject invalid watch ranges 2016-11-11 10:14:35 -08:00
ec459c2185 grpcproxy: rework watcher organization
The single watcher / group watcher distinction limited and
complicated watcher coalescing more than necessary. Reworked:

Each server watcher is represented by a WatchBroadcast, each
client "Watcher" attaches to some WatchBroadcast. WatchBroadcasts
hold all WatchBroadcast instances for a range. WatchRanges holds
all WatchBroadcasts for the proxy.

WatchProxyStreams represent a grpc watch stream between the proxy and
a client. When a client requests a new watcher through its grpc stream,
the ProxyStream will allocate a Watcher and WatchRanges assigns it to
some WatchBroadcast based on its range.

Coalescing is done by WatchBroadcasts when it receives an update
notification from a WatchBroadcast.

Supports leader failure detection so watches on a bad member
can migrate to other members. Coincidentally, fixes #6303.
2016-11-11 10:14:35 -08:00
4d5a12a248 Merge pull request #6839 from xiang90/ctl_v
etcdctl: etcdctl v3 should print out its API version
2016-11-11 10:00:14 -08:00
4b417da1be Merge pull request #6837 from purpleidea/feat/consturls
embed: Make immutable defaults constant
2016-11-11 09:53:55 -08:00
38ce362629 vendor: clean up, remove unnecessary deps 2016-11-11 09:51:07 -08:00
1f7e88d851 glide: rerun updatedep.sh to clean up 2016-11-11 09:50:46 -08:00
1ef243e436 etcdctl: etcdctl v3 should print out its API version 2016-11-11 09:33:20 -08:00
745cd730a7 embed: Make immutable defaults constant
This changes the two immutable defaults into constants which allows
packages embedding etcd to import them as const! If they are variables,
then you'll fail with "const initializer foo is not a constant".
2016-11-11 07:34:45 -05:00
952eb4fade Merge pull request #6833 from gyuho/news
NEWS: update with v3.0.15
2016-11-10 13:34:16 -08:00
7c0035637d NEWS: update with v3.0.15 2016-11-10 13:29:07 -08:00
ef024049df Merge pull request #6832 from gyuho/vendor
*: update all grpc-related dependencies
2016-11-10 13:28:07 -08:00
b8b72f80f9 *: revendor, update proto files 2016-11-10 12:02:00 -08:00
baa4e4ee56 scripts/genproto: update gogo/protobuf, grpc-gateway 2016-11-10 12:02:00 -08:00
8631f47568 glide: update all grpc-related dependencies 2016-11-10 12:01:45 -08:00
0a8e28524b Merge pull request #6779 from xiang90/watch_clean
etcd-runner: clean up watcher runner
2016-11-10 09:59:08 -08:00
0b78ef8de1 Merge pull request #6831 from xiang90/grpc_proxy_doc
doc: add gRPC proxy start doc
2016-11-10 09:34:38 -08:00
b16c93a885 doc: add gRPC proxy start doc 2016-11-10 09:20:13 -08:00
523a859ad9 etcd-runner: clean up watcher runner 2016-11-10 08:56:19 -08:00
1a25b2ff3e Merge pull request #6781 from gyuho/vendor
scripts/updatedep: work around 'testify/assert', remove 'etcd-top'
2016-11-09 16:17:27 -08:00
55d25f6f4d tools: remove 'etcd-top'
Travis CI breaks because of cgo dependencies on 'etcd-top'.
It can live outside of the project.
2016-11-09 15:59:47 -08:00
5b8300f08b store: type-assert int64 for assert tests 2016-11-09 15:59:47 -08:00
859ac6dfd8 vendor: rerun 'updatedep.sh' script, clean up 2016-11-09 15:59:47 -08:00
0f68810505 glide: remove legacy packages from godep
And remove all legacy packages in glide.yaml on sub-dependency.
They were added when we migrated from godep. glide will handle
it automatically with glide.lock file.
2016-11-09 15:59:47 -08:00
4cf5b76d18 scripts/updatedep: work around 'testify/assert'
'glide vc --no-tests' flag removes 'testify/assert' deps
in v2 client. Until we deprecate v2 tests, just copy the
necessary files as workaround.

And remove '--skip-tests' flags in case we add dependencies
in test files.
2016-11-09 15:59:34 -08:00
ab6b175a2a Merge pull request #6828 from fanminshi/add_not_equal_to_compare
etcdserver, clientv3: add "!=" to txn
2016-11-09 15:27:08 -08:00
c2fd42b556 etcdserver, clientv3: add "!=" to txn
Adding != to compare is functionality requested by an etcd user.

FIX #6719
2016-11-09 14:28:36 -08:00
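What a "!=" comparison adds to a transaction guard can be sketched as below. The enum and function are hypothetical and deliberately simplified; they do not reproduce the etcdserver or clientv3 compare types.

```go
package main

import "fmt"

// compareOp is a hypothetical comparison operator for a txn guard.
type compareOp int

const (
	opEqual compareOp = iota
	opGreater
	opLess
	opNotEqual // the newly supported "!=" case
)

// compareInt evaluates "actual <op> expected".
func compareInt(actual, expected int64, op compareOp) bool {
	switch op {
	case opEqual:
		return actual == expected
	case opGreater:
		return actual > expected
	case opLess:
		return actual < expected
	case opNotEqual:
		return actual != expected
	}
	return false
}

func main() {
	// Guard succeeds when the stored version differs from 1.
	fmt.Println(compareInt(2, 1, opNotEqual))
	fmt.Println(compareInt(1, 1, opNotEqual))
}
```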
4a1e89150b Merge pull request #6827 from heyitsanthony/proxy-txn-invalidate
grpcproxy: update cache based on txn response
2016-11-09 13:16:48 -08:00
3ed63af51a Merge pull request #6826 from feisan/dev-kvkv
clientv3/naming: support OpOption when adding an endpoint
2016-11-09 13:03:56 -08:00
a4dcceb8aa grpcproxy: update cache based on txn response
Fixes more hangs in TestSTMConflict.
2016-11-09 12:11:38 -08:00
c20d31adc5 clientv3/naming: support OpOption when adding an endpoint
If we want to add an endpoint with a lease, we need this option.
For example:

    resp, err := cli.Grant(context.TODO(), 5)
    if err != nil {
        log.Fatal(err)
    }

    err = r.Update(context.TODO(), serviceName, naming.Update{Op:naming.Add, Addr: exposedAddr}, clientv3.WithLease(resp.ID))
    if err != nil {
        log.Fatal(err)
    }
2016-11-09 15:30:17 -04:00
9c7a0a68e5 Merge pull request #6825 from gyuho/new
etcdserver: increase maxGapBetweenApplyAndCommitIndex
2016-11-09 10:32:51 -08:00
c817df1d32 etcdserver: increase maxGapBetweenApplyAndCommitIndex
This exists to prevent sending too many requests that
would lead to the applier falling behind the proposals Raft has accepted.

Based on recent benchmarks, etcd was able to process high workloads
(2 million writes with 1K concurrent clients).

The limit 1000 is too conservative to test those high workloads.
2016-11-09 09:44:11 -08:00
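A gap limit of this kind can be sketched as a simple comparison between the commit index and the applied index. The function name and the exact comparison are assumptions for illustration; the commit message only says that the previous limit was 1000 and does not state the new value.

```go
package main

import "fmt"

// tooFarBehind reports whether new requests should be rejected because the
// applier lags the commit index by more than maxGap entries.
func tooFarBehind(committed, applied, maxGap uint64) bool {
	return committed > applied+maxGap
}

func main() {
	const maxGap = 1000 // the old, conservative limit mentioned in the commit
	fmt.Println(tooFarBehind(1500, 100, maxGap))
	fmt.Println(tooFarBehind(1100, 100, maxGap))
}
```

Raising `maxGap` lets heavier write bursts through before the server starts pushing back.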
dbb692e50f Merge pull request #6820 from gyuho/watcher
mvcc: return -1 for wrong watcher range key >= end
2016-11-08 17:36:19 -08:00
0f5d9f00ad Merge pull request #6808 from fanminshi/functional-tester-compaction-deadline-fix
etcd-tester: increase compaction timeout limit
2016-11-08 17:18:40 -08:00
9dd75a946f clientv3, ctlv3: document range end requirement 2016-11-08 17:02:32 -08:00
396a71ee9e integration: test wrong watcher range 2016-11-08 17:02:32 -08:00
425acb28c4 mvcc: return -1 for wrong watcher range key >= end
Fix https://github.com/coreos/etcd/issues/6819.
2016-11-08 17:02:28 -08:00
107d7b663c etcd-tester: changed compaction timeout calculation
The functional tester sometimes times out during the compaction phase. The timeout is now calculated based on the number of entries created and deleted.

FIX #6805
2016-11-08 17:00:04 -08:00
a93d8dfe62 Merge pull request #6821 from gyuho/manual
*: fix minor typos, styles
2016-11-08 15:39:26 -08:00
510676fea9 Merge pull request #6816 from heyitsanthony/fix-disconn-cancel
clientv3: let watchers cancel when reconnecting
2016-11-08 13:27:50 -08:00
2955d58776 clientv3/integration: fix minor typos, consistent formatting 2016-11-08 12:37:33 -08:00
754daf918b clustering.md: update minor grammar 2016-11-08 12:34:43 -08:00
1a969ffc52 Merge pull request #6812 from feisan/dev-kvkv
Documentation: fixed typo
2016-11-08 12:33:06 -08:00
1aeeb38459 clientv3: let watchers cancel when reconnecting 2016-11-08 12:02:17 -08:00
7b5e5eadb1 integration: test canceling a watcher on disconnected stream 2016-11-08 12:02:17 -08:00
b2f8d8c397 Merge pull request #6817 from heyitsanthony/build-tag-etcdtop
etcd-top: make build require -tags pcap
2016-11-08 08:12:22 -08:00
57fb2a2b35 vendor: unvendor gopcap so travis CI works 2016-11-07 16:17:52 -08:00
2af31f99c3 etcd-top: make build require -tags pcap
Fixes travis.
2016-11-07 15:54:40 -08:00
97ac128fef Documentation: fixed typo 2016-11-07 19:14:34 -04:00
c9cc1efb67 Merge pull request #6815 from bdarnell/transfer-non-member
raft: Check promotable() in MsgTimeoutNow handling
2016-11-07 10:33:57 -08:00
2f34547d39 raft: Check promotable() in MsgTimeoutNow handling
If MsgTimeoutNow arrived after a node was removed, the node could start
and win an election, then panic in becomeLeader (see
cockroachdb/cockroach#8535)
2016-11-07 20:02:21 +08:00
ecd4803ccc Merge pull request #6809 from hongchaodeng/doc
readme: add 'run etcd on k8s' section
2016-11-04 22:59:42 -07:00
011c452b65 readme: add 'run etcd on k8s' section 2016-11-04 20:28:14 -07:00
352d4fa3fa Merge pull request #6804 from heyitsanthony/stm-conflict
grpcproxy: invalidate comparison keys after txn
2016-11-04 17:04:16 -07:00
476ff67047 Merge pull request #6807 from xiang90/fix_raft_test
rafttest: make raft test reliable
2016-11-04 16:31:32 -07:00
e5987dea37 rafttest: make raft test reliable 2016-11-04 15:55:17 -07:00
91360e1495 Merge pull request #6806 from gyuho/metrics
v3rpc: add 'active' gRPC streamsGauge
2016-11-04 12:28:55 -07:00
67082e5bd1 v3rpc: add gRPC active streamsGauge 2016-11-04 11:09:20 -07:00
bf08a6142c grpcproxy: invalidate comparison keys after txn
If the txn comparison block makes claims about a key's current
state, then it may say a key has been updated. Future range/txn
operations may expect this update to eventually be propagated through
the cluster and show up in serialized requests. To avoid spinning
forever on txn/serialized range loops, invalidate the comparison keys.
2016-11-04 09:46:43 -07:00
27459425fa Merge pull request #6799 from gyuho/log-output
etcdmain: configurable 'etcd' binary log-output
2016-11-03 14:52:53 -07:00
a0d206c51f Merge pull request #6801 from xiang90/fix_snap_test
etcdserver: make snaptest fail fast
2016-11-03 14:49:42 -07:00
6a0a0a7ea1 etcdserver: make snaptest fail fast 2016-11-03 14:44:08 -07:00
6ffd7e3ed1 etcdmain: configurable 'etcd' binary log-output
Fix https://github.com/coreos/etcd/issues/5449.
2016-11-03 14:18:12 -07:00
aa526cd53d Merge pull request #6798 from gyuho/ctl-doc
etcdctl/ctlv3: clarify 'user add' argument (user:password)
2016-11-03 11:06:40 -07:00
31a6efbc13 etcdctl/ctlv3: clarify 'user add' argument (user:password) 2016-11-03 10:47:45 -07:00
f82aac2fc6 Merge pull request #6797 from fanminshi/lease_checker_println_fix
etcd-tester: fix lease checker logging format.
2016-11-03 10:17:54 -07:00
b7ab5c6384 Merge pull request #6788 from fanminshi/lease_http_eof_fix
etcd-tester: add retry logic on retrieving lease info
2016-11-03 10:14:36 -07:00
6968028020 etcd-tester: fix lease checker logging format.
The lease checker used the wrong print format for a variable. This change fixes it.
2016-11-03 10:11:00 -07:00
649fe7f2af etcd-tester: add retry logic on retrieving lease info
Getting lease and key info through raw RPCs occasionally fails with errors such as EOF. This is treated as a failure and causes the tester to clean up.
However, these are transient problems caused by temporary connection issues and should not be considered a testing failure, so we add retry logic for transient failures.

FIX #6754
2016-11-03 10:05:06 -07:00
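Retrying on transient errors, as described above, can be sketched with a small helper. The helper name, the attempt count, and the `errTransient` value are hypothetical; the commit does not say how many retries the tester performs or how it classifies errors.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a hypothetical stand-in for errors such as EOF on a raw RPC.
var errTransient = errors.New("transient failure")

// retry calls fn up to attempts times, sleeping delay between tries,
// and returns the last error if every attempt fails.
func retry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

func main() {
	calls := 0
	err := retry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient // fails twice, then succeeds
		}
		return nil
	})
	fmt.Println(calls, err)
}
```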
b28b38fb6d Merge pull request #6793 from timothysc/no-ttl
Add a no-ttl flag to etcdctl migrate to discard keys on transform.
2016-11-03 09:00:53 -07:00
c5ac02164d Merge pull request #6794 from xiang90/fix_migration
ctlv3: fix migration
2016-11-03 08:43:30 -07:00
97e96feb1d ctlv3: Add a no-ttl flag to etcdctl migrate to discard keys on transform. 2016-11-03 10:41:54 -05:00
4d2ec2fec1 Merge pull request #6792 from gyuho/leasehttp
leasehttp: use graceful close, add tests, remove TODO
2016-11-02 22:55:06 -07:00
bbc1cdafef Merge pull request #6791 from gyuho/grpc-leader
etcdserver: translate EOF to ErrNoLeader for renew, timetolive
2016-11-02 22:54:46 -07:00
cc304ac03c etcdserver: translate EOF to ErrNoLeader for renew, timetolive
Address https://github.com/coreos/etcd/issues/6754.

In case there are network errors or unexpected EOF errors
in TimeToLive http requests to leader, we translate that into
ErrNoLeader, and expect the client to retry its request.
2016-11-02 22:22:05 -07:00
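The error translation described above can be sketched as a small wrapper. `errNoLeader`, `translateErr`, and the fake request functions are hypothetical names for illustration; the real etcdserver error value and call sites differ.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errNoLeader is a hypothetical stand-in for the server's "no leader" error.
var errNoLeader = errors.New("no leader")

// translateErr maps EOF-style failures of a forwarded request to errNoLeader,
// which clients are expected to handle by retrying.
func translateErr(err error) error {
	if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
		return errNoLeader
	}
	return err
}

// flakyRenew simulates a forwarded request whose connection dropped.
func flakyRenew() error { return io.ErrUnexpectedEOF }

// okRenew simulates a successful forwarded request.
func okRenew() error { return nil }

// renew runs the request and applies the translation.
func renew(call func() error) error {
	return translateErr(call())
}

func main() {
	fmt.Println(renew(flakyRenew))
	fmt.Println(renew(okRenew))
}
```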
2fb2b463a3 Merge pull request #6786 from mitake/empty-user
auth, etcdserver: forbid adding a user with empty name
2016-11-02 22:10:58 -07:00
f85701a46f auth, etcdserver: forbid adding a user with empty name 2016-11-03 13:45:39 +09:00
2ba42990ec ctlv3: fix migration 2016-11-02 20:00:07 -07:00
c931f4d164 leasehttp: use graceful close, add tests, remove TODO 2016-11-02 16:33:26 -07:00
378257161f Merge pull request #6789 from heyitsanthony/grpcproxy-close-send
grpcproxy: reliably track rid in watchergroups
2016-11-02 15:29:08 -07:00
fe755b6250 Merge pull request #6748 from sinsharat/client_metric_add_tests
clientv3: added test for client metrics
2016-11-02 15:00:08 -07:00
844378f0a7 Merge pull request #6790 from sinsharat/clientv3_metrics_doc_update
clientv3: updated doc for metric support
2016-11-02 14:56:04 -07:00
195570b621 clientv3: updated doc for metric support 2016-11-03 03:22:59 +05:30
8ec4215279 grpcproxy: reliably track rid in watchergroups
Couldn't find watcher group from rid on server stream close, leading to
the watcher group sending on a closed channel.

Also got rid of send closing the watcher stream if the buffer is full,
this could lead to a send after close while broadcasting to all receivers.
Instead, if a send times out then the server stream is canceled.

Fixes #6739
2016-11-02 14:42:02 -07:00
13acad85b3 clientv3: added test for client metrics 2016-11-03 00:38:29 +05:30
7d777a4a64 Merge pull request #6784 from xiang90/lock_warning
etcdserver: print out warning when waiting for file lock
2016-11-02 10:34:52 -07:00
5b7728f3cb Merge pull request #5994 from gyuho/v2-error
etcdctl/ctlv2: error handling with JSON
2016-11-01 21:51:16 -07:00
9b470ef4c0 etcdctl/ctlv2: error handling with JSON 2016-11-01 20:59:13 -07:00
c33d04fb54 etcdserver: print out warning when waiting for file lock 2016-11-01 17:55:16 -07:00
71bad561e8 Merge pull request #6782 from xiang90/v2store
store: do not modify key during scanning
2016-11-01 15:41:03 -07:00
43045500b2 store: do not modify key during scanning 2016-11-01 14:35:53 -07:00
71a533fec3 Merge pull request #6771 from fanminshi/refactor_short_lived_lease_logic
etcd-tester: refactor checking short lived lease logic
2016-11-01 14:17:43 -07:00
4f60f1b71f Merge pull request #6708 from bluepeppers/leader-sync-deadlock
client: Prevent deadlocks in Sync
2016-11-01 14:11:21 -07:00
8a03c95dd4 etcd-tester: refactor checking short lived lease logic
Move the logic for waiting on lease expiry from the stresser to the checker.
2016-11-01 14:06:22 -07:00
b30dc10812 Merge pull request #6770 from heyitsanthony/fix-grpc
grpcproxy: add SetHeader support to ServerStream
2016-11-01 13:55:07 -07:00
7ef17d3e97 grpcproxy: add SetHeader support to ServerStream
Fixes #6726
2016-11-01 13:28:02 -07:00
4575353693 Merge pull request #6768 from gyuho/wtwt
clientv3/integration: close active connection to get ErrClientConnClosing
2016-11-01 12:37:04 -07:00
0684d8c4c6 clientv3/integration: close active connection to get ErrClientConnClosing
because clientv3.Close won't trigger it any more

clientv3.Close just closes watch client
instead of closing grpc connection
2016-11-01 11:13:33 -07:00
94c804b81a Merge pull request #6766 from fanminshi/stabilization-logic-refractoring
functional-tester: remove stabilization limit
2016-11-01 10:43:00 -07:00
de008c8a4a client: prevent deadlock in Sync 2016-11-01 17:26:53 +00:00
c781f30ed5 functional-tester: remove stabilization limit
This change removes the wait that was needed to ensure the cluster is stable.

FIX #6760
2016-11-01 10:01:59 -07:00
db83736b7b Merge pull request #6774 from mitake/linearizable-password-checking
etcdserver: linearizable password checking at the API layer
2016-11-01 07:51:27 -07:00
fdf433024f etcdserver: linearizable password checking at the API layer
For avoiding a schedule that can cause an inconsistent auth store [1],
password checking must be done in a linearizable manner.

Fixes https://github.com/coreos/etcd/issues/6675 and https://github.com/coreos/etcd/issues/6683

[1] https://github.com/coreos/etcd/issues/6675#issuecomment-255006389
2016-11-01 00:02:33 -07:00
136c02da71 Merge pull request #6738 from gyuho/raft-cleanup
etcdserver: move 'EtcdServer.send' to raft.go
2016-10-31 15:15:08 -07:00
b64de4707d Merge pull request #6724 from johnbazan/ctlv3_add_user_with_password_inline
etcdctl: allow to add a user within one command line
2016-10-31 14:31:09 -07:00
d51a7dba43 etcdctl: Adding e2e tests for userAddTest 2016-10-31 18:14:29 -03:00
73b4a58ac0 etcdctl: allow to add a user within one command line
This makes the "user add usr:pwd" feature available for ctlv3
without asking for the password in a new prompt.
2016-10-31 18:14:19 -03:00
4969a0e9e7 Merge pull request #6758 from heyitsanthony/move-checker
etcd-tester: refactor stresser / checker management
2016-10-31 13:59:54 -07:00
308f2a1695 etcd-tester: refactor stresser/checker organization
The checkers and stressers should be composable without special cases; this
patch tries to address that while refactoring out some old cruft.

Namely,
* Single stresser/checker for a tester; built from composition
* Composite stresser via comma-separated list of stressers
* Split stressers into separate files
* Removed v2 only flags and special cases
* Rate limiter shared among key stresser and leases stresser
* Composite checker is now concurrent
* Stresser can return a Checker to check its invariants
* Each lease checker only operates on a single lease stresser
2016-10-31 13:59:04 -07:00
72fc5f7d1b Merge pull request #6765 from xiang90/s
etcd-runner: move string generation to pkg/stringutil
2016-10-31 12:37:18 -07:00
9f0ee53e86 etcd-runner: move string generation to pkg/stringutil 2016-10-31 12:20:02 -07:00
30d37b2165 Merge pull request #6763 from gyuho/spell
*: fix minor typos
2016-10-31 10:43:17 -07:00
5bd00ab1f6 *: fix minor typos 2016-10-31 09:47:15 -07:00
a1a2d2b1e7 Merge pull request #6762 from gyuho/doc
op-guide: 'strict-reconfig-check' true by default
2016-10-31 09:44:57 -07:00
7e06a95942 Merge pull request #6759 from xiang90/tester
etcd-runner: refactor code structure and flag cleanup
2016-10-31 09:07:18 -07:00
4a42c72b5e op-guide: 'strict-reconfig-check' true by default 2016-10-31 07:59:33 -07:00
e5c3978725 etcd-runner: refactor code structure and flag cleanup 2016-10-30 18:45:16 -07:00
86c4a74139 etcd-tester: move stresser and checker to tester
These really belong in tester code; the stressers and
checkers are higher order operations that are orchestrated
by the tester. They're not really cluster primitives.
2016-10-29 10:57:17 -07:00
4a08678ce1 Merge pull request #6749 from gyuho/raft-prevote
raft: do not attach term to MsgReadIndex
2016-10-28 22:29:08 -07:00
cb5c92f69b raft: do not attach term to MsgReadIndex
Fix https://github.com/coreos/etcd/issues/6744.

MsgReadIndex, as MsgProp, is to be forwarded to leader.
So we should treat it as local message.
2016-10-28 22:12:25 -07:00
0345226759 Merge pull request #6751 from heyitsanthony/fix-require-leader-test
integration: put key on watch target member for TestWatchWithRequireLeader
2016-10-28 23:24:00 -04:00
bc3e056b4a Merge pull request #6755 from fanminshi/log-statement-fix
functional-tester: fix log statement
2016-10-28 14:40:11 -07:00
34c906be55 functional-tester: fix log statement
simple fix for wrongly printed statement.
2016-10-28 14:27:09 -07:00
d8ea9d22b6 integration: put key on watch target member for TestWatchWithRequireLeader
It's possible the put will not propagate to all members before removing quorum,
causing watches on the key to wait forever.

Fixes #6386
2016-10-28 13:12:26 -04:00
a0360a83c9 Merge pull request #6745 from heyitsanthony/private-recipe
contrib/recipes: unexport and clean up keys.go
2016-10-28 12:44:58 -04:00
8f718e2e5a contrib/recipes: unexport and clean up keys.go
Fixes #6731
2016-10-28 11:41:13 -04:00
c6cd63dc35 Merge pull request #6747 from sinsharat/client_metric_add_example
clientv3: added example for client metrics
2016-10-27 16:42:18 -07:00
a1bfb31219 clientv3: added example for client metrics 2016-10-28 04:30:17 +05:30
ea05711522 Merge pull request #6746 from fanminshi/tester-recovery-error-fix
functional-tester: always clean up if tester encounters an error
2016-10-27 15:22:07 -07:00
7f5a7d1da5 functional-tester: always clean up if tester encounters an error
The current tester does not clean up if any failure injection or recovery fails. If the tester fails to recover a dead node, it hangs in the next round because it keeps waiting for the cluster to become healthy, which is impossible while a node is down. To fix this issue, always clean up when any error happens during a round so that the cluster is healthy for the next round.

FIX #6743
2016-10-27 15:07:58 -07:00
89107a49fa Merge pull request #6741 from sinsharat/clientv3_add_client_side_metrics
clientv3: added client side metrics support
2016-10-27 13:22:10 -07:00
1b36162659 Merge pull request #6647 from gyuho/watch
clientv3: send create event over outc
2016-10-27 11:45:22 -07:00
0a3d45a307 clientv3: send create event over outc 2016-10-27 11:11:16 -07:00
8fd1dd7862 clientv3: added client side metrics support 2016-10-27 22:47:45 +05:30
c99a9f4075 clientv3: added entries for go-grpc-prometheus for build 2016-10-27 22:44:19 +05:30
2c974abcb9 clientv3: added go-grpc-prometheus for client metrics 2016-10-27 22:36:05 +05:30
6ec03d3f7c etcdserver: move 'EtcdServer.send' to raft.go
Clear 'TODO'
2016-10-26 16:26:00 -07:00
8825392da2 Merge pull request #6714 from fanminshi/short_term_lease_check
functional-tester: add short lived leases checking
2016-10-26 14:50:55 -07:00
8d9e2623e1 functional-tester: add short lived leases checking
The lease stresser now generates short-lived leases that expire before invariant checking.
This addition verifies that expired leases are indeed deleted on the server side.
2016-10-26 14:46:57 -07:00
d7c21e6837 Merge pull request #6737 from fanminshi/lease_expired_fix
functional-tester: increase lease TTL
2016-10-26 10:34:38 -07:00
1dc60bb97e functional-tester: increase lease TTL
Increasing the lease TTL ensures that leases don't expire during the hash stabilization period.
I observed that it can take a long time for an etcd cluster to become stable.
2016-10-26 10:32:52 -07:00
c58ae95429 Merge pull request #6732 from JoshRosso/etcdctl-specify-ttl-unit
etcdctl: add ttl unit to flag description
2016-10-25 18:08:02 -07:00
e489229153 etcdctl: add ttl unit to flag description
Add the ttl unit (seconds) to --ttl description for etcdctl's mk, mkdir, set,
setdir, update, and updatedir commands.
2016-10-25 17:12:15 -07:00
12e4dfa9c4 Merge pull request #6715 from fanminshi/lease_hash_fix
Lease hash fix
2016-10-25 16:28:11 -07:00
b398233f4f Merge pull request #6730 from gyuho/round-prefix
etcd-runner: fix typo in round prefix
2016-10-25 16:24:27 -07:00
99539ff031 Merge pull request #6727 from sinsharat/etcd-runner_add_client_timeout_flag
etcd-runner: Added connection timeout flag for client
2016-10-25 16:09:10 -07:00
12488d4a70 etcd-runner: fix typo in round prefix 2016-10-25 15:59:44 -07:00
bb97adda0d functional-tester: add retries to hash checking
allows hashes to converge through retrying.
2016-10-25 14:27:36 -07:00
6e7e346c93 etcd-runner: Added client connection timeout flag 2016-10-26 01:58:22 +05:30
d7bc15300b Merge pull request #6624 from bdarnell/pre-vote
raft: Implement the PreVote RPC described in thesis section 9.6
2016-10-25 13:18:22 -07:00
ddc94aaf9e Merge pull request #6725 from gyuho/btree
vendor: backport 'google/btree' changes
2016-10-25 12:34:52 -07:00
4ba63237ce Merge pull request #6705 from gyuho/dump-db
etcd-dump-db: initial commit
2016-10-25 11:43:42 -07:00
cd618323d0 vendor: backport 'google/btree' changes 2016-10-25 10:53:35 -07:00
924ece6ae7 etcd-dump-db: initial commit 2016-10-25 10:18:51 -07:00
4e1d3f0f52 mvcc: expose 'backend.IgnoreKey' 2016-10-25 10:07:08 -07:00
18739e766a Merge pull request #6721 from sinsharat/etcd_runner_remove_unused_code
etcd-runner: remove unused code and change name for randClient
2016-10-25 10:01:20 -07:00
efeceef0ca Merge pull request #6703 from FedericoCeratto/patch-1
Add etcd client for Nim
2016-10-25 09:28:08 -07:00
8d5e969f12 raft: Separate test methods for vote and pre-vote tests 2016-10-25 23:31:44 +09:00
69ae49a1dd etcd-runner: remove unused code and change name for randClient 2016-10-25 19:16:27 +05:30
f2cff42cb8 libraries-and-tools: add Nim client 2016-10-25 12:20:42 +01:00
8e5f34fd97 Merge pull request #6709 from yudai/error_url
clientv3: Fix URL to rpc errors
2016-10-24 18:52:46 -07:00
e7e29ba249 clientv3: Fix URL to rpc errors 2016-10-24 16:38:05 -07:00
c549978b8e Merge pull request #6712 from doodles526/change_boom_to_hey
hack/benchmark: change boom to hey
2016-10-24 14:51:58 -07:00
4eefdaa4bb hack/benchmark: change boom to hey
boom has moved to hey, due to a conflicting binary name with another
project
2016-10-24 13:28:53 -07:00
0bb2384547 Merge pull request #6707 from mkumatag/ppc64le_support
Add ppc64le travis builds
2016-10-24 12:33:21 -07:00
77be124391 Merge pull request #6691 from xiang90/fix_retry
clientv3: do not retry on mutable operations
2016-10-24 10:32:34 -07:00
0642b4e61e travis: Add ppc64le travis builds 2016-10-24 22:46:14 +05:30
a1afb21e33 Merge pull request #6710 from fasaxc/report-cluster-id
client: Return the server's cluster ID as part of the Response
2016-10-24 09:46:57 -07:00
06e2ce116c Merge pull request #6704 from heyitsanthony/proxy-broadcast-race
grpcproxy: fix race on watcher revision
2016-10-24 09:17:29 -07:00
ae99c91903 Merge pull request #6698 from heyitsanthony/session-close
concurrency: terminate session.Close if revoke takes longer than TTL
2016-10-24 09:14:09 -07:00
43df091067 client: Return the server's cluster ID as part of the Response
This allows the client to spot if the cluster ID changes, which
would indicate that the cluster has been rebuilt and watches may be
out of sync.

Helps work around #6652.
2016-10-24 14:51:00 +01:00
22aa710c1f raft: Improve comments and formatting for PreVote change 2016-10-24 22:29:33 +09:00
81f151eed2 clientv3: fix retry logic
1. Balancer should setup gRPC error code correctly for retry.

2. We should not mask context error.
2016-10-22 22:15:43 -07:00
92c987f75d Merge pull request #6695 from sinsharat/watch_runner_respect_rounds
etcd-runner: watcher runner respect rounds
2016-10-21 20:28:29 -07:00
90146d863c etcd-runner: watcher runner respect rounds 2016-10-22 05:00:10 +05:30
cb3a5eaff1 Merge pull request #6702 from heyitsanthony/minmax-proxy
grpcproxy: respect {min,max}{create,mod} revision
2016-10-21 16:29:40 -07:00
a5c93840b4 Merge pull request #6701 from heyitsanthony/compact-resume
integration: account for unsynced server in TestWatchResumeCompacted
2016-10-21 16:29:27 -07:00
f38a5d19a8 concurrency: add WithContext option to sessions
Makes it possible to cancel session requests without having to
close the entire client.
2016-10-21 16:26:59 -07:00
1e330a90c7 concurrency: terminate session.Close if revoke takes longer than TTL
Fixes #6681
2016-10-21 16:21:01 -07:00
bd1985d84b grpcproxy: fix race on watcher revision
Was racing between broadcast setting the watchgroup revision
and joining single watchers.
2016-10-21 16:09:39 -07:00
65eb3038fe grpcproxy: respect {min,max}{create,mod} revision
Mutexes were breaking in proxy integration tests.
2016-10-21 15:02:00 -07:00
8f3abda5b8 integration: account for unsynced server in TestWatchResumeCompacted
The watch's etcd server is shut down to keep the watch in a retry state as
keys are put and compacted on the cluster. When the server restarts,
there is a window where the compact hasn't been applied which may cause
the watch to receive all events instead of only a compaction error.

Fixes #6535
2016-10-21 13:42:10 -07:00
21e65eec08 Merge pull request #6692 from fanminshi/lease_expire_compact_fix
functional-tester: add rate limiter to lease stresser
2016-10-21 12:47:27 -07:00
d582fdcc1b functional-tester: add rate limiter to lease stresser
Creating too many leases can cause compaction to time out. Adding a rate limiter limits the number of leases and attached keys.
2016-10-21 12:34:49 -07:00
1ad038d02e Merge pull request #6662 from gyuho/db-panic
backend: skip *bolt.DB.Size call when nil
2016-10-21 11:38:54 -07:00
ef9d55800f integration: test inflight Hash call on nil db 2016-10-21 11:02:54 -07:00
994e8e4f40 mvcc: test inflight Hash to trigger Size on nil db 2016-10-21 11:02:09 -07:00
7d30326968 backend: skip *bolt.DB.Size call when nil
Fix https://github.com/coreos/etcdlabs/issues/30.
2016-10-21 11:01:23 -07:00
791aeb39a6 Merge pull request #6653 from gyuho/acbuild
acbuild: add symlinks to /usr/local/bin/etcd*
2016-10-21 10:48:08 -07:00
60c0a5503e Merge pull request #6636 from heyitsanthony/watch-resume-close
clientv3: only receive from closing streams in Watcher close
2016-10-21 10:06:03 -07:00
b72a413b71 Merge pull request #6697 from gyuho/fmt
*: fix gofmt issues with go tip
2016-10-20 17:04:39 -07:00
0626ee048e rafthttp: fix gofmt issues with go tip 2016-10-20 16:32:56 -07:00
46716fe9fb mvcc: fix gofmt issues from Go tip 2016-10-20 16:32:47 -07:00
161eb2c457 Merge pull request #6696 from fanminshi/lease_expire_fix
functional-tester: modify lease renew logic
2016-10-20 16:25:04 -07:00
c100e40715 clientv3: only receive from closing streams in Watcher close
Was overcounting the number of expected closing messages; the resuming
list may have nil entries. Also the full client wasn't closing the watcher
client, only canceling its context, so client closes weren't joining with
the watcher shutdown.

Fixes #6605
2016-10-20 15:33:11 -07:00
a66c25121b integration: stress closing while resuming watchers 2016-10-20 15:33:11 -07:00
a25d4ac821 functional-tester: modify lease renew logic
only renew a lease if the lease is present.
2016-10-20 15:27:46 -07:00
a2cfb56581 Merge pull request #6689 from fanminshi/function-tester-ensure-etcd-fullly-restarted
functional-tester: add logic to ensure etcd node is alive after fault recovery returns
2016-10-20 13:30:48 -07:00
0f1eb14374 Merge pull request #6694 from gyuho/travis
travis: test with Go 1.7.3
2016-10-20 12:04:01 -07:00
dd1920883c Merge pull request #6693 from philips/fix-nb
Documentation: admin guide remove NB
2016-10-20 11:58:32 -07:00
9e6912fe82 travis: test with Go 1.7.3
Go 1.7.3 released.
2016-10-20 11:56:10 -07:00
e719d8641e Merge pull request #6688 from gyuho/compact-rev
e2e: compact with latest rev in alarm test
2016-10-20 11:54:04 -07:00
970abbb60a Documentation: admin guide remove NB
I have no idea what NB means but just change it to Note
2016-10-20 11:47:41 -07:00
9bfbc12d7d e2e: compact with latest rev in alarm test
Fix https://github.com/coreos/etcd/issues/6677.
2016-10-20 11:06:30 -07:00
a47797fdf1 Merge pull request #6690 from hongchaodeng/f
etcdctl: fix migrate in outputting client.Node to json
2016-10-20 10:50:58 -07:00
9205a242b9 clientv3: do not retry on mutable operations 2016-10-20 10:48:10 -07:00
94ea82c00d functional-tester: add logic to ensure etcd node is alive after fault recovery returns
Failure recovery needs to wait for the etcd node to become alive before returning.

FIX #6654
2016-10-20 10:31:08 -07:00
b3f0eeabe4 etcdctl: fix migrate in outputting client.Node to json
Using printf will try to parse the string and replace special
characters. In migrate code, we want to just output the raw
json string of client.Node.
For example,
    Printf("%\\") => %!\(MISSING)
    Print("%\\") => %\
Thus, we should use print instead.
2016-10-20 10:03:45 -07:00
6b1b13eabb Merge pull request #6687 from sinsharat/build_add_option_for_binary_stripping
build: add option to enable binaries stripping for windows
2016-10-19 13:25:25 -07:00
4dab78e72c Merge pull request #6680 from sinsharat/etcd_runner_make_run_watcher_fail_safe
etcd-runner: make run watcher fail safe
2016-10-19 13:24:14 -07:00
751a8d5b04 build: add option to enable binaries stripping for windows 2016-10-20 00:52:57 +05:30
50523e22d8 etcd-runner: make run watcher fail safe 2016-10-20 00:23:35 +05:30
e95b571e7c Merge pull request #6684 from gyuho/build-with-strip
release: build binary without symbols for debug
2016-10-19 10:33:50 -07:00
0bd9179835 release: build binary without symbols for debug 2016-10-19 09:45:10 -07:00
28a29d9ecd Merge pull request #6676 from nekto0n/build_args
build: add option to enable binaries stripping
2016-10-19 09:33:22 -07:00
46d4ff823f Merge pull request #6678 from manishrjain/master
raft: Add dgraph to the list of users
2016-10-19 09:31:07 -07:00
401ef96ace Merge pull request #6682 from sinsharat/update_txn_interactive_cmd_output
etcdctlv3: update txn interactive command output
2016-10-19 09:26:40 -07:00
00837b0736 etcdctlv3: update txn interactive command output 2016-10-19 19:55:09 +05:30
cf93a74aa8 raft: Refactor vote handling
Move all vote handling from the per-state step functions to the
top-level Step(). This wasn't necessary before because MsgVote would
cause us to become a follower, but MsgPreVote needs to be handled
without changing the node's current state.
2016-10-19 19:35:21 +08:00
73cae7abd0 raft: Implement the PreVote RPC described in thesis section 9.6
This prevents disruption when a node that has been partitioned
away rejoins the cluster.

Fixes #6522
2016-10-19 19:35:20 +08:00
ca87a13b18 raft: More realistic terms in tests
Some tests were starting nodes with a non-empty log but a term of zero,
which cannot happen in the real world. This was affecting the final term
being tested in TestLeaderElection.
2016-10-19 19:35:20 +08:00
10cead3139 test: Ignore gopath.proto in test script 2016-10-19 19:35:20 +08:00
255670106f raft: Add dgraph to the list of users
Because Dgraph is a notable user of RAFT.
2016-10-19 17:26:51 +11:00
c6ebc13b43 build: build unstripped binaries by default 2016-10-19 11:15:38 +05:00
11c38fb1eb Merge pull request #6661 from manishrjain/startnode
Update README to explain starting a single node cluster and joining it.
2016-10-18 21:03:28 -07:00
e69c2fd382 raft: update README to explain starting a single node cluster and joining it
this PR helps clients of RAFT set up the cluster correctly, when they're
starting with a single node cluster.
2016-10-19 14:09:48 +11:00
c9b7fc46ff Merge pull request #6672 from gyuho/etcdctl-sort-by
*: sort by ASCEND on missing sort order
2016-10-18 17:07:38 -07:00
f550af7ef4 integration: test sort ASCEND by default in range 2016-10-18 16:50:30 -07:00
4de2128344 clientv3/integration: test missing sort order get 2016-10-18 16:29:22 -07:00
3a6d4b7f12 e2e: test sort ASCEND when sort target is missing 2016-10-18 16:29:22 -07:00
1cd6fefd49 etcdserver: set sort ASCEND for empty sort order
when target is not key
2016-10-18 16:29:19 -07:00
20bdb315f5 Merge pull request #6670 from fanminshi/lease_stressor
functional-tester: add lease stresser
2016-10-18 14:38:42 -07:00
ab2b58a80f functional-tester: add lease stresser
Add lease stresser to test lease code under stress and etcd failures

resolve #6380
2016-10-18 14:20:26 -07:00
ed75d93625 Merge pull request #6666 from fanminshi/function-tester-refractor
functional-tester: move checker logic to cluster
2016-10-18 11:44:37 -07:00
7d86d1050e functional-tester: move checker logic to cluster
Move the checker logic from tester to cluster so that stressers and checkers can be initialized at the same time.
This is useful because some checkers depend on stressers.
2016-10-18 11:17:40 -07:00
5c60478953 Merge pull request #6656 from yudai/balancer_fast_fail
clientv3: make balancer respect FailFast
2016-10-17 15:04:04 -07:00
6a33f0ffd5 clientv3: make balancer respect FailFast
simpleBalancer.Get() blocks grpc.Invoke() even when Invoke() is called
with the FailFast option. As a result, requests with the FailFast
option do not actually fail fast; they block when no endpoints are
available.
The Get() method needs to respect the BlockingWait option when it
picks an endpoint address from the list, and fail immediately when
FailFast is enabled and no endpoint is available.
2016-10-17 14:11:51 -07:00
24c284160b Merge pull request #6635 from sinsharat/etcd_runner_add_watcher_runner
etcd-runner: added watch runner
2016-10-17 11:02:06 -07:00
8297322176 etcd-runner: added watch runner 2016-10-17 23:04:33 +05:30
7022d2d00c Merge pull request #6660 from gyuho/delete-all-keys
etcdctl/ctlv3: support del all keys with '--prefix'
2016-10-17 09:57:19 -07:00
75a65e1a70 e2e: add test cases for del all keys 2016-10-17 09:34:21 -07:00
fac20b228d ctlv3: support del all keys by '--prefix' 2016-10-17 09:33:59 -07:00
5457c029d7 Merge pull request #6640 from mitake/bcrypt-async
auth, etcdserver: check password at API layer
2016-10-17 09:24:34 -07:00
39e9b1f75a auth, etcdserver: check password at API layer
The cost of bcrypt password checking is quite high (almost 100ms on a
modern machine), so executing it in the apply loop is problematic.
This commit moves the checking mechanism out to the API layer. The
password check is validated in an OCC-like way, similar to the auth
of serializable get.

This commit also removes a unit test of Authenticate RPC from
auth/store_test.go. It is because the RPC now accepts an auth request
unconditionally and delegates the checking functionality to
authStore.CheckPassword() (so a unit test for CheckPassword() is
added). The combination of the two functionalities can be tested by
e2e (e.g. TestCtlV3AuthWriteKey).

Fixes https://github.com/coreos/etcd/issues/6530
2016-10-17 14:18:21 +09:00
cc96f91156 Merge pull request #6659 from kragniz/python-client
Add link to python-etcd3
2016-10-16 15:20:37 -07:00
1e29715185 Documentation: add link to python-etcd3 2016-10-16 20:55:53 +01:00
e1547a775b Merge pull request #6646 from MartyMacGyver/windows_build_cleanup
build: Windows build cleanup
2016-10-14 18:57:36 -07:00
698a789644 Merge pull request #6655 from kragniz/range_end-docs
etcdserver: document DeleteRangeRequest prefixes
2016-10-14 15:00:24 -07:00
f2b953d4f7 version: bump up to v3.1.0-rc.0+git 2016-10-14 14:53:13 -07:00
8334790777 Merge pull request #6657 from gyuho/build
*: fix build script, bump up version
2016-10-14 14:40:17 -07:00
a81997ac3f version: bump up to v3.1.0-rc.0 2016-10-14 14:21:32 -07:00
06fd31cde9 build: get GitSHA first 2016-10-14 14:21:20 -07:00
4c444df7a6 Revert "version: bump to v3.1.0-rc.0"
This reverts commit cb178a78ea.
2016-10-14 14:20:33 -07:00
cb178a78ea version: bump to v3.1.0-rc.0 2016-10-14 14:06:21 -07:00
ce6276a2e8 etcdserver: document DeleteRangeRequest prefixes
There was missing info about deleting prefixes in the proto docs for
DeleteRangeRequest.

Closes #6641.
2016-10-14 21:39:03 +01:00
45588c1f9f Merge pull request #6650 from gyuho/flag
*: tests, README on environment variables in etcdctl v3
2016-10-14 12:15:27 -07:00
66f9e81c9a etcdctl: update README on environment variables 2016-10-14 11:58:59 -07:00
8081254498 e2e: add tests with environment vars for flags 2016-10-14 11:58:56 -07:00
a00ed609c3 pkg/flags: export 'FlagToEnv' for e2e tests 2016-10-14 11:15:28 -07:00
522be31192 acbuild: add symlinks to /usr/local/bin/etcd*
And uses latest acbuild (v0.4.0, --to-dir flag is deprecated).

For https://github.com/coreos/etcd/issues/6057.
2016-10-14 10:35:26 -07:00
77d6ecbc5f Merge pull request #6649 from fanminshi/discovery_max_wait
discovery: add upper limit for waiting on a retry
2016-10-14 09:46:08 -07:00
84508697ce Merge pull request #6639 from mitake/functional-tester-external
functional-tester: a new option -failure-wrapper for enabling/disabli…
2016-10-14 07:56:26 -07:00
296427fc78 build: Windows build cleanup
Remove spurious warnings and prompts
Make normal output quieter
Add filesys check (warn and exit on FAT* systems)

Fixes #5866
2016-10-14 02:07:53 -07:00
d1660b5ba3 Merge pull request #6619 from mitake/health-key
etcdctl, e2e: add --check-key option to endpoint health
2016-10-13 20:27:37 -07:00
eb9a01258e discovery: add upper limit for waiting on a retry
Adding an upper limit ensures that exponential backoff doesn't wait more than 5 min on a retry.

FIX #6648
2016-10-13 20:14:41 -07:00
d585b43abe etcdctl, e2e: add --check-key option to endpoint health
This commit adds a new option --check-key to the endpoint health command
for health checking with a custom key. It is mainly for avoiding
permission problems.
2016-10-14 11:39:46 +09:00
b2b03d9926 functional-tester: a new option -failure-wrapper for enabling/disabling external fault injector
This commit adds a new option -failure-wrapper to etcd-tester. The
option takes the path of a script that is used for enabling/disabling
external fault injectors. The script is called with the option "enable"
when the injector needs to be enabled (when failure.Inject() is called)
and with "disabled" in the opposite case (when failure.Recover() is
called).
2016-10-14 11:31:28 +09:00
052e314372 build: Windows build cleanup
Remove spurious warnings and prompts; make output more informative

Fixes #5866
2016-10-13 13:05:24 -07:00
57008f1690 Merge pull request #6644 from kragniz/increase-warn-duration
etcdserver: increase warnApplyDuration from 10ms to 100ms
2016-10-13 10:58:58 -07:00
9df97eb441 etcdserver: increase warnApplyDuration from 10ms to 100ms
When running test suites for a client locally I'm getting spammed by log
lines such as:

    etcdserver: apply entries took too long [14.226771ms for 1 entries]

The comments in #6278 mention there were future plans of increasing the
threshold for logging these warnings, but it hadn't been done yet.
2016-10-13 17:55:50 +01:00
354891f75d Merge pull request #6634 from gyuho/manual
integration: add TestV3WatchWithPrevKV
2016-10-12 16:42:42 -07:00
c3948284a0 integration: add TestV3WatchWithPrevKV 2016-10-12 16:21:52 -07:00
614adb0230 Merge pull request #6628 from gyuho/fix-waitgroup
etcdserver: make WaitGroup.Add sync with Wait
2016-10-12 14:10:54 -07:00
546873f27e Merge pull request #6632 from heyitsanthony/grpc-naming
clientv3/naming: support resolving to multiple hosts
2016-10-12 13:18:36 -07:00
0c61d8804a etcdserver: make WaitGroup.Add sync with Wait 2016-10-12 13:11:35 -07:00
a97866b629 Merge pull request #6633 from xiang90/fix_rev_inconsistency
mvcc: fix rev inconsistency
2016-10-12 13:04:15 -07:00
3dbd30fcaa Documentation: add grpc naming resolver doc 2016-10-12 11:56:14 -07:00
7d50dc06a2 clientv3/naming: support resolving to multiple hosts
The previous implementation watched a single key, so there was no way
to associate separate hosts with separate keys for a single
grpc target. Instead, accept all keys on a prefix.

Also fixes the first Next() to read current name data from etcd instead
of waiting for the next event on a synced watcher.
2016-10-12 11:27:22 -07:00
93225ebafc mvcc: fix rev inconsistency
Try:

./etcdctl put foo bar
./etcdctl del foo
./etcdctl compact 3

restart etcd

./etcdctl get foo
mvcc: required revision has been compacted

The error is unexpected when ranging over the head revision.

Internally, we incorrectly set the current revision smaller than the
compacted revision when we remove all keys around the compacted revision.

This commit fixes the issue by recovering the current revision to at
least the compacted revision.
2016-10-12 10:42:57 -07:00
cb9c77c4ba Merge pull request #6620 from nekto0n/put_update_optimize
Optimize updating key by storing lease in lessor
2016-10-12 09:47:11 -07:00
064e02f4b3 mvcc: Optimize updating key by storing lease in lessor 2016-10-12 09:37:09 +05:00
66f945c4bf Merge pull request #6629 from gyuho/clientv3-logger
clientv3: drop Config.Logger field
2016-10-11 17:01:13 -07:00
084c407a8d clientv3: drop Config.Logger field
Fix https://github.com/coreos/etcd/issues/6603.

Instead adds 'SetLogger' to set global logger interface
to avoid unnecessary logger updates.
2016-10-11 16:38:32 -07:00
e9f3101c49 Merge pull request #6625 from xiang90/grpc_proxy_doc
doc: add grpc proxy doc
2016-10-11 16:06:05 -07:00
17a6025ac8 doc: add grpc proxy doc 2016-10-11 15:15:45 -07:00
4c1a738caf Merge pull request #6627 from xiang90/apply_log
etcdserver: better panic logging
2016-10-11 14:44:46 -07:00
dbaa44372b etcdserver: better panic logging 2016-10-11 13:34:18 -07:00
c10dad41a3 Merge pull request #6604 from sinsharat/support_debug_build_using_delve_gdb
build: Added support for debugging using delve, gdb, etc
2016-10-11 13:03:35 -07:00
9ac2c8072a build: Added support for debugging using delve, gdb, etc 2016-10-12 01:00:15 +05:30
a7247b3c7e Merge pull request #6618 from heyitsanthony/fix-e2e-err-leak
e2e: close process if spawnWithExpects fails
2016-10-11 11:30:28 -07:00
2448f6a003 e2e: close process if spawnWithExpects fails
Was causing a process leak in TestCtlV3Alarm
2016-10-10 15:52:37 -07:00
d7f69d0f92 Merge pull request #6617 from gyuho/vendor-update
vendor: update glide and grpc-go
2016-10-10 14:48:17 -07:00
4a07bbec59 clientv3: implement new grpc.Balancer interface 2016-10-10 11:18:29 -07:00
e3558a64cf vendor: update grpc-go v1.0.2 tag
Fix https://github.com/coreos/etcd/issues/6529.
2016-10-10 11:18:01 -07:00
69ea359e62 vendor: update glide.yaml with grpc-go v1.0.2 tag 2016-10-10 11:17:47 -07:00
b9f3ef09e1 vendor: clean up dependencies (remove unused ones) 2016-10-10 11:17:27 -07:00
def1a3b77f script/updatedep: update glide, glide-vc version 2016-10-10 11:11:58 -07:00
3a6fe61c03 Merge pull request #6610 from heyitsanthony/bench-lease
benchmark: submit keepalive requests concurrently with report.Run()
2016-10-10 09:53:08 -07:00
fd60205e95 Merge pull request #6616 from bdarnell/genproto-gopath
scripts: Don't erase gopath.proto after genproto.sh
2016-10-10 09:19:49 -07:00
ef4e3ef55a scripts: Don't erase gopath.proto after genproto.sh
Wiping gopath.proto after a successful run does nothing but slow down
the next run unnecessarily as it downloads everything again.
2016-10-10 11:33:43 +08:00
602fd6a67e Merge pull request #6613 from mitake/ep-health
etcdctl: parse auth related options in endpoint health command
2016-10-09 06:58:06 -07:00
644ec0ddef etcdctl, e2e: parse auth related options in endpoint health command
Partially fixes https://github.com/coreos/etcd/issues/6611
2016-10-09 20:34:09 +09:00
c1d115b322 benchmark: submit keepalive requests concurrently with report.Run()
Otherwise report won't consume the results and the benchmark hangs.
2016-10-07 15:57:38 -07:00
ac4d39cfb0 Merge pull request #6583 from sinsharat/windows_etcd3.0.1_etcdctlv2api_issue_fix
etcdctlv2: windows compatibility issue fix for etcd v3.0.1
2016-10-07 13:58:57 -07:00
3f60ee0d27 Merge pull request #6590 from gyuho/etcdserver
etcdserver: separate EtcdServer from raftNode
2016-10-07 13:39:37 -07:00
e011ea25ca etcdserver: separate EtcdServer from raftNode 2016-10-07 13:18:39 -07:00
e1e16d9b28 Merge pull request #6608 from gyuho/news
NEWS: add 'prev-kv' feature for upcoming v3.0.11
2016-10-07 12:43:17 -07:00
ab2a20402e NEWS: add 'prev-kv' feature for upcoming v3.0.11 2016-10-07 11:22:02 -07:00
71f8f3ceb6 Merge pull request #6607 from glevand/for-merge-typo
Documentation: Minor typo fix
2016-10-07 10:31:38 -07:00
f1437a8932 Documentation: Minor typo fix
Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-07 10:17:43 -07:00
75f812eaa3 etcdctlv2: windows compatibility issue fix for etcd v3.0.1 2016-10-07 22:15:30 +05:30
4e4140040a Merge pull request #6602 from nekto0n/watchable_store_bench
mvcc: add BenchmarkWatchableStoreTxnPut benchmark
2016-10-07 09:13:44 -07:00
e2bd6f2213 Merge pull request #6601 from nekto0n/interval_tree_fast_stab
adt: fast path Stab in empty interval tree
2016-10-07 09:13:23 -07:00
f3cdfcdcf4 Merge pull request #6486 from glevand/for-merge-arm64
Get tests working on ARM64
2016-10-06 17:53:10 -07:00
686282393d Merge pull request #6600 from heyitsanthony/report
benchmark: split out report and add --precise option
2016-10-06 17:14:08 -07:00
e7d8292cd1 benchmark: add --precise flag
Usually benchmark writes with %4.4f; this adds optional %g formatting.
2016-10-06 16:18:47 -07:00
3d28faa3eb pkg/report, tools/benchmark: refactor report out of tools/benchmark
Only tracks time series when requested. Can configure output precision.
2016-10-06 16:18:47 -07:00
ea9e857eb9 Merge pull request #6599 from fanminshi/lease_error_type_fix
Lease: Add lease errors to togRPCError()
2016-10-06 15:47:51 -07:00
cbbd1f0f44 Merge pull request #6598 from xiang90/cleanup
v3rpc: return nil as error explicitly
2016-10-06 15:30:04 -07:00
a862fd9f0f Lease: Add lease errors to togRPCError()
This allows lease functions to convert lease errors to the appropriate gRPC errors.
2016-10-06 14:29:31 -07:00
10cafe56b8 v3rpc: return nil as error explicitly 2016-10-06 14:14:43 -07:00
4a5fa261c6 Merge pull request #6596 from gyuho/protect-TTL
lease: add TTL() method
2016-10-06 11:41:51 -07:00
65ac718a11 etcdserver: use 'TTL()' on lease.Lease 2016-10-06 11:24:12 -07:00
5adca4a720 lease, leasehttp: add TTL() method
Fix https://github.com/coreos/etcd/issues/6595.
2016-10-06 11:24:09 -07:00
9970ded79f mvcc: add BenchmarkWatchableStoreTxnPut benchmark 2016-10-06 22:44:25 +05:00
eae70c9379 adt: fast path Stab in empty interval tree 2016-10-06 22:41:33 +05:00
b8079b7fc0 Merge pull request #6594 from heyitsanthony/e2e-etcdctl-timeout
e2e: print correct timeout for etcdctl tests
2016-10-06 10:40:46 -07:00
fa1e28102e Merge pull request #5316 from ajityagaty/too_many_allocs
mvcc: Reduce number of allocs in PUT when watchableStore has no watchers.
2016-10-06 09:47:59 -07:00
e28706d9e2 e2e: print correct timeout for etcdctl tests 2016-10-06 09:18:41 -07:00
cc04d80b09 Merge pull request #6578 from glevand/for-merge-serial
test: Run integration pass in series
2016-10-05 19:28:15 -07:00
54c252ee63 clientv3/kv_test: Fix quota test
Updates TestKVPutError.  Change the quota to work with systems
that have a 64 KiB page size. Increase the db sync wait time to
one second.  Also, add some comments for the hard coded value.

Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-05 16:41:06 -07:00
84d2ff93b0 integration/v3_grpc_test: Fix quota tests
Use the system page size to set the test quota size.  Also, change
a comment related to setting the node quota to be more clear.

Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-05 16:41:06 -07:00
de8adc9e03 e2e/ctl_v3_alarm_test: Fix quota test
Rework the over quota test to be a more realistic test.  Take into
consideration that the system page size will be different across
platforms.

Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-05 16:41:06 -07:00
8c60a532a6 e2e/ctl_v3_alarm_test: Use fixed small buf size
We just need a small chunk of data to test put, so to be
consistent across platforms use a fixed size of 64 bytes.

Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-05 16:41:06 -07:00
beb194967e Documentation: Improve quota example
Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-05 16:41:06 -07:00
bdbb32dfe8 Documentation: Set ETCDCTL_API for v3 features
Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-05 16:41:06 -07:00
b65a2cec18 Documentation: Clarify Space quota section
Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-05 16:41:06 -07:00
f0469f7f25 Merge pull request #6570 from xiang90/lease_expire
Fix lease expire
2016-10-05 15:49:45 -07:00
3cbc5285e0 test: Run integration pass in series
On slower or heavily loaded platforms running the integration pass in
parallel results in test timeout errors.

Rename the integration_pass function to integration_e2e_pass, and add two
new functions integration_pass and e2e_pass.

Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-05 15:35:14 -07:00
0f0c048e29 etcdserver: fix early lessor promotion issue
If we promote the lessor before we finish applying all
entries from the last term, we might incorrectly renew
leases that were already revoked.

Here is an example:

- Term 1: revoke lease A accepted by raft
- Old leader failed, new election happened
- Term 2: promote
- Term 2: keep alive A succeeds. A now has 10 seconds TTL
- Term 2: revoke lease A from Term 1 got committed and applied
- Term 2: the lease A with 10 seconds TTL is revoked

To solve this, the new leader MUST apply all entries from the old term
before promoting its lessor to start accepting renew requests.
2016-10-05 14:41:47 -07:00
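The ordering problem described in the commit above can be reproduced in a toy model. This is a minimal sketch, not the etcdserver code: the `lessor` type, its `renew` and `revoke` methods, and `renewedRevokedLease` are hypothetical names, and the "pending" slice stands in for revoke entries that raft accepted in the old term but that have not been applied yet.

```go
package main

import "fmt"

// lessor is a hypothetical, simplified stand-in for a lease manager.
type lessor struct {
	promoted bool           // only a promoted lessor accepts renew requests
	leases   map[string]int // lease ID -> TTL in seconds
}

// renew refreshes the TTL of an existing lease if the lessor is promoted.
func (l *lessor) renew(id string, ttl int) bool {
	if !l.promoted {
		return false
	}
	if _, ok := l.leases[id]; !ok {
		return false
	}
	l.leases[id] = ttl
	return true
}

// revoke removes a lease.
func (l *lessor) revoke(id string) { delete(l.leases, id) }

// renewedRevokedLease replays the scenario from the commit message and
// reports whether a keep-alive for lease A succeeded although a revoke
// for A from the old term was still waiting to be applied.
func renewedRevokedLease(applyBeforePromote bool) bool {
	l := &lessor{leases: map[string]int{"A": 1}}
	pending := []string{"A"} // revokes accepted in term 1, not yet applied

	if applyBeforePromote {
		// The fix: apply all old-term entries before promoting.
		for _, id := range pending {
			l.revoke(id)
		}
		pending = nil
	}

	l.promoted = true
	renewed := l.renew("A", 10)

	// Without the fix, the old-term revoke is applied only now.
	for _, id := range pending {
		l.revoke(id)
	}
	return renewed
}

func main() {
	fmt.Println("early promotion renews revoked lease:", renewedRevokedLease(false))
	fmt.Println("apply-then-promote renews revoked lease:", renewedRevokedLease(true))
}
```

In the sketch, early promotion lets the keep-alive succeed on a lease that is about to be revoked, while applying the old-term entries first makes the keep-alive fail because the lease is already gone.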
279c103517 lease: fix lease expire and add a test 2016-10-05 14:41:47 -07:00
7f0d5946ff Merge pull request #6589 from heyitsanthony/etcdctl-lock-one-session
etcdctl: remove superfluous session in lock command
2016-10-05 14:36:11 -07:00
b980ab0c67 Merge pull request #6582 from heyitsanthony/fix-cancel-close
clientv3: only return closing error to watcher if context is not canceled
2016-10-05 13:37:29 -07:00
f2af08f5aa etcdctl: remove superfluous session in lock command 2016-10-05 13:30:36 -07:00
f67f8d3b31 Merge pull request #6587 from heyitsanthony/watch-fix-revrace
clientv3: fix race on watch initial revision
2016-10-05 10:55:31 -07:00
d5cd563ce7 Merge pull request #6572 from glevand/for-merge-release_pass
Fixes for release_pass test
2016-10-05 09:42:16 -07:00
06d5cf2d52 clientv3: fix race on watch initial revision
The initial revision was being updated in the substream goroutine defer;
this was racing with the resume path fetching the initial revision when
the substream closes during resume. Instead, update the initial revision
whenever the substream processes a new watch response. Since the substream
cannot receive a watch response while it is resuming, the write to the
initial revision is ordered to always happen after the resume read.

Fixes #6586
2016-10-05 09:36:06 -07:00
e285f599e2 clientv3: only return closing error to watcher if context is not canceled
Fixes #6503
2016-10-04 16:09:50 -07:00
8e1c989ec3 integration: test a canceled watch won't return a closing error 2016-10-04 14:47:40 -07:00
98897b7603 Merge pull request #6580 from spoonben/fix-404-docs
docs: link directly to github procfile
2016-10-04 13:54:10 -07:00
25f1088edd test: Fixes for release_pass
Some fixes related to release_pass:

o Create the output directory ./bin if it does not exist.
o Define the GOARCH variable if it is not defined.
o Simplify the race detection test.
o Download the release archive based on GOARCH.
o If the release file is not found, return success.  This will allow the tests
  to continue.

Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-10-04 13:42:53 -07:00
9c5a32eb7a docs: link directly to github procfile
This is in response to https://github.com/coreos/docs/issues/822
Unfortunately because of how the doc sync works there has to be
a direct link here.
2016-10-04 13:42:17 -07:00
5269bbd277 Merge pull request #6513 from gyuho/manual
raft: refactor inflight
2016-10-04 13:31:43 -07:00
dc8bf26cd8 raft: refactor inflight 2016-10-04 13:12:16 -07:00
19122b463e Merge pull request #6525 from heyitsanthony/watcher-disconn
clientv3: simplify watcher synchronization
2016-10-04 11:21:12 -07:00
02f557068e Merge pull request #6576 from xiang90/fix_doc
doc: build should work for non-github users
2016-10-04 10:32:37 -07:00
904e5090fd doc: build should work for non-github users 2016-10-04 10:26:51 -07:00
5b50658118 clientv3: simplify watch synchronization
Was more complicated than it needed to be and didn't really work in the
first place. Restructured watcher registration to use a queue.
2016-10-03 16:56:14 -07:00
9ce398f8a6 integration: test canceling watchers when disconnected 2016-10-03 16:56:14 -07:00
33e4f2ea28 Merge pull request #6563 from gyuho/gen-proto
scripts/genproto: use 'gopath.proto' for $GOPATH
2016-10-03 15:54:38 -07:00
9b56e51ca7 *: regenerate proto + gofmt change 2016-10-03 15:34:34 -07:00
8174fcf201 scripts/genproto: use 'gopath.proto' for $GOPATH 2016-10-03 15:34:31 -07:00
dfe85b26cc Merge pull request #6571 from xiang90/log_pkg
*: set repo correctly for logging
2016-10-03 15:49:44 -05:00
b7f02a8c0a Merge pull request #6568 from gyuho/e2e
e2e: test 'https' scheme endpoints
2016-10-03 13:24:00 -07:00
29dd3cf5bd Revert "clientv3/integration: add TestDialWithHTTPS"
This reverts commit a96a28d603.
2016-10-03 13:05:08 -07:00
0dc14d1771 e2e: test 'https' scheme endpoints 2016-10-03 13:04:58 -07:00
c26ebe3262 Merge pull request #6453 from vimalk78/wal-optimize-marshal-outside-lock
wal/wal.go: optimized WAL.SaveSnapshot to do Marshal outside the mutex lock
2016-10-03 11:50:11 -07:00
dd607b5eff Merge pull request #6560 from gyuho/scheme
clientv3: handle 'https' scheme in endpoint
2016-10-03 09:44:46 -07:00
a96a28d603 clientv3/integration: add TestDialWithHTTPS 2016-10-03 02:16:07 -07:00
962433c17f *: set repo correctly for logging 2016-10-03 17:03:22 +08:00
f45542394b clientv3: handle 'https' scheme in endpoint 2016-10-03 01:03:28 -07:00
02912fe8c4 Merge pull request #6564 from gyuho/nil-ref
e2e: skip when 'etcdProcess' is nil
2016-10-01 01:32:56 -07:00
5c51c600aa e2e: skip when 'etcdProcess' is nil 2016-10-01 00:45:28 -07:00
37bd0932f7 Merge pull request #6557 from heyitsanthony/fix-publish-retry
etcdserver: use stream recorder for TestPublishRetry
2016-09-30 18:40:18 -07:00
613525f711 Merge pull request #6559 from heyitsanthony/fix-lease-hash
lessor: delete keys in deterministic order on revoke
2016-09-30 17:23:00 -07:00
4f9be94643 lessor: delete keys in deterministic order on revoke
Fixes #6558
2016-09-30 16:45:52 -07:00
289e3c0c63 etcdserver: use stream recorder for TestPublishRetry
Fixes #6546
2016-09-30 15:43:32 -07:00
7225c77a3b Merge pull request #6556 from gyuho/simplify
lease: remove redundant lookup methods
2016-09-30 11:19:14 -07:00
4871a4a5f3 lease: remove redundant get method 2016-09-30 10:27:27 -07:00
c349e089b1 Merge pull request #6550 from heyitsanthony/watch-prog-notify
clientv3: make IsProgressNotify() false on compact event and closed channel
2016-09-29 11:12:25 -07:00
6ac284a577 grpcproxy: use valid progress notification in broadcast test 2016-09-29 10:45:25 -07:00
dac6e700f8 Merge pull request #6519 from mitake/functional-tester
functional-tester: decoupling functionalities of etcd-tester
2016-09-29 10:07:47 -07:00
868617ef86 Merge pull request #6548 from gyuho/get-config-embed
embed: add 'Config' method
2016-09-29 09:14:32 -05:00
b8017004ba embed: add 'Config' method 2016-09-29 07:10:59 -07:00
a781f4ebda Merge pull request #6551 from xiang90/fix_log_repo
pkg: use etcd as logging repo
2016-09-29 02:46:57 -05:00
9473e9c30e pkg: use etcd as logging repo 2016-09-29 15:29:38 +08:00
d80c13555a Merge pull request #6543 from xiang90/improve_txn
etcdserver: use linearizableReadNotify for txn
2016-09-28 19:54:30 -05:00
bf2581390d clientv3: make IsProgressNotify() false on compact event and closed channel
Fixes #6549
2016-09-28 16:49:39 -07:00
0ca0260c89 Merge pull request #6531 from sinsharat/glossary_update
Documentation/learning: Glossary update
2016-09-28 11:39:02 -07:00
2353cbca71 Merge pull request #6544 from gyuho/page-offset
wal, ioutil: set page offset for encoder
2016-09-28 11:37:24 -07:00
f5588526cc wal: set PageWriter offset in file encoder 2016-09-28 11:03:24 -07:00
d0c29cc610 pkg/ioutil: configure pageOffset in NewPageWriter 2016-09-28 09:45:54 -07:00
0b8b40ccca Merge pull request #6545 from gyuho/grammar
wal: fix minor wording in comment
2016-09-28 09:34:43 -07:00
231530e0c5 wal: fix minor wording in comment 2016-09-28 09:12:31 -07:00
ea0c65797a etcdserver: use linearizableReadNotify for txn 2016-09-28 20:47:49 +08:00
6c2414ebd1 Documentation/learning: Glossary update 2016-09-28 11:18:47 +05:30
f4ec303d1b wal/wal.go: modified WAL.SaveSnapshot to do the Marshal before aquiring the mutex 2016-09-28 10:35:19 +05:30
1e1dd24d05 Merge pull request #6536 from sinsharat/etcdctlv3_readme_update
etcdctlv3: minor updates to put and make-mirror command
2016-09-27 23:58:42 -05:00
dcfbcb7a68 etcdctlv3: minor updates to put and make-mirror command 2016-09-28 10:20:08 +05:30
3807faeddf Merge pull request #6541 from hhkbp2/improve-test-coverage
raft: add test cases to improve test coverage
2016-09-27 23:24:52 -05:00
eee23eaf43 Merge pull request #6540 from fanminshi/lease_panic_fix
etcdserver: fix a node panic bug caused LeaseTimeToLive call on a nonexistent lease
2016-09-27 23:17:16 -05:00
7d48855630 functional-tester: decouple failures from tester
This commit adds a new option --failures to etcd-tester. The option
receives a comma-delimited argument like this:
"default,failpoints". The given arguments are interpreted as names of
failures and they are injected to an etcd cluster. Available failures
are default (default scenario in etcd-tester) and failpoints. If no
args are passed to the option (--failures=""), no failures are
injected during testing.
2016-09-28 11:30:53 +09:00
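A comma-delimited option such as `--failures` could be interpreted roughly as follows. This is a sketch under assumptions: `parseFailures` is a hypothetical helper, not the etcd-tester function, and trimming whitespace around names is an added convenience that the commit message does not mention.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFailures is a hypothetical helper that splits a comma-delimited
// flag value such as "default,failpoints" into failure names.
// An empty flag value yields no names, meaning no failures are injected.
func parseFailures(flagValue string) []string {
	var names []string
	for _, name := range strings.Split(flagValue, ",") {
		name = strings.TrimSpace(name)
		if name != "" {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	fmt.Println(parseFailures("default,failpoints"))
	fmt.Println(len(parseFailures("")))
}
```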
a6eb2939b1 raft: add test cases to improve test coverage 2016-09-28 10:19:30 +08:00
8ef6687018 etcdserver: fix a node panic bug caused LeaseTimeToLive call on a nonexistent lease
When a non-leader etcd server receives a LeaseTimeToLive request for a nonexistent lease, it responds with a nil resp and a nil error. The invoking function parses the nil resp, which results in a segmentation fault.
I fix the bug by making sure the lease not found error is returned so that the invoking function parses the error message instead.

fix #6537
2016-09-27 17:46:30 -07:00
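The pattern behind this fix, returning an explicit error rather than a nil response with a nil error, can be sketched as below. The names `timeToLive`, `ttlResponse`, and `errLeaseNotFound` are hypothetical and the lease table is a plain map; the real etcdserver types differ.

```go
package main

import (
	"errors"
	"fmt"
)

// errLeaseNotFound is a hypothetical sentinel error for this sketch.
var errLeaseNotFound = errors.New("lease not found")

// ttlResponse is a hypothetical, simplified response type.
type ttlResponse struct {
	ID  int64
	TTL int64
}

// timeToLive looks up a lease. For an unknown lease it returns a non-nil
// error, so callers never receive (nil, nil) and dereference a nil response.
func timeToLive(leases map[int64]int64, id int64) (*ttlResponse, error) {
	ttl, ok := leases[id]
	if !ok {
		return nil, errLeaseNotFound
	}
	return &ttlResponse{ID: id, TTL: ttl}, nil
}

func main() {
	leases := map[int64]int64{1: 10}
	if _, err := timeToLive(leases, 2); err != nil {
		fmt.Println("error:", err)
	}
}
```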
e68cd086ee Merge pull request #6532 from heyitsanthony/no-gopath-build
build: support building out of path when GOPATH is not set
2016-09-27 13:25:04 -07:00
1e3a71d098 build: support building out of path when GOPATH is not set
Otherwise gets "go: GOPATH entry is relative; must be absolute path: ""."
2016-09-27 10:20:52 -07:00
150576fa72 Merge pull request #6212 from xiang90/readindex
etcdserver: initial read index implementation
2016-09-27 11:51:08 -05:00
e5f4fb1a79 Merge pull request #6527 from sinsharat/intracting_v3
etcdctlv3: interactive_v3 compaction and timetolive command update
2016-09-27 06:14:18 -05:00
ab20187f93 etcdctlv3: interactive_v3 compaction and timetolive command update 2016-09-27 16:19:55 +05:30
6167a2aaa7 Merge pull request #6524 from sinsharat/intracting_v3
etcdctlv3: interactive_v3 watch command update
2016-09-27 01:35:04 -05:00
e3e3993022 etcdserver: support read index
Use read index to achieve l-read.
2016-09-27 13:41:40 +08:00
7d9355ffba Merge pull request #6523 from vimalk78/correct-compactor-logger-package
compactor/compactor.go : corrected the capnslog package name
2016-09-26 14:21:24 -07:00
cd1306f866 etcdctlv3: interactive_v3 watch command update 2016-09-27 00:00:05 +05:30
e1550bae61 compactor/compactor.go: corrected the capnslog package name 2016-09-26 23:52:48 +05:30
e1efdd591e Merge pull request #6521 from sinsharat/intracting_v3
etcdctlv3: interactive_v3 del command update
2016-09-26 10:48:09 -05:00
83f2fa7adc etcdctlv3: interactive_v3 del command update 2016-09-26 19:56:20 +05:30
e2d51961dd Merge pull request #6520 from sinsharat/intracting_v3
etcdctlv3: interactive_v3 get command update
2016-09-26 07:20:44 -05:00
213e8a5b15 Merge pull request #6514 from gyuho/sort
test: grep versions with --sort
2016-09-26 06:31:05 -05:00
595743651b etcdctlv3: interactive_v3 get command update 2016-09-26 16:28:29 +05:30
06546cf100 Merge pull request #6517 from sinsharat/intracting_v3
etcdctlv3: interactive_v3 version and put command update
2016-09-26 03:53:13 -05:00
7a95831018 etcdctlv3: interactive_v3 version and put command update 2016-09-26 12:32:08 +05:30
cf83de6488 Merge pull request #6510 from sinsharat/etcdctlv3_readme_final_draft
etcdctlv3:corrected and organised etcdctl commands
2016-09-25 19:19:06 -05:00
f5b9238a3c Merge pull request #6516 from gyuho/vvv
vendor: remove unused code
2016-09-23 18:54:05 -07:00
f957c401d3 vendor: remove unused code 2016-09-23 16:57:28 -07:00
20211ed6bf test: grep versions with --sort 2016-09-23 15:49:20 -07:00
cf09562e40 Merge pull request #6512 from gyuho/dep
vendor: update 'google/btree'
2016-09-23 13:21:52 -07:00
ecb577d40c vendor: update 'google/btree' 2016-09-23 12:54:25 -07:00
15d268709e version: bump to v3.1.0-alpha.1+git 2016-09-23 11:32:39 -07:00
2469a95685 version: bump to v3.1.0-alpha.1 2016-09-23 11:19:22 -07:00
4ef44d1130 Merge pull request #6506 from mitake/decouple-stresser
functional-tester: decouple stresser from tester
2016-09-23 10:05:03 -07:00
044e5cf3a9 Merge pull request #6498 from ychen11/ychen11/etcdserverpb
Added more lines of comments into rpc.proto
2016-09-23 10:04:24 -07:00
0e493c11c2 functional-tester: decouple stresser from tester
This commit decouples stresser from the tester of
functional-tester. For doing it, this commit adds a new option
--stresser to etcd-tester. The option accepts two types of stresser:
"default" and "nop". If the option is "default", etcd-tester stresses
its etcd cluster with the existing stresser. If the option is "nop",
etcd-tester does nothing for stressing.

Partially fixes https://github.com/coreos/etcd/issues/6446
2016-09-24 01:04:57 +09:00
69f5b4ba79 Documentation:made watch request doc more clear 2016-09-23 23:13:55 +08:00
af8728f328 etcdctlv3:corrected and organised etcdctl commands 2016-09-23 18:21:54 +05:30
51aa220449 Merge pull request #6507 from sinsharat/readme_del_cmd_options_example_update
etcdctlv3 : added options and examples for del from-key
2016-09-22 17:08:50 -05:00
308038e96a etcdctlv3 : added options and examples for del from-key 2016-09-22 22:54:20 +05:30
b1e4defc48 Merge pull request #6501 from sinsharat/feature_add_del_from-key
etcdctlv3: del command from-key feature added
2016-09-22 09:15:04 -05:00
804e215981 Merge pull request #6505 from sinsharat/compaction_options_update
etcdctlv3: updated compaction options
2016-09-22 06:52:43 -07:00
5fa233a564 etcdctlv3: updated compaction options 2016-09-22 19:06:05 +05:30
35ff70656b etcdctlv3: del command from-key feature added 2016-09-22 16:55:36 +05:30
ea97aa3f0f Merge pull request #6504 from sinsharat/member_command_options_update
etcdctlv3: updated member command options
2016-09-22 03:47:49 -07:00
1601ee761a etcdctlv3: updated member command options 2016-09-22 15:04:54 +05:30
4de39d3683 Merge pull request #6502 from xiang90/etcdctl_mirror
etcdctl: remove the use of remprefix
2016-09-22 01:05:55 -05:00
30b26f8f50 etcdctl: remove the use of remprefix 2016-09-22 08:43:31 +08:00
3453ce55e3 Merge pull request #6496 from sinsharat/refactor_mirror_command_tests
e2e: refactored ctlv3_make_mirror_test
2016-09-21 19:33:42 -05:00
4ec0fce109 Merge pull request #6493 from gyuho/tester-build
functional-tester: build from repo root, vendor
2016-09-21 16:57:34 -07:00
27c500d8d0 Merge pull request #6487 from heyitsanthony/watch-stress
clientv3: process closed watcherStreams in watcherGrpcStream run loop
2016-09-21 13:55:25 -07:00
3f7f6fb557 Merge pull request #6500 from sinsharat/readme_del_option_update
etcdctlv3: updated del command options
2016-09-21 13:54:18 -07:00
a32518006c clientv3: process closed watcherStreams in watcherGrpcStream run loop
Was racing with Watch() when closing the grpc stream on no watchers.

Fixes #6476
2016-09-21 13:28:00 -07:00
bcda9af15d etcdctlv3: updated del command options 2016-09-22 00:16:53 +05:30
d743b8b866 Merge pull request #6474 from gyuho/auto-sync
clientv3: add 'Sync' method
2016-09-21 10:57:10 -07:00
deef16b376 integration: test client watchers with overlapped context cancels 2016-09-21 09:40:24 -07:00
592538986d e2e: refactored ctlv3_make_mirror_test 2016-09-21 22:07:03 +05:30
cdb1e34799 clientv3: add 'Sync' method 2016-09-21 09:10:25 -07:00
c016325647 Merge pull request #6495 from vimalk78/wal-improve-coverage-add-testcase-save-with-cut
wal/wal.go : improved coverage by testing WAL.Save which causes a WAL…
2016-09-21 11:04:21 -05:00
4426e282d6 Merge pull request #6497 from gyuho/raft-example
raftexample: remove snapshot TODO in README
2016-09-21 08:44:04 -07:00
3492753edf e2e: refactored ctlv3_make_mirror_test 2016-09-21 20:01:24 +05:30
113b27229b raftexample: remove snapshot TODO in README 2016-09-21 05:07:04 -07:00
13e7172b4b Merge pull request #6244 from gyuho/raft-example
raftexample: implement Raft snapshot
2016-09-21 04:55:29 -07:00
e4fbf7db00 raftexample: implement Raft snapshot 2016-09-21 04:23:05 -07:00
4b83f40618 raftexample: add index fields to filter entries 2016-09-21 04:23:05 -07:00
666d555450 raftexample: add snapshotter, handle Ready in raft 2016-09-21 04:23:05 -07:00
15fa8dd866 raftexample: add snapshot methods to kvstore 2016-09-21 04:23:01 -07:00
064411b51c wal/wal.go : improved coverage by testing WAL.Save which causes a WAL.cut to happen 2016-09-21 16:50:55 +05:30
d3906e75bf Merge pull request #6494 from sinsharat/update_snapshot_restore_options
etcdctlv3: updated snapshot restore options
2016-09-21 05:50:34 -05:00
05175480b3 etcdctlv3: updated snapshot restore options 2016-09-21 16:17:32 +05:30
0604fccfea Merge pull request #6492 from sinsharat/make-mirror_no_dest_test
etcdctlv3: test case: make-mirror no dest prefix
2016-09-21 03:12:01 -07:00
cff06ef64d Merge pull request #6491 from gyuho/functional
functional-tester: use different ports in Procfile
2016-09-21 02:54:54 -07:00
409fc439d1 etcdctlv3: test case: make-mirror no dest prefix 2016-09-21 15:12:36 +05:30
b2c4992a82 functional-tester: use different ports in Procfile 2016-09-21 02:39:45 -07:00
e8adc24c32 functional-tester: build from repo root, vendor 2016-09-21 02:06:13 -07:00
d6a3ce17d5 Merge pull request #6472 from sinsharat/make-mirror_modify_dest_test
etcdctlv3: test case: make-mirror modify dest prefix
2016-09-21 00:43:56 -07:00
e5ff5d92e6 etcdctlv3: test case: make-mirror modify dest prefix 2016-09-21 05:40:52 +05:30
b91d8625c8 Merge pull request #6485 from sinsharat/readme_get_features_update
ctlv3: updated readme for options and examples for get command
2016-09-21 07:26:46 +08:00
9743ee8b83 etcdctlv3: updated readme for options and examples for get command 2016-09-21 04:51:13 +05:30
095cff4415 Merge pull request #6478 from heyitsanthony/untangle-check
etcd-tester: split out consistency checking code from tester
2016-09-20 10:56:17 -07:00
d4eff5381c etcd-tester: split out consistency checking code from tester 2016-09-20 10:26:58 -07:00
3da8c6512b Merge pull request #6481 from sinsharat/update_timetolive_options
etcdctlv3: updated options for TIMETOLIVE
2016-09-20 23:29:15 +08:00
3e67702d4b etcdctlv3: updated options for TIMETOLIVE 2016-09-20 16:40:58 +05:30
b586060812 Merge pull request #6475 from fanminshi/leaseparallel
etcdserver: parallelize expired leases process
2016-09-19 16:46:31 -07:00
690a0b6f00 etcdserver: parallelize expired leases process
When 1000 leases expire at the same time, etcd takes more than 5 seconds to clean them up. This means that even after the leases have expired, keys associated with leases are still accessible. I increase the deletion throughput by parallelizing the lease deletion process.
2016-09-19 16:17:49 -07:00
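Parallel revocation of expired leases might look like the following sketch. It assumes one goroutine per lease and a caller-supplied `revoke` callback; `revokeAll` is a hypothetical name, and the actual etcdserver change may bound concurrency differently.

```go
package main

import (
	"fmt"
	"sync"
)

// revokeAll is a hypothetical helper that revokes every expired lease in
// its own goroutine and returns once all revocations have finished.
func revokeAll(expired []int64, revoke func(id int64)) {
	var wg sync.WaitGroup
	for _, id := range expired {
		wg.Add(1)
		go func(id int64) {
			defer wg.Done()
			revoke(id)
		}(id)
	}
	wg.Wait()
}

func main() {
	var mu sync.Mutex
	revoked := map[int64]bool{}

	revokeAll([]int64{1, 2, 3}, func(id int64) {
		mu.Lock() // the map is shared, so guard it
		revoked[id] = true
		mu.Unlock()
	})
	fmt.Println("revoked leases:", len(revoked))
}
```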
69c7ea0b4a Merge pull request #6473 from heyitsanthony/watchreconn-putretry
integration: l-read before Put in TestWatchReconnRequest
2016-09-19 14:52:26 -07:00
0fb2cab221 integration: l-read before Put in TestWatchReconnRequest
TestWatchReconnRequest occasionally triggers elections because it spins on
drop connections, eating up CPU. In case there's an election, submit an
l-read to wait for the cluster to settle down.

Fixes #6314
2016-09-19 14:14:32 -07:00
c9e06fa1ed Merge pull request #6330 from gyuho/balancer-sync
clientv3: add SetEndpoints method
2016-09-20 04:52:13 +09:00
d26cfdb7d1 Merge pull request #6425 from heyitsanthony/etcdserver-wg
etcdserver: tighten up goroutine management
2016-09-19 12:51:16 -07:00
f11b35eb71 clientv3/integration: test 'SetEndpoints' 2016-09-20 04:36:14 +09:00
b9d18d4ac9 clientv3: add 'SetEndpoints' method 2016-09-20 04:36:01 +09:00
3866e78c26 etcdserver: tighten up goroutine management
All outstanding goroutines now go into the etcdserver waitgroup. Goroutines are
shut down with a "stopping" channel which is closed when the run() goroutine
shuts down. The done channel will only close once the waitgroup is totally cleared.
2016-09-19 12:10:41 -07:00
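The waitgroup-plus-channels arrangement described above can be sketched in a few lines. This is a simplified illustration with hypothetical names (`server`, `goAttach`, `stop`); here `stop` closes the stopping channel directly, whereas the commit ties that to the run() goroutine shutting down.

```go
package main

import (
	"fmt"
	"sync"
)

// server is a hypothetical, stripped-down stand-in for a server that
// tracks every goroutine it starts.
type server struct {
	wg       sync.WaitGroup
	stopping chan struct{} // closed to ask goroutines to exit
	done     chan struct{} // closed once every goroutine has exited
}

func newServer() *server {
	return &server{stopping: make(chan struct{}), done: make(chan struct{})}
}

// goAttach runs f in a goroutine that is registered with the waitgroup.
func (s *server) goAttach(f func(stopping <-chan struct{})) {
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		f(s.stopping)
	}()
}

// stop signals shutdown, waits for the waitgroup to clear, then closes done.
func (s *server) stop() {
	close(s.stopping)
	s.wg.Wait()
	close(s.done)
}

func main() {
	s := newServer()
	for i := 0; i < 3; i++ {
		s.goAttach(func(stopping <-chan struct{}) { <-stopping })
	}
	s.stop()
	<-s.done
	fmt.Println("all goroutines exited")
}
```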
a70513621c Merge pull request #6470 from xiang90/fix_doc
doc: use 2379 as port of the first member in local cluster
2016-09-19 08:34:11 -05:00
328c42f1b7 doc: use 2379 as port of the first member in local cluster 2016-09-19 21:28:33 +08:00
2dc06787ae Merge pull request #6467 from coreos/revert-6465-tls-copy
Revert "pkg/transport: update tls.Config copy method"
2016-09-19 16:02:41 +09:00
629d9e7dab Revert "pkg/transport: update tls.Config copy method" 2016-09-19 15:07:12 +09:00
db9ed233dc Merge pull request #6465 from gyuho/tls-copy
pkg/transport: update tls.Config copy method
2016-09-19 00:46:08 +09:00
8c9a88c7d4 pkg/transport: update tls.Config copy method
For Go 1.7
2016-09-18 22:50:45 +09:00
33dbf5c6bd Merge pull request #6463 from xiang90/fix_http
embed: fix go 1.7 http issue
2016-09-18 08:44:04 -05:00
7a48ca4cea embed: fix go 1.7 http issue
Go 1.7 introduces an HTTP2 compatibility issue. Now we
need to explicitly enable HTTP2 when TLS is set.
2016-09-18 18:38:55 +08:00
ac2077559d Merge pull request #6461 from gyuho/travis
travis: test with Go 1.7.1
2016-09-17 22:10:09 +09:00
63d6a4e0e1 travis: test with Go 1.7.1 2016-09-17 20:57:28 +09:00
4a7c1da9b3 Merge pull request #6460 from sinsharat/readme_update
etcdctlv3: updated readme for make-mirror: modify/remove prefix in dest cluster
2016-09-17 19:57:15 +09:00
6c408eb779 etcdctlv3:updated readme.md for make-mirror modify/remove prefix in dest cluster 2016-09-17 16:13:01 +05:30
86aeeca644 Merge pull request #6454 from sinsharat/windows_save_snapshot_fix
ctlv3: close snapshot file before rename (Windows)
2016-09-16 18:09:59 -05:00
0d65061a2d Merge pull request #6439 from sinsharat/make_mirror_feature_add
etcdctl/ctlv3: make-mirror: feature add to modify/remove prefix in dest cluster
2016-09-16 18:07:20 -05:00
01a0db0fce Merge pull request #6456 from heyitsanthony/version-bump-git
version: bump to 3.1.0-alpha.0+git
2016-09-16 15:12:30 -07:00
0a8bf60a9d version: bump to 3.1.0-alpha.0+git 2016-09-16 09:56:29 -07:00
fef6557f6c ctlv3: close snapshot file before rename (Windows) 2016-09-16 21:55:04 +05:30
b571f4d627 etcdctl/ctlv3: feature added to modify/remove prefix in the destination cluster 2016-09-16 18:48:41 +05:30
5c2053109b Merge pull request #6449 from gyuho/supported-stream
rafthttp: add v3.x to supported streams
2016-09-16 21:47:20 +09:00
8827619f5b rafthttp: add v3.x to supported streams 2016-09-16 20:49:00 +09:00
143e2f27fc Merge pull request #6447 from xiang90/cap
api: update capability map
2016-09-16 02:35:26 -05:00
d6904ce415 Merge pull request #6441 from petermattis/pmattis/tick-quiesced
raft: add RawNode.TickQuiesced
2016-09-16 01:48:21 -05:00
c6feb695dc api: update capability map 2016-09-16 14:34:55 +08:00
37fa6ac45c raft: add RawNode.TickQuiesced
TickQuiesced allows the caller to support "quiesced" Raft groups which
do not perform periodic heartbeats and elections. This is useful in a
system with thousands of Raft groups where these periodic operations can
be overwhelming in an otherwise idle system.

It might seem possible to avoid advancing the logical clock at all in
such Raft groups, but doing so has an interaction with the CheckQuorum
functionality. If a follower is not quiesced while the leader is, the
follower can call an election that will fail because the leader's lease
has not expired (electionElapsed < electionTimeout). The next time the
leader sends a heartbeat to this follower the follower will see that the
heartbeat is from a previous term and respond with a MsgAppResp. This in
turn will cause the leader to step down and become a follower even
though there isn't a leader in the group. By allowing the leader's
logical clock to advance via TickQuiesced, the leader won't reject the
election and there will be a smooth transfer of leadership to the
follower.
2016-09-15 21:05:18 -04:00
2724c3946e Merge pull request #6444 from heyitsanthony/version-bump-3.1
version: bump to 3.1.0-alpha.0
2016-09-15 15:24:59 -07:00
c658fa62c5 version: bump to 3.1.0-alpha.0 2016-09-15 15:13:51 -07:00
624eb609fa Merge pull request #6443 from gyuho/news
NEWS: add v3.0.8, v3.0.9
2016-09-16 07:09:42 +09:00
1b1e54a281 NEWS: add v3.0.8, v3.0.9 2016-09-16 07:05:31 +09:00
9913e0073c Merge pull request #6438 from gyuho/e2e-backends
e2e: rename 'backends' to 'processes'
2016-09-15 19:00:28 +09:00
7cd7b5d539 e2e: rename 'backends' to 'processes' 2016-09-15 18:30:08 +09:00
a12b317552 Merge pull request #6428 from gyuho/snapshot-test
e2e: test snapshot restore
2016-09-15 04:22:03 -05:00
bb337c87d0 e2e: test snapshot restore 2016-09-15 17:58:00 +09:00
fb760b4c53 Merge pull request #6403 from vimalk78/rafthttp-mertics-record-rw-failures
rafthttp/metrics.go:fixed TODO: record write/recv failures.
2016-09-15 02:46:20 -05:00
d814804fa1 Merge pull request #6437 from sinsharat/readme_update
etcdctl: readme.md display fix
2016-09-15 16:20:42 +09:00
cd3a7fb833 etcdctl: readme.md display fix 2016-09-15 12:23:56 +05:30
64e1a327ee rafthttp/metrics.go:fixed TODO: record write/recv failures. 2016-09-15 11:32:08 +05:30
b3a083d336 Merge pull request #6436 from LiamHaworth/bugfix/6433-support-for-charset-in-content-type-header
etcdserver, api, v2http, client: Added support for semicolons
2016-09-14 23:25:31 -05:00
5cfa9e2384 etcdserver, api, v2http, client: Added support for semicolons
Added support to the v2 API to fix an issue (6433) where, if the Content-Type
header contained a semicolon followed by extra fields, the API would return an
"invalid Content-type" message even if the content type was actually correct.
2016-09-15 13:54:22 +10:00
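One way to accept a Content-Type header that carries parameters after a semicolon is to compare only the media type. The sketch below uses the standard library's `mime.ParseMediaType`; the helper name `contentTypeIs` and the choice of `application/json` as the expected type are assumptions for illustration, not taken from the commit.

```go
package main

import (
	"fmt"
	"mime"
)

// contentTypeIs is a hypothetical helper that reports whether a
// Content-Type header names the wanted media type, ignoring any
// parameters such as "; charset=utf-8".
func contentTypeIs(header, want string) bool {
	mediaType, _, err := mime.ParseMediaType(header)
	if err != nil {
		return false
	}
	return mediaType == want
}

func main() {
	fmt.Println(contentTypeIs("application/json; charset=utf-8", "application/json"))
	fmt.Println(contentTypeIs("text/plain", "application/json"))
}
```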
e77baa3dcb Merge pull request #6424 from heyitsanthony/v3api-createminmax
etcdserver: range queries with min/max create revision
2016-09-14 19:10:52 -07:00
059f419ac5 Merge pull request #6429 from xiang90/fix_balancer
clientv3: balancer panics when call up after close
2016-09-14 19:42:24 -05:00
82af0c4a7d ctlv3: remove superfluous session creation 2016-09-14 17:03:33 -07:00
9b1fe45853 concurrency: use create max revision for locks and elections 2016-09-14 17:03:33 -07:00
004a5f0dbc clientv3: balancer panics when call up after close
Fix the issue by adding a simple guard variable.
2016-09-15 07:43:42 +08:00
aa7a35798d integration: add tests for MinCreateRev and MaxCreateRev 2016-09-14 15:31:45 -07:00
5bd251a6fa clientv3: WithMinCreateRev, WithMaxCreateRev 2016-09-14 15:31:45 -07:00
c0981a90f7 etcdserver, etcdserverpb: range min_create_revision and max_create_revision 2016-09-14 15:31:45 -07:00
c74ac99871 Merge pull request #6423 from heyitsanthony/fix-rwmutex
recipes: fix rwmutex locking
2016-09-14 09:50:26 -07:00
3730802fef Merge pull request #6427 from mitake/prefix-print
etcdctl: improve printing of role get for prefix permission
2016-09-14 02:27:28 -05:00
8eac9fb93d Merge pull request #6401 from hhkbp2/add-read-index-for-raft-rawnode
raft: add read index for RawNode
2016-09-14 02:14:49 -05:00
4211c0b7af etcdctl, clientv3: improve printing of role get for prefix permission
This commit improves printing of role get command for prefix
permission. If a range permission corresponds to a prefix permission,
it is explicitly printed for a user. Below is an example of the new
printing:

$ ETCDCTL_API=3 bin/etcdctl --user root:p role get r1
Role r1
KV Read:
        [/dir/, /dir0) (prefix /dir/)
        [k1, k5)
KV Write:
        [/dir/, /dir0) (prefix /dir/)
        [k1, k5)
2016-09-14 16:10:32 +09:00
eeca614cd3 raft: add read index for RawNode 2016-09-14 14:43:46 +08:00
672472f85e Merge pull request #6414 from mitake/prefix-perm
etcdctl: an option for granting permission with key prefix
2016-09-13 23:29:40 -05:00
4e2b09a7ca etcdctl: an option for granting permission with key prefix
This commit adds a new option --prefix to "role grant-permission"
command. If the option is passed, the command interprets the key as a
prefix of range permission.

Example of usage:
$ ETCDCTL_API=3 bin/etcdctl --user root:p role grant-permission --prefix r1 readwrite /dir/
Role r1 updated
$ ETCDCTL_API=3 bin/etcdctl --user root:p role get r1
Role r1
KV Read:
        [/dir/, /dir0)
        [k1, k5)
KV Write:
        [/dir/, /dir0)
        [k1, k5)
$ ETCDCTL_API=3 bin/etcdctl --user u1:p put /dir/key val
OK
2016-09-14 12:54:14 +09:00
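The range `[/dir/, /dir0)` shown in the example output follows from treating a prefix as a half-open key range whose end is the prefix with its last byte incremented (`/` is 0x2f, `0` is 0x30). A minimal sketch of that computation is below; `prefixEnd` is a hypothetical helper, and returning `"\x00"` when every byte is 0xff is an assumed sentinel for "no upper bound".

```go
package main

import "fmt"

// prefixEnd is a hypothetical helper that returns the exclusive end key
// of the range covering every key that starts with prefix.
func prefixEnd(prefix string) string {
	end := []byte(prefix)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			// Drop any trailing 0xff bytes that could not be incremented.
			return string(end[:i+1])
		}
	}
	// Assumption: a sentinel meaning the range has no upper bound.
	return "\x00"
}

func main() {
	fmt.Printf("[%s, %s)\n", "/dir/", prefixEnd("/dir/"))
}
```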
c350cd7679 Merge pull request #6417 from xiang90/fix_TestPipelineExceedMaximumServing
rafthttp: fix TestPipelineExceedMaximumServing
2016-09-13 17:59:43 -05:00
9b91e96510 integration: fix rwmutex test to check write locking 2016-09-13 14:09:59 -07:00
9f829fdab7 recipes: fix rwmutex so locking works
Fixes #6408
2016-09-13 14:09:59 -07:00
c6bfdb909b Merge pull request #6412 from heyitsanthony/revert-domain-listener
embed: warn on domain name in listener
2016-09-13 10:25:18 -07:00
afef9cc312 Merge pull request #6418 from sinsharat/update_readme
etcdctl\ctlv3: updated readme.md for timetolive example
2016-09-14 02:06:57 +09:00
6f4e3696d2 etcdctl\ctlv3: updated readme.md for timetolive example 2016-09-13 22:31:34 +05:30
c7212b438d embed: warn on domain name in listener 2016-09-13 09:17:40 -07:00
0d35ba9b94 rafthttp: fix TestPipelineExceedMaximumServing
The timeout is too short. It might take more than 10ms to send
request over a blocking chan (buffer is full). Changing the timeout
to 1 second can fix this issue.
2016-09-13 19:06:11 +08:00
e6a7f25065 Merge pull request #6411 from heyitsanthony/v3api-minmaxmod
etcdserver: Range with min/max mod revision
2016-09-13 05:54:58 -05:00
cfe717e926 Merge pull request #6275 from xiang90/raft_l
raft: support safe readonly request
2016-09-13 01:36:04 -05:00
8c492c70ef Merge pull request #6413 from xiang90/fix_wait
clientv3: return error from response when possible
2016-09-12 22:54:42 -05:00
56084a7cc8 clientv3: return error from response when possible 2016-09-13 11:18:21 +08:00
fa2e9c2449 Revert "Merge pull request #6365 from heyitsanthony/fix-dns-bind"
This reverts commit af5ab7b351, reversing
changes made to da6a0f0594.
2016-09-12 19:45:35 -07:00
17e7f83212 integration: test MinModRev/MaxModRev 2016-09-12 19:44:14 -07:00
b0481ba858 clientv3: WithMinModRev and WithMaxModRev 2016-09-12 19:44:14 -07:00
3df8838501 Merge pull request #6404 from glycerine/range_fixes
etcd/auth: fix range handling bugs.
2016-09-12 21:26:59 -05:00
af0264d2e6 etcdserver, etcdserverpb: add MinModRevision and MaxModRevision options to Range 2016-09-12 15:17:57 -07:00
ce01fb3cdf Merge pull request #6410 from fanminshi/master
etcd-tester: fix peer-port parsing bug with localhost url
2016-09-12 14:00:06 -07:00
8a63071463 etcd-tester: fix peer-port parsing bug with localhost url
The following format "http://localhost:1234" causes existing port parser to fail. Add new logic to parse the host name first then extract port.

Fixes #6409
2016-09-12 13:29:52 -07:00
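Parsing the host first and then extracting the port, as the commit describes, can be done with the standard library. This is a sketch with a hypothetical helper name (`peerPort`), not the etcd-tester code.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// peerPort is a hypothetical helper that extracts the port from a peer
// URL such as "http://localhost:1234" by parsing the URL first and then
// splitting its host:port part.
func peerPort(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	_, port, err := net.SplitHostPort(u.Host)
	if err != nil {
		return "", err
	}
	return port, nil
}

func main() {
	port, err := peerPort("http://localhost:1234")
	fmt.Println(port, err)
}
```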
ef1ef0ba16 auth: fix range handling bugs.
Test 15, counting from zero, in TestGetMergedPerms
in etcd/auth/range_perm_cache_test.go, was incorrectly
trying to assert that [a, b) merged with [b, "")
should be [a, b). Added a test specifically for
this. This patch fixes the incorrect larger test
and the bugs in the code that it was hiding.

Fixes #6359
2016-09-12 09:23:19 -05:00
710b14ce56 raft: support safe readonly request
Implement raft readonly request described in raft thesis 6.4
along with the existing clock/lease based approach.
2016-09-12 15:13:52 +08:00
840f4d48c8 Merge pull request #6402 from gyuho/logger
*: separate 'capnslog' log level setting
2016-09-10 21:38:53 -05:00
bfb9d837d9 Merge pull request #6399 from AdoHe/master
update language bindings doc to add coreos/jetcd
2016-09-10 21:55:41 +09:00
caaa8a48aa libraries-and-tools.md: add Java client 2016-09-10 20:47:31 +08:00
03b9d6f24c *: separate 'capnslog' log level setting 2016-09-10 20:26:51 +09:00
9a67d71e6c Merge pull request #6396 from heyitsanthony/rafthttp-msg-leak
rafthttp: log stream stopped message before closing channel
2016-09-09 17:52:03 -05:00
8f47468a40 Merge pull request #6397 from fanminshi/master
functional-tester: correct goreman command in readme
2016-09-09 17:30:54 -05:00
a571655983 functional-tester: correct goreman command in readme
update readme file to have the correct goreman command to start the functional tester locally.
2016-09-09 14:56:23 -07:00
0250f0c984 rafthttp: log stream stopped message before closing channel
Was causing spurious goroutine leak failures in testing.
2016-09-09 12:47:06 -07:00
92f141d670 Merge pull request #6393 from sinsharat/readme_update
etcdctl:readme.md doc made uniform
2016-09-09 12:04:48 -07:00
d5edb62bd0 etcdctl:readme.md doc made uniform 2016-09-10 00:32:36 +05:30
b22b405465 Merge pull request #6390 from gyuho/simple
wal: simplify dir.Close call
2016-09-09 09:50:38 +09:00
20fc9dc463 Merge pull request #6389 from heyitsanthony/func-tester-noroot
functional-tester: run locally
2016-09-08 19:48:33 -05:00
ccb46d2024 wal: simplify dir.Close call 2016-09-09 09:23:55 +09:00
0b675845f6 Merge pull request #6321 from gyuho/lease-information
*: lease timetolive
2016-09-09 08:43:28 +09:00
aa6b1e6a10 functional-tester: add Procfile 2016-09-08 16:35:55 -07:00
b7dc6cc604 e2e: test 'lease timetolive' 2016-09-09 08:22:41 +09:00
04a4cea630 etcdctl/ctlv3: add 'lease timetolive' command 2016-09-09 08:21:58 +09:00
4c08f6767c clientv3: add lease.TimeToLive + tests 2016-09-09 08:18:45 +09:00
55ba3d95fb etcd-tester: support per-agent client/peer/failpoint ports 2016-09-08 16:15:18 -07:00
78cfc8db95 grpcproxy: implement 'LeaseTimeToLive' 2016-09-09 08:14:46 +09:00
63b0cd470d etcdserver: implement 'LeaseTimeToLive' 2016-09-09 08:14:14 +09:00
0712ebc9b5 v2http: handle '/leases/internal' 2016-09-09 08:12:31 +09:00
2e25a772a5 etcd-agent: support rootless operation and configurable gofail ports 2016-09-08 16:12:00 -07:00
617d2d5b98 lease/*: add lease handler for 'LeaseTimeToLive' 2016-09-09 08:11:46 +09:00
3132e36bf3 etcdserverpb: add 'LeaseTimeToLive' RPC 2016-09-09 08:08:14 +09:00
33b3fdc627 Merge pull request #6388 from groxxda/patch-1
etcd.service: order after network.target
2016-09-08 16:31:29 -05:00
758f0d9017 Merge pull request #6387 from sinsharat/fix_ctl_win
ctlv3: fix line parsing for Windows
2016-09-08 16:27:26 -05:00
17377f5642 example .service file: Order after network.target
From the [systemd NetworkTarget description](https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/):
```
[...]since the shutdown ordering of units in systemd is the reverse of the startup ordering, any unit that is order After=network.target can be sure that it is stopped before the network is shut down if the system is powered off. This allows services to cleanly terminate connections before going down, instead of abruptly losing connectivity for ongoing connections, leaving them in an undefined state.[...]
```
2016-09-08 23:11:01 +02:00
8b764aac71 ctlv3: fix line parsing for Windows 2016-09-09 01:58:33 +05:30
bb3ba1ee1c Merge pull request #6381 from heyitsanthony/fix-wal-rename
wal: fsync directory after wal file rename
2016-09-08 12:56:50 -07:00
28d80ad709 Merge pull request #6370 from xiang90/fix_restore
etcdctl: restore should create a snapshot
2016-09-08 14:25:07 -05:00
e9f841627c Merge pull request #6384 from hhkbp2/add-test-case-for-leader-transfer-from-follower
raft: add test case for leader transfer from follower
2016-09-08 13:58:03 -05:00
4563efd766 Merge pull request #6382 from heyitsanthony/unhealthy-err
v3api, rpctypes: add ErrUnhealthy
2016-09-08 09:15:58 -07:00
68f2fdc1ff raft: add test case for leader transfer from follower 2016-09-08 17:22:52 +08:00
bd7107bd4b wal: fsync directory after wal file rename
Fixes #6368
2016-09-08 00:09:16 -07:00
c449da6ff9 fileutil: windows OpenDir
Windows needs to open a directory with write access to fsync but the go
runtime won't open directories that way.
2016-09-08 00:09:16 -07:00
0cc2f82e7e Merge pull request #6383 from gyuho/lease-client
clientv3: use correct context in toErr (lease)
2016-09-08 01:39:40 -05:00
1aec483e42 clientv3: use correct context in toErr (lease) 2016-09-08 10:58:11 +09:00
1defeda792 v3api, rpctypes: add ErrUnhealthy 2016-09-07 16:51:49 -07:00
0b6350227c Merge pull request #6341 from xiang90/handle_overload
grpcproxy: handle overloaded stream
2016-09-07 16:55:41 -05:00
656167d760 etcdctl: Corrected command in Readme.md (#6376)
Corrected command in Readme.md
2016-09-07 21:09:24 +09:00
a6c905ad96 Merge pull request #6367 from heyitsanthony/fix-watch-init-reconn
clientv3: drain buffered WatchResponses before resuming
2016-09-07 03:15:01 -05:00
f411583ed1 Merge pull request #6374 from sinsharat/master
etcdctlv3: Readme.md updated
2016-09-07 02:29:14 -05:00
534cb0b749 etcdctlv3: Readme.md updated
1. Under PUT example the put command was mentioned in capital which will
give the below error:
Error: unknown command "PUT" for "etcdctl"
Hence corrected the same.
2. The lease id is mentioned with 0x to denote hex but since it's an
example, copy pasting the command will give the below error:
Error: bad lease ID (strconv.ParseInt: parsing "0x1234abcd": invalid
syntax), expecting ID in Hex
Hence modified the same to a sample correct value so that a user new to
etcd does not get confused.
3. The command ./etcdctl range foo does not work and gives the below
error:
Error: unknown command "range" for "etcdctl"
Hence corrected the same

#6372
2016-09-07 12:35:20 +05:30
7b7b29ad1e Merge pull request #6373 from vimalk78/master
pkg/pbutil: corrected the package name in logger in pbutil.go
2016-09-07 01:35:21 -05:00
5ea6990a73 corrected the package name in logger 2016-09-07 11:52:01 +05:30
ce49fb6ec4 raft: add tests for IsLocalMsg (#6357)
* raft: add tests for IsLocalMsg

* report index of failed tests
2016-09-07 12:52:37 +09:00
7e182fa24a etcdctl: restore should create a snapshot
Restore should create a snapshot so that the new db file
can be sent to a newly joined member.
2016-09-07 11:21:53 +08:00
b24527f2f0 Merge pull request #6353 from petermattis/pmattis/grow-inflights-buffer
raft: grow the inflights buffer instead of preallocating
2016-09-07 09:51:45 +09:00
ad318ee891 clientv3: drain buffered WatchResponses before resuming
Otherwise, the watcherStream can receive WatchResponses in the
middle of a resume, corrupting the stream.

Fixes #6364
2016-09-06 17:15:39 -07:00
af5ab7b351 Merge pull request #6365 from heyitsanthony/fix-dns-bind
embed: reject domain names before binding
2016-09-06 16:02:46 -07:00
7644a8ad76 integration: test domain name URLs are rejected before binding 2016-09-06 15:33:47 -07:00
2752169d6a embed: reject binding listeners to domain names
Fixes #6336
2016-09-06 15:33:28 -07:00
c1948f2940 raft: grow the inflights buffer instead of preallocating
Grow the inflights buffer as needed instead of preallocating it to its
max size. This avoids preallocating a lot of unnecessary
space (8*MaxInflightMsgs) when using lots of raft groups while still
allowing for a reasonable MaxInflightMsgs configuration.
2016-09-06 18:07:01 -04:00
da6a0f0594 Merge pull request #6362 from kevinburke/fix-typo
Documentation: fix typo
2016-09-06 14:45:03 -05:00
96ed856bca Merge pull request #6345 from topecongiro/patch-1
rafthttp: remove unnecessary sendc from peer
2016-09-06 11:32:16 -07:00
e508ce36ef Documentation: fix typo
"its" in this case is not short for "it is", it should be a possessive.
2016-09-06 11:26:27 -07:00
0b9c65c82f Merge pull request #6360 from jonboulle/master
scripts, doc: remove actool references
2016-09-06 18:37:28 +02:00
fd0539c8cc scripts, doc: remove actool references
Since c597d591b5 the release script uses
acbuild instead of actool, so purge all the references and have the
release script check for acbuild's presence instead.
2016-09-06 17:47:41 +02:00
d36c0a1444 Merge pull request #6356 from mitake/root-role
auth, e2e: the root role should be granted access to every key
2016-09-06 15:31:20 +08:00
bc5d7bbe03 auth, e2e, clientv3: the root role should be granted access to every key
This commit changes the semantics of the root role. The role should be
able to access every key.

Partially fixes https://github.com/coreos/etcd/issues/6355
2016-09-06 16:10:28 +09:00
271df0dd71 Merge pull request #6354 from es-chow/fix-typo-in-interacting_v3-md
interacting_v3.md: fix typo
2016-09-06 10:17:56 +09:00
b17b482268 interacting_v3.md: fix typo 2016-09-06 09:08:37 +08:00
65fb1ad362 Merge pull request #6351 from petermattis/pmattis/raft-global-rand
raft: use a singleton global rand
2016-09-05 22:25:18 +08:00
4a33aa3917 raft: use a singleton global rand
rand.NewSource creates a 4872 byte object. With a small number of raft
groups in a process this isn't a problem. With 10k raft groups we'd use
46MB for these random sources. The only usage is in
raft.resetRandomizedElectionTimeout which isn't performance critical.

Fixes #6347.
2016-09-05 09:03:18 -04:00
1ebeef5cbf Merge pull request #6350 from nekto0n/fix_message_limit
rafthttp: fix misprint in readBytesLimit value
2016-09-05 15:26:19 +09:00
1b40fe7709 Merge pull request #6348 from plasticbox/master
libraries-and-tools.md: remove C++
2016-09-05 15:07:23 +09:00
da26e230a0 rafthttp: fix misprint in readBytesLimit value
and make test pass in restricted test environments
2016-09-05 11:06:08 +05:00
f36267bf74 libraries-and-tools.md: remove C++ 2016-09-05 15:03:07 +09:00
a66b1e7c60 Merge pull request #6349 from gyuho/decode-length-limit
rafthttp: check decode size before buffer alloc
2016-09-05 14:25:23 +09:00
5c8ba23767 rafthttp: check decode size before buffer alloc
Fix https://github.com/coreos/etcd/issues/5386.
2016-09-05 14:06:03 +09:00
ec9e77db96 rafthttp: remove unnecessary sendc from peer 2016-09-04 13:07:31 +09:00
2e0dc8467d Merge pull request #6344 from glycerine/partial_fix_6343
etcdctl/ctlv3: don't crash when we should prompt for pw.
2016-09-03 10:55:38 -07:00
cccbf302f2 etcdctl/ctlv3: don't crash when we should prompt for pw.
when 'etcdctl --user name get blah' is invoked to
 prompt for password, don't panic.

 addresses the segfault part of #6343
2016-09-03 10:32:16 -07:00
56cfe40184 grpcproxy: fix a data race 2016-09-03 07:53:18 -07:00
b56ee178d5 grpcproxy: handle overloaded stream 2016-09-03 07:49:20 -07:00
0d07154926 Merge pull request #6340 from xiang90/fix_double_create
grpcproxy: fix double create event
2016-09-02 16:37:29 -07:00
81bd381048 Merge pull request #6339 from xiang90/close
grpcproxy: stop watchers in watch groups
2016-09-02 16:03:12 -07:00
805d4cbd93 grpcproxy: fix double create event 2016-09-02 16:02:46 -07:00
eded62e60c grpcproxy: stop watchers in watch groups 2016-09-02 16:01:11 -07:00
5b14b834c9 Merge pull request #6338 from xiang90/create
grpcproxy: fix more issues in watch path
2016-09-02 15:14:12 -07:00
8cd47c4348 grpcproxy: fix more issues in watch path 2016-09-02 15:13:21 -07:00
f7293125cf Merge pull request #6337 from xiang90/watch_cancel
grpcproxy: support cancel watcher
2016-09-02 13:38:20 -07:00
51b4d6b7a8 grpcproxy: support cancel watcher
We do not wait for the cancellation from actual etcd server,
but generate it at the proxy side. The rule is to return the
latest rev that the watcher has seen. This should be good
enough for most use cases if not all.
2016-09-02 12:36:47 -07:00
acc270edbf Merge pull request #6333 from plasticbox/master
libraries-and-tools.md: add C++ client package
2016-09-02 09:29:49 -07:00
ed2b3314b8 libraries-and-tools.md: add C++ client package 2016-09-02 14:05:49 +09:00
e93ee6179c Merge pull request #6325 from heyitsanthony/etcdctl-txn-quotes
etcdctl: fix quotes in txn and watch
2016-09-01 19:55:16 -07:00
666e7bd120 e2e: add quoted key/value to txn test 2016-09-01 19:39:23 -07:00
b1740f5fe4 etcdctl: fix quoted string handling in txn and watch
Fixes #6315
2016-09-01 19:39:23 -07:00
c59e0aa83e Merge pull request #6332 from heyitsanthony/fix-watcher-stream-cancel
grpcproxy: shutdown on client context cancel
2016-09-01 16:18:29 -07:00
7b2f769643 clientv3: only resume watcher if error is non-halting 2016-09-01 15:22:35 -07:00
3489fa82fb integration: don't nest proxies in cluster_proxy mode 2016-09-01 15:21:52 -07:00
d3ecebd14e grpcproxy: shut down watcher proxy when client context is done 2016-09-01 15:20:50 -07:00
26999db927 Merge pull request #6331 from xiang90/fix_proxy
grpcproxy: fix stream closing issue
2016-09-01 11:27:37 -07:00
9ef0f5ef8a grpcproxy: fix stream closing issue 2016-09-01 09:35:56 -07:00
9e5bccd458 Merge pull request #6324 from xiang90/fix_proxy_data_race
grpcproxy: fix data race
2016-08-31 18:48:51 -07:00
b982c80c14 grpcproxy: fix data race 2016-08-31 16:52:04 -07:00
48706a9cd6 Merge pull request #6320 from xiang90/fixTestIssue3699
integration: fix live lock in issue3699
2016-08-31 12:43:43 -07:00
5b60be9626 integration: fix live lock in issue3699
Do not restart the killed member immediately.
The member will advance its election timeout after restart,
so it will have a better chance to become the leader again.
2016-08-31 12:25:24 -07:00
d016383740 Merge pull request #6319 from gyuho/news
NEWS: add v3.0.7
2016-08-31 11:22:09 -07:00
44e710f76c NEWS: add v3.0.7 2016-08-31 09:31:05 -07:00
a6d22b96c3 Merge pull request #6317 from gyuho/release-test
e2e: add 'TestReleaseUpgradeWithRestart'
2016-08-30 21:22:20 -07:00
2d552927e0 Merge pull request #6316 from gyuho/grpc-endpoints
e2e: remove stripSchema
2016-08-30 21:03:06 -07:00
a1598d767b e2e: add 'TestReleaseUpgradeWithRestart' 2016-08-30 21:01:10 -07:00
54ab9a1aba Merge pull request #6312 from gyuho/release-upgrade-test-v2
test: test with v3.0 (preparation for v3.1)
2016-08-30 20:57:18 -07:00
3aa2d1b40e test: test with v3.0 (preparation for v3.1) 2016-08-30 20:54:07 -07:00
c8ad147c0a e2e: remove stripSchema 2016-08-30 20:52:33 -07:00
e29c79c54c Merge pull request #6310 from heyitsanthony/wal-page-write
wal: use page buffered writer for writing records
2016-08-30 19:34:12 -07:00
28277b5a65 wal: use page buffered writer for writing records
Forces torn writes to only happen on sector boundaries.

Fixes #6271
2016-08-30 15:49:07 -07:00
2943bf9086 ioutil: add page buffered writer
A buffered writer that only writes full pages or when explicitly flushed.
2016-08-30 15:49:07 -07:00
48941cea95 Merge pull request #6308 from gyuho/manual2
client: do not send previous node data (optional)
2016-08-30 13:33:22 -07:00
ff7458508f Documentation/v2: add 'noValueOnSuccess' example 2016-08-30 11:49:12 -07:00
b9cd329c61 Merge pull request #6309 from xiang90/fix_upgrade
etcdserver: allow zero kv index for cluster upgrade
2016-08-30 11:46:14 -07:00
771ee43169 etcdserver: allow zero kv index for cluster upgrade
If a user upgrades etcd from 2.3.x to 3.0 and shuts down the
cluster immediately without triggering any new backend writes,
then the consistent index in backend would be zero.

The user cannot restart etcdserver due to today's strict index
match checking. We now have to loosen this a bit for this case.
2016-08-30 11:28:18 -07:00
5c06fc9093 integration: change to 'NoValueOnSuccess' 2016-08-30 10:58:44 -07:00
2da7b63809 v2http: change to 'NoValueOnSuccess' 2016-08-30 10:53:02 -07:00
fb39e96862 client: change to 'NoValueOnSuccess' 2016-08-30 10:52:58 -07:00
572bfd99ff v2http: update function returns 2016-08-30 10:29:37 -07:00
82053f04b2 client: do not send previous node data (optional)
- Do not send back node data when specified
- remove node and prevNode when noDataOnSuccess is set
2016-08-30 10:04:09 -07:00
7873c25abd Merge pull request #6307 from gyuho/manual
libraries-and-tools.md: add C++ client package
2016-08-30 10:00:49 -07:00
e7314a2460 libraries-and-tools.md: add C++ client package 2016-08-30 09:51:27 -07:00
9e9bbb829e Merge pull request #6289 from purpleidea/feat/move-readynotify
embed: Move the ReadyNotify() call to a more sane place
2016-08-29 20:06:17 -07:00
547bf1a92d Merge pull request #6284 from glycerine/fix6278
fix unintended deadlock on key prefixes
2016-08-29 19:50:50 -07:00
9aee3f01cd embed: Move the ReadyNotify() call to a better place
When using the embed functionality, you can't call the Server.Stop()
function until StartEtcd returns, which can block until there is a call
to Server.Stop() in error situations. Since we have a catch-22, the
ReadyNotify() can be called manually by the user if they wish to wait
for the server startup, or in parallel with a timeout if they wish to
cancel it after some time.

Chzz pointed out that this is also more consistent with the
etcdserver.Start() behaviour too.

purpleidea pointed out that this is actually more correct too, because
we can now register the stop interrupt handler before we block on
startup.
2016-08-29 22:45:41 -04:00
9497e9678c clientv3/concurrency: allow election on prefixes of keys.
After winning an election or obtaining a lock, we
auto-append a slash after the provided key prefix.
This avoids the previous deadlock due to waiting
on the wrong key.

Fixes #6278
2016-08-29 18:34:14 -07:00
48f4a7d037 Merge pull request #6286 from bdarnell/initial-election-check-quorum
raft: Allow an election immediately after start with checkQuorum
2016-08-29 17:59:32 -07:00
a7a867c1e6 raft: Allow an election immediately after start with checkQuorum
Previously, the checkQuorum flag required an election timeout to
expire before a node could cast its first vote. This change permits
the node to cast a vote at any time when the leader is not known,
including immediately after startup.
2016-08-30 08:28:41 +08:00
f4c30425c0 Merge pull request #6298 from sinsharat/master
store: added missing test case scenario for scan of de-queued entries
2016-08-29 13:55:55 -07:00
452dedf8ab Merge pull request #6297 from gyuho/grpc-proxy
grpcproxy: fix recursive Context method
2016-08-29 13:31:44 -07:00
f6cda8ac0b Merge pull request #6299 from sinsharat/master
store: removed duplicate method call for the same method
2016-08-29 13:27:57 -07:00
396fac416e Merge pull request #6273 from gyuho/get-cmd
ctlv3: add 'print-value-only' flag to get command
2016-08-29 13:25:30 -07:00
db7e38b0ed Merge pull request #6300 from sinsharat/master
wal: document grammar correction
2016-08-29 12:22:38 -07:00
69ed560fae wal: document grammar correction
Corrected grammar mistake for doc.go
2016-08-30 00:50:02 +05:30
754b9025c4 store: removed duplicate method call for the same method
The get func was calling path's Join and Clean methods, which is already
done in the internalGet(nodePath) func. Hence the methods were being
called twice unnecessarily.

#6295
2016-08-30 00:44:53 +05:30
1c59708c51 e2e: test 'print-value-only' flag 2016-08-29 12:09:16 -07:00
524a5a1afb ctlv3: add 'print-value-only' flag to get command 2016-08-29 12:09:07 -07:00
45079ec6c1 Merge pull request #6274 from dghubble/etcd3-rkt-docs
Documentation: Add initial etcd3 with rkt docs
2016-08-29 12:01:27 -07:00
4f150b06e5 store: added missing test case scenario for scan of de-queued entries
Test case added to check error handling for replaced entries.

#6255
2016-08-30 00:30:48 +05:30
fa79d42b98 Documentation: Add initial etcd3 with rkt docs 2016-08-29 11:59:46 -07:00
86bf2bc443 grpcproxy: fix recursive Context method 2016-08-29 11:37:35 -07:00
e53b99588a Merge pull request #6288 from heyitsanthony/fix-retryread
clientv3: retry non-mutable rpcs on Internal codes
2016-08-28 20:41:19 -07:00
5e963608b7 clientv3: do not treat Internal codes as halting
Fixes #6277
2016-08-28 20:20:22 -07:00
3552420dfd clientv3: set failfast=false on read-only txns 2016-08-28 19:40:38 -07:00
64ac631863 rpctypes: set unknown codes to Unknown instead of internal
An unrecognized error code isn't "very broken".
2016-08-28 19:37:35 -07:00
f73258a51f Merge pull request #6282 from gyuho/tester-error
etcd-tester: return error for mismatch rev/hash
2016-08-27 22:25:18 -07:00
0bf2ef3c1b etcd-tester: return error for mismatch rev/hash 2016-08-27 22:14:42 -07:00
a0759298c5 Merge pull request #6281 from xiang90/fix
etcd-tester: do not restart stresser on error
2016-08-27 20:49:08 -07:00
017aac88a8 etcd-tester: do not restart stresser on error 2016-08-27 20:47:45 -07:00
0be190df4d Merge pull request #6279 from xiang90/fix_hash
mvcc: force commit and hash should be atomic for getting hash
2016-08-27 20:09:22 -07:00
1437388f77 mvcc: force commit and hash should be atomic for getting hash 2016-08-27 19:22:22 -07:00
c388b2f22f Merge pull request #6264 from heyitsanthony/error-codes
clientv3: use grpc codes to translate raw grpc errors
2016-08-26 11:52:37 -07:00
a50c707050 clientv3/integration: wait for two request timeouts in txn tests
Read-only txns and Get may time out once if the leader is lost.
2016-08-26 10:04:10 -07:00
3a49cbb769 Merge pull request #6269 from aaronlehmann/hold-lock-while-renaming
On non-Windows OS, hold file lock while renaming WAL directory
2016-08-26 09:53:59 -07:00
af4f82228c wal: hold file lock while renaming WAL directory on non-Windows
Windows requires this lock to be released before the directory is
renamed. But on unix-like operating systems, releasing the lock and
trying to reacquire it immediately can be flaky if a process is forked
around the same time. The file descriptors are marked as close-on-exec
by the Go runtime, but there is a window between the fork and exec where
another process will be holding the lock.
2016-08-26 09:27:51 -07:00
df54ad2208 v3rpc, rpctypes: add error types for timeouts 2016-08-26 09:22:09 -07:00
267063efd0 clientv3: use grpc codes to translate raw grpc errors 2016-08-26 09:22:09 -07:00
417b9469aa Merge pull request #6270 from heyitsanthony/etcdserver-timeout
etcdserver: use request timeout defined by ServerConfig for v3 requests
2016-08-25 20:50:21 -07:00
254c0ea814 etcdserver: use request timeout defined by ServerConfig for v3 requests 2016-08-25 18:39:01 -07:00
4f5cacc835 Merge pull request #6267 from heyitsanthony/fix-wal-tear
wal: fix CRC corruption on writes following write tears
2016-08-25 17:10:08 -07:00
f1ead43482 wal: zero out wal tail past its first zero record
Whenever the WAL is opened for writes, it should write zeroes to its tail
starting from the first zero record. Otherwise, if there are entries past
the first zero record due to a torn write, any new writes that overlap the
old entries will lead to a garbage record on the tail and cause a CRC
mismatch.
2016-08-25 14:24:46 -07:00
58a36cb651 fileutil: add ZeroToEnd for zeroing files 2016-08-25 14:24:46 -07:00
0d8d9a374c wal: test for truncation on torn writes 2016-08-25 14:24:46 -07:00
488ae52a51 Merge pull request #6259 from xiang90/fix_test_c
clientv3/integration: fix TestKVPutStoppedServerAndClose
2016-08-24 14:14:17 -07:00
f2b7c501cc clientv3/integration: fix TestKVPutStoppedServerAndClose 2016-08-24 13:57:27 -07:00
bb110b0a2d Merge pull request #6257 from heyitsanthony/doc-fix-buglink
Documentation: update links for unaligned 64-bit atomics issue
2016-08-24 09:37:00 -07:00
159c8ee6e0 Documentation: update links for unaligned 64-bit atomics issue
Fixes #6256
2016-08-24 09:13:53 -07:00
1c989edb47 Merge pull request #6253 from heyitsanthony/srv-arec
discovery: reject IP address records in SRVGetCluster
2016-08-24 06:56:17 -07:00
3dc12e33f1 discovery: reject IP address records in SRVGetCluster
Was incorrectly trimming the trailing '.' from the target; this in turn
caused the etcd server to accept any SRV record with an IP target
instead of only targets with A records.
2016-08-23 18:10:42 -07:00
8e4fcaa6dc Merge pull request #6251 from xiang90/ctl_doc
etcdctl: list output options
2016-08-23 11:32:33 -07:00
86dcfbf205 etcdctl: list output options 2016-08-23 11:32:00 -07:00
83e66d2962 Merge pull request #6248 from xiang90/fix_mvcc
mvcc: only write txn should update index
2016-08-23 10:50:46 -07:00
c12104bd15 Merge pull request #6247 from xiang90/fix_snap
etcdserver: kv.commit needs to be serialized with apply
2016-08-23 09:39:54 -07:00
7f3d4bfae5 etcdserver: kv.commit needs to be serialized with apply
kv.commit updates the consistent index in backend. When
executing in parallel with apply, it might grab the tx lock
after apply updates the consistent index and before apply
starts to execute the operation. If the server dies right
after kv.commit, the consistent index is updated but the operation
is not executed. If we restart etcd server, etcd will skip
the operation. :(

There are a few other places that we need to take care of,
but let us fix this first.
2016-08-23 09:16:09 -07:00
959f860a40 Merge pull request #6249 from gyuho/fix-count
etcd-tester: fix compact rev counting
2016-08-22 23:36:57 -07:00
0c37df7265 etcd-tester: fix compact rev counting 2016-08-22 22:58:44 -07:00
e1789aa531 mvcc: only write txn should update index 2016-08-22 22:05:51 -07:00
028b954052 Merge pull request #6245 from requenym/patch-1
documentation: update libraries-and-tools.md
2016-08-22 19:08:15 -07:00
49ef47a9a4 documentation: update libraries-and-tools.md 2016-08-22 20:21:29 -04:00
13f79affb6 Merge pull request #6243 from xiang90/fix_m
e2e: remove server testing in etcdctl test
2016-08-22 16:14:51 -07:00
aa89bc35fd Merge pull request #6242 from heyitsanthony/rwdial-timeout
pkg/transport: bump wait time in TestReadWriteTimeoutDialer for write deadline
2016-08-22 16:13:50 -07:00
722d66b03d Merge pull request #6241 from gyuho/progress-doc
clientv3: specify watch progress notify interval
2016-08-22 15:59:01 -07:00
be38c50567 clientv3: specify watch progress notify interval
For watch request
2016-08-22 15:44:59 -07:00
1d58c7d3b2 e2e: remove server testing in etcdctl test 2016-08-22 15:34:50 -07:00
3b92384394 pkg/transport: bump wait time in TestReadWriteTimeoutDialer for write deadline
Was able to get 2s wait times with 500 concurrent requests on a fast machine;
a slower machine could possibly see similar delays with a single connection.

Fixes #6220
2016-08-22 15:30:44 -07:00
c39b7205a6 Merge pull request #6228 from mitake/e2e-txn-auth
e2e: a test case for txn and permission
2016-08-22 09:24:18 -07:00
3d5d3b90e9 e2e: a test case for txn and permission
This commit adds a new test case that checks that the permission mechanism
works correctly in txn requests.
2016-08-22 12:06:19 +09:00
0504b277b6 Merge pull request #6235 from coreos/procfile-location
local_cluster: make it clear where Procfile is
2016-08-21 19:55:50 -07:00
4c7bced34e local_cluster: make it clear where Procfile is
It isn't clear where to start with these instructions, fix this.
2016-08-21 17:14:59 -04:00
8c88c1611e Merge pull request #6231 from heyitsanthony/fix-rafthttp-test
rafthttp: fix race in TestStreamWriterAttachOutgoingConn
2016-08-19 20:40:42 -07:00
784c4446d9 rafthttp: fix race in TestStreamWriterAttachOutgoingConn
Fixes #6230
2016-08-19 19:59:16 -07:00
262c98f327 Merge pull request #6229 from xiang90/applynotify
etcdserver: add waitApplyIndex
2016-08-19 16:58:21 -07:00
83de13e4a8 etcdserver: support apply wait 2016-08-19 16:18:35 -07:00
940402a27d Merge pull request #6225 from xiang90/cache
grpc-proxy: invalidate cache entries when there is a put/delete
2016-08-19 15:11:59 -07:00
8db4f5b8e1 pkg/wait: change wait time to use logical clock 2016-08-19 15:10:37 -07:00
146bce3377 Merge pull request #6211 from gyuho/proxy-timeout
integration: improve TestTransferLeader
2016-08-19 13:32:18 -07:00
eaa5d9772f integration: improve TestTransferLeader
so that it can check leader transition
2016-08-19 13:11:38 -07:00
c8bbb8c53e grpc-proxy: invalidate cache entries when there is a put/delete 2016-08-19 12:52:19 -07:00
5e6d2a23b7 Merge pull request #6226 from gyuho/vendor
vendor: update grpc/grpc-go for clientconn patch
2016-08-18 20:35:25 -07:00
01471481a9 vendor: update grpc/grpc-go for clientconn patch 2016-08-18 20:17:24 -07:00
f4b6ed2469 Merge pull request #6223 from heyitsanthony/fix-rafthttp-badoutgoing
rafthttp: remove WaitSchedule() from tests
2016-08-18 16:44:56 -07:00
da1e022890 rafthttp: remove WaitSchedule() from tests
Fixes #6187
2016-08-18 16:26:35 -07:00
5e9fe0dc23 Merge pull request #6222 from hongchaodeng/master
integration: NewClusterV3() should launch cluster before creating clients
2016-08-18 14:52:04 -07:00
5630a76766 integration: NewClusterV3 should launch cluster before creating clients 2016-08-18 14:05:21 -07:00
8021487b7a Merge pull request #6219 from sinsharat/master
raft: handled panic for Term due to IOB
2016-08-18 12:33:52 -07:00
a8fc4396e2 Merge pull request #6218 from gyuho/boltdb
vendor: boltdb/bolt v1.3.0 for Go 1.7
2016-08-18 11:06:05 -07:00
9b3b1f80dd raft: handled panic for Term due to IOB
Return an error instead of panicking, for better handling.

#6215
2016-08-18 23:11:38 +05:30
00f5a01378 vendor: boltdb/bolt v1.3.0 for Go 1.7 2016-08-18 10:36:20 -07:00
cc4f4b47bc Merge pull request #6198 from heyitsanthony/reenable-outside-gopath
build: re-enable building outside gopath
2016-08-18 09:44:34 -07:00
a20d4a2d31 Merge pull request #6209 from heyitsanthony/fix-waittime-test
pkg/wait: don't expect time.Now() to be strict increasing in WaitTime tests
2016-08-17 13:46:11 -07:00
14f6dd4ded Merge pull request #6210 from gyuho/race
integration: fix race in TestDoubleBarrierFailover
2016-08-17 12:19:06 -07:00
10c9e238f0 integration: fix race in TestDoubleBarrierFailover 2016-08-17 11:56:49 -07:00
f9d122066e pkg/wait: don't expect time.Now() to be strict increasing in WaitTime tests 2016-08-17 11:53:34 -07:00
57fde954b9 Merge pull request #6208 from xiang90/better_logging
etcdserver: improve logging for leadership transfer
2016-08-17 11:47:38 -07:00
d0fa390048 etcdserver: improve logging for leadership transfer 2016-08-17 11:40:46 -07:00
5aa935f3b7 Merge pull request #6207 from gyuho/wait-extra
integration: write to leader group first, or wait
2016-08-17 11:25:09 -07:00
f2fedbae9b integration: write to leader group first, or wait
Write to leader group first, or give more time to
acknowledge the leader after network partition recovery
2016-08-17 11:09:33 -07:00
a5022c1cba Merge pull request #6205 from heyitsanthony/ft-large-writes
functional-tester: put large keys
2016-08-17 10:49:56 -07:00
e7a7fb2bb1 Merge pull request #6204 from gyuho/news
NEWS: add v3.0.5
2016-08-17 09:57:06 -07:00
6655afda4b NEWS: add v3.0.5 2016-08-17 09:56:45 -07:00
47b6449934 functional-tester: put large keys
For testing writes that must span multiple pages.
2016-08-17 09:51:44 -07:00
30cf8b7f0f Merge pull request #6197 from gyuho/mutex-proxies
integration: fix race in setting shared proxies
2016-08-17 09:15:10 -07:00
83dd121bae build: re-enable building outside gopath
Have build return an error code if build fails and add a test to travis
to confirm running build outside the gopath works.
2016-08-16 20:06:05 -07:00
38c370a7c5 Merge pull request #6196 from gyuho/clockwork
vendor: use v0.1.0 clockwork
2016-08-16 19:52:34 -07:00
fb00a32b86 integration: fix races in global proxies 2016-08-16 19:43:31 -07:00
f91f7dfb91 v2http: fix tests to use new clockwork 2016-08-16 16:36:24 -07:00
3f0f4bfee7 vendor: clockwork v0.1.0 2016-08-16 16:31:10 -07:00
28b797b538 Merge pull request #6194 from heyitsanthony/fix-gofail
build: don't override gopath by default, demote old gopath on override
2016-08-16 14:27:08 -07:00
cf063ed475 Merge pull request #6193 from xiang90/gw
docs: add gateway
2016-08-16 14:16:22 -07:00
b499f69181 docs: add gateway 2016-08-16 14:02:45 -07:00
e1519cf460 build: don't override gopath by default, demote old gopath on override
Builds already vendor through cmd/ so there's no reason to set the GOPATH; it
was also breaking gofail builds. For builds that need to override GOPATH, also
include the old GOPATH as a fallback for dependencies outside cmd/vendor/.
2016-08-16 13:46:07 -07:00
8d7703528a Merge pull request #5845 from heyitsanthony/clientv3-ignore-dead-eps
clientv3: respect up/down notifications from grpc
2016-08-16 11:56:03 -07:00
3eadf964f4 clientv3: use failfast and retry wrappers for at-most-once rpcs 2016-08-16 10:49:50 -07:00
ee3797ddff integration: treat client TLS connecting to insecure server as timeout 2016-08-16 10:17:16 -07:00
46765ad79c clientv3: respect up/down notifications from grpc
Fixes #5842
2016-08-16 09:49:36 -07:00
462eb511c5 Merge pull request #6183 from heyitsanthony/go-install-etcd
build: support go install github.com/coreos/etcd/cmd/etcd
2016-08-15 16:29:36 -07:00
b125d590cf Merge pull request #6186 from gyuho/grpcproxy-fix
proxy/grpcproxy: fix nil-map assign to 'singles'
2016-08-15 16:25:02 -07:00
b9d01fb98b vendor: update grpc 2016-08-15 16:19:40 -07:00
a4ef36c8bf proxy/grpcproxy: fix nil-map assign to 'singles' 2016-08-15 15:48:45 -07:00
d5d2370fc8 Merge pull request #6172 from xiang90/session
session: remove session manager and add ttl
2016-08-15 15:20:19 -07:00
961b03420e Merge pull request #6185 from heyitsanthony/wait-time-collision
wait: make WaitTime robust against deadline collisions
2016-08-15 15:15:29 -07:00
16b2d9ca5e Merge pull request #6170 from heyitsanthony/default-advertise-ip
use default ip for advertise URL
2016-08-15 15:12:25 -07:00
449923c98b build: support go install github.com/coreos/etcd/cmd/etcd
Could build via github.com/coreos/etcd/cmd but that would generate a binary
named "cmd", which is not ideal.
2016-08-15 15:08:41 -07:00
7b84456366 Merge pull request #6163 from gyuho/vendor
vendor: migrate to glide + update go-systemd, probing
2016-08-15 15:01:09 -07:00
c3f069c9fc wait: make WaitTime robust against deadline collisions 2016-08-15 14:38:41 -07:00
0307382c1a Merge pull request #6184 from xiang90/rm
ROADMAP: update
2016-08-15 14:30:23 -07:00
db834301eb ROADMAP: update 2016-08-15 14:30:10 -07:00
feaff17259 session: remove session manager and add ttl 2016-08-15 14:12:25 -07:00
2cc245e8bf etcdmain: report default advertise detection / fallback 2016-08-15 14:08:09 -07:00
29372f9dd2 vendor: update go-systemd, probing 2016-08-15 14:04:07 -07:00
ddf65421e7 scripts: use glide in updatedep.sh 2016-08-15 14:04:03 -07:00
b207dd095c glide: initial commit 2016-08-15 12:10:32 -07:00
d5900e8b63 vendor: migrate to glide 2016-08-15 12:10:21 -07:00
e810dec662 Merge pull request #6182 from gyuho/fix
rafthttp: use reportCriticalError, fix typo
2016-08-15 11:48:30 -07:00
e8594b60b1 embed: use default route IP for default advertise URL
Fixes #2858
2016-08-15 11:12:26 -07:00
d23392ed8e netutil: GetDefaultHost for getting the default IP of the host machine 2016-08-15 11:12:26 -07:00
bd450c1ba3 rafthttp: use reportCriticalError, fix typo 2016-08-15 10:40:58 -07:00
561c3b918a Merge pull request #6179 from ypu/binDir
e2e: Update binary path with binDir
2016-08-15 10:36:04 -07:00
9eb6ea34bd Merge pull request #6175 from heyitsanthony/fix-conn-race
rafthttp: fix race between streamReader.stop() and connection closer
2016-08-15 09:27:24 -07:00
d0d8e49e20 e2e: Update binary path with binDir
Signed-off-by: Yiqiao Pu <ypu@redhat.com>
2016-08-15 17:22:42 +08:00
911c8442b7 rafthttp: fix race between streamReader.stop() and connection closer 2016-08-15 01:36:09 -07:00
96e018634a Merge pull request #6173 from gyuho/ccc
pkg/httputil: simplify RequestCanceler args
2016-08-14 20:41:19 -07:00
f14fd43548 proxy/httpproxy: fix httputil.RequestCanceler 2016-08-14 14:37:08 -07:00
0503676bde rafthttp: fix httputil.RequestCanceler 2016-08-14 14:36:51 -07:00
ae4b4109b2 pkg/httputil: simplify RequestCanceler args 2016-08-14 14:35:50 -07:00
1b5a129bbe Merge pull request #6171 from gyuho/go-vet
*: fix spell errors from go report card
2016-08-13 23:10:17 -07:00
19b35c939a proxy/grpcproxy: fix spell 'gropu' to 'group' 2016-08-13 20:55:15 -07:00
4d3b281369 etcdserver: fix spell errors 2016-08-13 20:54:48 -07:00
6b671b88dc etcdctl/ctlv3: fix spell errors 2016-08-13 20:54:27 -07:00
d788eb8d92 Merge pull request #6038 from gyuho/leader
*: transfer leadership when stopping leader
2016-08-13 14:47:54 -07:00
a205242ca5 integration: add 'TestTransferLeader/Stop' 2016-08-13 14:32:01 -07:00
64a0e34602 etcdserver: transfer leadership when stopping 2016-08-13 14:31:58 -07:00
7b11c288fe Merge pull request #6169 from sinsharat/master
etcdserver: optimized verifying local member
2016-08-12 19:09:55 -07:00
1fec4ba127 etcdserver: optimized verifying local member
Moved the code that prepares and sorts the advertised peer URLs, and
sorts the peer URLs, so that it runs only when strict verification is
needed. This avoids the processing when strict verification is not
required, as in the VerifyJoinExisting function.

#6165
2016-08-13 06:17:21 +05:30
817de6d212 Merge pull request #6168 from heyitsanthony/fix-periodic-test-block
compactor: wait for After() in TestPeriodic
2016-08-12 13:54:06 -07:00
5eff6fb7db compactor: wait for After() in TestPeriodic
If the test calls clock.Advance() after the compactor checks clock.Now()
but before the compactor calls clock.After(), the compactor will wait
forever on clock.After() expecting the lost clock.Advance().

Reproduced failure by putting a Sleep() in the clock.Now() continue path.

Fixes #6060 (again)
2016-08-12 13:28:40 -07:00
f975fe8068 Merge pull request #6140 from gyuho/network-partition
*: add network partition tests
2016-08-12 12:33:24 -07:00
0a00328a7c integration: add network partition tests 2016-08-12 12:15:29 -07:00
82a3d90763 Merge pull request #6167 from xiang90/fix_txn_rev
etcdserver: fix wrong rev in header when nothing actually got executed
2016-08-12 12:14:48 -07:00
92a0f08722 etcdserver: fix wrong rev in header when nothing actually got executed 2016-08-12 11:44:13 -07:00
67b1c7cce5 Merge pull request #6166 from heyitsanthony/clientv3-nonblock-new
clientv3: support non-blocking New()
2016-08-12 10:57:45 -07:00
429d5ab20b clientv3: only block on New() when DialTimeout > 0
Fixes #6162
2016-08-12 10:33:11 -07:00
c6c6cfb502 etcdserver: implement 'CutPeer', 'MendPeer' 2016-08-12 07:38:52 -07:00
c33ea20fef Merge pull request #6161 from sinsharat/master
etcdserver: stats/server - refactored
2016-08-11 17:03:23 -07:00
965b2901d5 Merge pull request #6156 from heyitsanthony/remove-member-quorum
etcdserver: reject member removal that breaks active quorum
2016-08-11 11:40:38 -07:00
aa9837e8ff e2e: support --strict-reconfig-check=false 2016-08-11 11:14:14 -07:00
e742ff331f integration: test member removal which breaks active quorum is rejected 2016-08-11 11:14:14 -07:00
6205a9a6cb etcdserver: stats/server - refactored
removed code duplication and improved readability

#6160
2016-08-11 22:09:25 +05:30
de06dc1272 Merge pull request #6155 from gyuho/raft-leader-transfer
*: expose Raft leader transfer
2016-08-11 08:03:28 -07:00
d3812ed664 Merge pull request #6157 from siddontang/siddontang/fix-overflow
raft: fix overflow
2016-08-11 07:53:48 -07:00
f8ee322b08 raft: fix overflow 2016-08-11 09:24:49 +08:00
8a32929d29 Merge pull request #6154 from gyuho/rafthttp-pause
rafthttp: add Transport.Cut/MendPeer
2016-08-10 17:10:30 -07:00
937ae658dd rafthttp: add Transport.Cut/MendPeer
From https://github.com/coreos/etcd/pull/6140.
2016-08-10 17:09:35 -07:00
a1ce07a321 etcdserver: reject member removal that breaks the current active quorum 2016-08-10 17:00:39 -07:00
a56cb82180 etcdserver: add TransferLeadership for raft.Node 2016-08-10 16:26:11 -07:00
e64ef3f261 raft: add 'TransferLeadership' to Node interface 2016-08-10 16:25:22 -07:00
f4141f0f51 raft: handle 'MsgTransferLeader' in follower 2016-08-10 16:24:29 -07:00
d72cee1b0c Merge pull request #6153 from gyuho/example
clientv3: update base example with TLS
2016-08-10 14:53:42 -07:00
1644679d00 clientv3: add 'ExampleConfig_withTLS' 2016-08-10 14:37:34 -07:00
7eb43ea75b Merge pull request #6152 from xiang90/fix_count
mvcc: fix count
2016-08-10 11:42:10 -07:00
f5549cba2a Merge pull request #6151 from heyitsanthony/configfile-defaults
embed: load config defaults before loading from file
2016-08-10 11:27:57 -07:00
de864d3b58 mvcc: fix count 2016-08-10 10:54:25 -07:00
2bb1f9c8a4 Merge pull request #6150 from gyuho/metrics
etcdserver: use Counter for proposals_failed_total
2016-08-10 09:59:10 -07:00
eb97aba581 e2e: test etcd boots with example config file 2016-08-10 09:45:17 -07:00
6de993b468 embed: load config defaults before loading config from file 2016-08-10 09:44:50 -07:00
06e2338108 Merge pull request #6113 from ypu/e2e
Add some test flags for e2e test
2016-08-10 09:28:27 -07:00
d219e96359 etcdserver: use Counter for proposals_failed_total
It only ever goes up.
2016-08-10 09:27:51 -07:00
b6f5b6b1c9 Merge pull request #6147 from sinsharat/master
etcdserver: Error handling for invalid empty raft cluster
2016-08-10 08:52:45 -07:00
2b5a5c77cf etcdserver: Error handling for invalid empty raft cluster
Implements the TODO: GetClusterFromRemotePeers should not return a nil
error with an invalid empty cluster.

#6137
2016-08-10 19:23:19 +05:30
a5e4fbd335 e2e: Make the certificate file path configurable
This commit makes it more convenient to run the e2e tests in an
environment without the e2e source code.

Signed-off-by: Yiqiao Pu <ypu@redhat.com>
2016-08-10 15:40:12 +08:00
2ca87f6c03 e2e: Make it runnable with an existing binary
Add the bin-dir option to the command line, so the e2e tests can
run with an existing binary. For example (run the command under the
e2e directory):
go test -v -timeout 10m -bin-dir /usr/bin -cpu 1,2,4

Signed-off-by: Yiqiao Pu <ypu@redhat.com>
2016-08-10 15:40:12 +08:00
81f5e31ed2 Merge pull request #6142 from heyitsanthony/fix-cancel-watch-imm
clientv3: handle watchGrpcStream shutdown if prior to goroutine start
2016-08-09 20:53:56 -07:00
2d3eda4afa Merge pull request #6139 from aaronlehmann/export-segment-size
wal: Export SegmentSizeBytes as a variable
2016-08-09 20:39:56 -07:00
1c83a46c6d clientv3: handle watchGrpcStream shutdown if prior to goroutine start
Fixes #6141
2016-08-09 19:59:04 -07:00
2b996b6038 wal: Export SegmentSizeBytes as a variable
In test situations, it's useful to create smaller than usual WAL files
to test rotation and to avoid the overhead of preallocation on old-style
filesystems that don't handle it efficiently. This commit changes
segmentSizeBytes to an exported variable so that tests can override it
from an init() function.
2016-08-09 15:38:30 -07:00
88a77f30e1 Merge pull request #6136 from heyitsanthony/fix-watcher-leak
clientv3: close watcher stream once all watchers detach
2016-08-09 10:23:15 -07:00
8c1c291332 clientv3/integration: test watcher cancelation propagation to server 2016-08-09 00:10:57 -07:00
5e651a0d0d clientv3: close watcher stream once all watchers detach
Fixes #6134
2016-08-09 00:10:57 -07:00
c3c41234f1 integration: support querying member metrics 2016-08-08 23:45:50 -07:00
c7e4198742 Merge pull request #6129 from xiang90/fix_raft
raft: fix getting unapplied log entries
2016-08-08 16:30:42 -07:00
8f3a11c73c Merge pull request #6105 from gyuho/release-notes
NEWS: add release notes for >v3.0.0 releases
2016-08-08 15:39:14 -07:00
f58a119b44 Merge pull request #6132 from gyuho/manual
Documentation/dev-guide: add bash syntax to doc
2016-08-08 15:26:14 -07:00
adbd936f22 Documentation/dev-guide: add bash syntax to doc 2016-08-08 15:06:02 -07:00
39f39c185e NEWS: add release notes for >v3.0.0 releases
Fix https://github.com/coreos/etcd/issues/6049.
2016-08-08 15:01:17 -07:00
918af500c3 Merge pull request #6130 from gyuho/port-e2e
e2e: use unix port for release tests
2016-08-08 14:41:09 -07:00
311c19e494 e2e: use unix port for release tests
Fix https://github.com/coreos/etcd/issues/5947.

When we restart, the previous port could still be bound
by the OS. Use a Unix port to avoid such rebind cases.
2016-08-08 14:26:19 -07:00
5f0c122496 raft: fix getting unapplied log entries 2016-08-08 10:44:02 -07:00
bb28c9ab00 Merge pull request #6126 from gyuho/tester
etcd-tester: fix tester for 5-node cluster
2016-08-07 21:58:49 -07:00
c6cf015e26 etcd-tester: fix tester for 5-node cluster
1. fix failure case counting
2. match ErrClientConnClosing in stresser
3. longer timeout for set-health-key
4. fixed range for range/delete stresser
5. remove Limit in RangeRequest
2016-08-07 21:15:01 -07:00
fb7c4da361 Merge pull request #6124 from heyitsanthony/share-limiter
functional-tester: share limiter among stresser
2016-08-07 19:17:06 -07:00
978ae9de29 functional-tester: share limiter among stresser
Otherwise, adding more members stresses the cluster with more ops.
2016-08-07 19:15:00 -07:00
7678b84f2c Merge pull request #6123 from xiang90/fix_limiter
tools/functional-tester: fix limiter
2016-08-07 16:20:17 -07:00
619a40b22b Merge pull request #6122 from xiang90/debug_stresser
tools/functional-tester: better logging
2016-08-07 16:17:52 -07:00
f6a1585902 functional-tester: reduce rate to 3000 2016-08-07 14:34:01 -07:00
107a07563f tools/functional-tester: fix limiter 2016-08-07 14:28:16 -07:00
69204397ee tools/functional-tester: better logging 2016-08-07 14:21:44 -07:00
f505bcb91a Merge pull request #6117 from gyuho/lease-test
integration: add more lease tests
2016-08-05 19:25:07 -07:00
f1f31f1015 integration: add more lease tests
Fix https://github.com/coreos/etcd/issues/6102.
2016-08-05 19:09:46 -07:00
c71f0ea174 Merge pull request #6106 from heyitsanthony/strict-reconfig-healthy
etcdserver, embed: stricter reconfig checking
2016-08-05 17:15:01 -07:00
9063ce5e3f etcdserver, embed: stricter reconfig checking
Make --strict-reconfig-check a default and check if cluster is healthy when
adding a member.
2016-08-05 16:59:25 -07:00
9764652356 Merge pull request #6081 from gyuho/functional-tester
etcd-tester: delete/range with limit, clean up
2016-08-05 11:28:41 -07:00
854a215329 etcd-tester: delete/range with limit, clean up 2016-08-05 11:21:36 -07:00
4a7fabd219 Merge pull request #6098 from xiang90/lease
Fix Lease
2016-08-05 10:08:24 -07:00
6c3efde51b Merge pull request #6099 from sinsharat/master
raft: handling of applying old snapshots
2016-08-05 07:38:07 -07:00
d69d438289 *: minor cleanup for lease 2016-08-04 20:39:32 -07:00
7ed8a133d2 Merge pull request #6104 from gyuho/typo
pkg/transport: fix minor typo
2016-08-04 16:03:40 -07:00
c38f0290a7 pkg/transport: fix minor typo 2016-08-04 16:00:18 -07:00
c46955b60a Merge pull request #6097 from swingbach/master
raft: fix #6096
2016-08-04 11:40:02 -07:00
e2a956c0c4 Merge pull request #6100 from gyuho/sort-comment
clientv3: ignore sort-ascend-key option
2016-08-04 11:28:49 -07:00
bd62b0a646 mvcc: attach keys to leases after recover all state
The previous logic was wrong. With a history like Put(foo, bar, lease1)
followed by Put(foo, bar, lease2), we would end up attaching foo to both
lease 1 and lease 2. A similar problem could occur when detaching a key by
clearing its lease.

To fix this, we now start attaching leases only at the end of the recovery.
We use a map to keep the last lease attachment state.
2016-08-04 11:17:58 -07:00
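The recovery approach described in bd62b0a646 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the mvcc package's code: `LeaseID`, `NoLease`, `putRecord`, `lastLeases`, and `attachAll` are hypothetical names. The sketch replays the history, remembers only the most recent lease per key in a map, and attaches keys to leases once the replay is finished.

```go
package main

// LeaseID is a hypothetical lease identifier type for this sketch.
type LeaseID int64

// NoLease marks a Put that carries no lease (assumed convention).
const NoLease LeaseID = 0

// putRecord is one replayed Put from the recovered history (hypothetical).
type putRecord struct {
	Key   string
	Lease LeaseID
}

// lastLeases replays the history in order and keeps only the most recent
// lease for each key. A later Put without a lease detaches the key.
func lastLeases(history []putRecord) map[string]LeaseID {
	last := make(map[string]LeaseID)
	for _, r := range history {
		if r.Lease == NoLease {
			delete(last, r.Key)
			continue
		}
		last[r.Key] = r.Lease
	}
	return last
}

// attachAll groups keys by lease after the replay has finished, so each
// key ends up attached to at most one lease.
func attachAll(last map[string]LeaseID) map[LeaseID][]string {
	attached := make(map[LeaseID][]string)
	for key, id := range last {
		attached[id] = append(attached[id], key)
	}
	return attached
}
```

With the history Put(foo, lease1), Put(foo, lease2), the map holds only foo -> lease 2 when the replay ends, so foo is never attached to lease 1.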
ddddecc3ab clientv3: ignore sort-ascend-key option 2016-08-04 11:13:41 -07:00
75c06cacae lease: do lease deletion in the kv txn 2016-08-04 10:06:47 -07:00
4d59b6f52c lease: delete kvs in a txn 2016-08-04 10:06:46 -07:00
fd757756f5 raft: handling of applying old snapshots
There was a TODO to handle ErrorSnapshotOutOfDate in the function
ApplySnapshot. This commit implements it.

#6090
2016-08-04 21:08:24 +05:30
29a077bdbe etcdserver: always recover lessor first 2016-08-04 08:06:19 -07:00
41dee84733 raft: fix #6096 2016-08-04 18:31:22 +08:00
eb36d0dbba Merge pull request #6084 from heyitsanthony/srv-servername
etcdctl: set TLS servername on discovery
2016-08-03 23:51:11 -07:00
a752338d45 Documentation: update clustering guide about PKI SRV record forging 2016-08-03 22:28:03 -07:00
d1809830bb embed: use ServerName on TLS DNS discovery without CA file 2016-08-03 22:28:03 -07:00
ab4ac828f3 etcdmain: check TLS on gateway SRV records 2016-08-03 22:28:03 -07:00
e218834b58 etcdctl: set ServerName for TLS when using --discovery-srv 2016-08-03 22:28:03 -07:00
cd781bf30c transport: add ServerName to TLSConfig and add ValidateSecureEndpoints
ServerName prevents accepting forged SRV records with cross-domain
credentials. ValidateSecureEndpoints prevents downgrade attacks from SRV
records.
2016-08-03 22:28:03 -07:00
6e7baab32c Merge pull request #6070 from swingbach/master
raft: fix #6068
2016-08-03 19:59:07 -07:00
cabd28516c Merge pull request #6092 from gyuho/transport
pkg/transport: update scheme to unix without copy
2016-08-03 10:59:00 -07:00
c8cc87c3f5 pkg/transport: update scheme to unix copying URL 2016-08-03 10:35:28 -07:00
bc9882f521 Merge pull request #6087 from xiang90/grpc_create
grpcproxy: handle create event
2016-08-03 09:31:33 -07:00
57c68ab1db grpcproxy: handle create event 2016-08-02 20:51:30 -07:00
c30a436829 Merge pull request #6086 from xiang90/sc
clientv3: add send created notification
2016-08-02 20:27:04 -07:00
33c3583b50 clientv3: add send created notification 2016-08-02 20:08:11 -07:00
76e62c39b0 Merge pull request #6085 from heyitsanthony/lease-elect-timeout
etcdserver, lease: tie lease min ttl to election timeout
2016-08-02 13:27:04 -07:00
bf71497537 etcdserver, lease: tie lease min ttl to election timeout 2016-08-02 13:06:57 -07:00
c0a8da7fd0 raft: minor refactor 2016-08-02 08:46:43 +08:00
4db07dbc93 Merge pull request #6079 from gyuho/cleanup-functional-tester
etcd-tester: remove unnecessary arg from stresser
2016-08-01 15:40:50 -07:00
755eee0d30 etcd-tester: remove unnecessary arg from stresser 2016-08-01 15:35:31 -07:00
b23045e34d Merge pull request #6078 from gyuho/release-note
dev-internal: update release note
2016-08-01 15:14:05 -07:00
fc4b30a1e0 dev-internal: update release note
For https://github.com/coreos/etcd/issues/6049.
2016-08-01 15:09:47 -07:00
9836990aa7 Merge pull request #6077 from gyuho/auth-guest
v2http: use guest access in non-TLS mode
2016-08-01 14:32:46 -07:00
87498e0209 v2http: use guest access in non-TLS mode
Fix https://github.com/coreos/etcd/issues/6075.
2016-08-01 14:00:38 -07:00
59ac42ff38 Merge pull request #6073 from heyitsanthony/rafthttp-close-stream
rafthttp: close http socket when pipeline handler gets a raft error
2016-07-31 21:49:04 -07:00
911dcc9386 rafthttp: close http socket when pipeline handler gets a raft error
Otherwise the http stream remains open and keeps receiving raft messages.
This can lead to "raft: stopped" log spam on closing an embedded server.

Fixes #5981
2016-07-31 20:25:42 -07:00
a2715e3bda Merge pull request #6072 from xiang90/tls_err
Log TLS error in health checking
2016-07-31 20:17:47 -07:00
9311d7b77e rafthttp: log health checking error early 2016-07-31 19:58:22 -07:00
5a83f05e96 dep: update probing 2016-07-31 18:24:00 -07:00
a60387bab2 Merge pull request #6001 from mitake/auth-errcode
client, etcdserver: propagate status code of auth related error
2016-07-31 08:28:41 -07:00
564bf8d17e client: utility functions for getting detail of v2 auth errors
Current v2 auth API doesn't propagate its error code. This commit adds
utility functions for parsing error messages and getting detail of v2
auth errors.

Fixes https://github.com/coreos/etcd/issues/5894
2016-07-31 21:23:58 +09:00
4d309f0cb7 Merge pull request #6054 from heyitsanthony/serialize-refactor
etcdserver: apply serialized requests outside auth apply lock
2016-07-30 22:44:26 -07:00
06da46c4ee etcdserver: apply serialized requests outside auth apply lock
Fixes #6010
2016-07-30 22:00:49 -07:00
b43722dd48 Merge pull request #6069 from xiang90/raft_doc
raft: better doc
2016-07-30 21:11:57 -07:00
8d12017fe2 raft: better doc 2016-07-30 21:11:37 -07:00
992f628e6e raft: fix #6068 2016-07-30 03:27:29 +08:00
e2088b8073 Merge pull request #6063 from siddontang/siddontang/embed-handler
embed: support registering client handlers
2016-07-27 22:57:27 -07:00
86de0797e1 embed: support registering user handlers 2016-07-28 13:39:06 +08:00
72eb2d8893 Merge pull request #6064 from heyitsanthony/clientv3-watch-filter
clientv3: watch filters
2016-07-27 21:48:25 -07:00
4c9a2a65c9 integration: test clientv3 watch filters 2016-07-27 21:25:06 -07:00
943fe70178 clientv3: support watch filters 2016-07-27 21:24:52 -07:00
79d25a6884 Merge pull request #6061 from heyitsanthony/fix-snapshot-test
etcdserver: don't race when waiting for store in TestSnapshot
2016-07-27 19:15:41 -07:00
3d8e4ace47 Merge pull request #6062 from heyitsanthony/fix-test-periodic
compactor: fix race in TestPeriodic
2016-07-27 19:15:27 -07:00
76a99fa1c3 compactor: fix race in TestPeriodic
Test ordering now similar to TestPeriodicPause

Fixes #6060
2016-07-27 16:03:22 -07:00
1153350a95 Merge pull request #6059 from jlamillan/honor_global_output
etcdctl: Add support for formatting output of key related commands
2016-07-27 16:03:17 -07:00
cfe09d34b8 etcdserver: don't race when waiting for store in TestSnapshot 2016-07-27 15:37:27 -07:00
205f10aeb6 etcdctl: Add support for formatting output of key related commands
All v2 key and dir related commands will now honor the global format option if
it was specified. Otherwise, the output will remain the same.
2016-07-27 14:17:19 -07:00
6136b26f38 Merge pull request #6056 from gyuho/gateway
scripts/genproto: use latest grpc-gateway c8ec92d0
2016-07-27 13:37:35 -07:00
273c6f6ba9 Merge pull request #6058 from gyuho/dockerfile-release
Dockerfile-release: add '/var/lib/etcd/'
2016-07-27 13:37:26 -07:00
de99dfb134 Dockerfile-release: add '/var/lib/etcd/'
We have '/var/etcd/' in the Dockerfile for historical reasons.
In most cases, users store data in '/var/lib/etcd/'.
2016-07-27 13:24:07 -07:00
982e18d80b *: regenerate proto with latest grpc-gateway 2016-07-27 13:21:03 -07:00
6e95ce26fb scripts/genproto: use latest grpc-gateway c8ec92d0 2016-07-27 13:20:15 -07:00
13c2d32061 Merge pull request #6045 from heyitsanthony/fix-version-race
etcdserver, api, membership: don't race on setting version
2016-07-27 08:56:39 -07:00
a75688bd17 Merge pull request #6039 from xiang90/fix_r
raft: hide Campaign rules on applying all entries
2016-07-26 20:52:09 -07:00
3c3b33b00f Merge pull request #5911 from mitake/skip-apply-txn
etcdserver: skip range requests in txn if the result is needless
2016-07-26 20:48:41 -07:00
0090573749 etcdserver: skip range requests in txn if the result is needless
If a server isn't serving txn requests from a client, the server
doesn't need the result of range requests in the txn.

This is a follow-up to
https://github.com/coreos/etcd/pull/5689
2016-07-26 19:49:07 -07:00
de2c3ec3db etcdserver, api, membership: don't race on setting version
Fixes #6029
2016-07-26 18:21:40 -07:00
640d511684 Merge pull request #6047 from gyuho/doc
Documentation: fix links in upgrades
2016-07-26 12:54:14 -07:00
914e9266cb Documentation: fix links in upgrades 2016-07-26 12:51:59 -07:00
0d6c028aa2 Merge pull request #6032 from xiang90/gateway
fix a few issues in grpc gateway
2016-07-25 16:48:38 -07:00
484f579905 raft: hide Campaign rules on applying all entries 2016-07-25 15:53:39 -07:00
864947a825 Merge pull request #6037 from heyitsanthony/disable-tracing
etcdmain: disable grpc tracing by default
2016-07-25 15:20:28 -07:00
d6b22323a8 etcdmain: disable grpc tracing by default 2016-07-25 14:23:36 -07:00
6079be7dae Merge pull request #6036 from heyitsanthony/fix-embed-defaults
embed: add listen urls to default config
2016-07-25 12:49:45 -07:00
537057bd11 Merge pull request #6033 from heyitsanthony/watch-adapter
integration: support watch with cluster_proxy tag
2016-07-25 11:34:15 -07:00
42fc36b4d6 embed: add listen urls to default config
Was only setting the advertise urls.
2016-07-25 11:06:03 -07:00
7f0f9795bf Merge pull request #6028 from xiang90/plat
doc: update platform.md
2016-07-25 09:58:55 -07:00
2b4c37f54a grpcproxy: don't leak goroutines on watch proxy shutdown 2016-07-25 09:34:36 -07:00
418bb5e176 grpcproxy: bind clientv3.Watcher on initialization 2016-07-25 09:34:36 -07:00
1cad722a6d integration: support watch apis in cluster_proxy build 2016-07-25 09:34:36 -07:00
ac96963003 clientv3: support creating a Watch from a WatchClient 2016-07-25 09:34:36 -07:00
4fa9363aca grpcproxy: client watch adapter 2016-07-25 09:34:36 -07:00
020a24f1c3 *: regenerate proto for handling eof error 2016-07-23 16:21:44 -07:00
38b69a9301 scripts:genproto.sh: update grpc-gateway 2016-07-23 16:18:42 -07:00
fffa484a9f *: regenerate proto for adding deleterange 2016-07-23 16:17:44 -07:00
b4ce427d45 etcdserverpb: add missing deleterange annotation 2016-07-23 15:59:53 -07:00
116a1b5855 Merge pull request #6031 from gyuho/vet-fix
grpcproxy: define 'watchergroups' in pointer
2016-07-22 17:39:36 -07:00
abbefc9e25 grpcproxy: define 'watchergroups' in pointer
To avoid copying mutex lock values
2016-07-22 16:54:11 -07:00
5b288f6cd1 Merge pull request #6030 from gyuho/raft-raft
raft: replace 'reflect.DeepEqual' with bytes.Equal
2016-07-22 16:53:11 -07:00
4ff6c72257 raft: replace 'reflect.DeepEqual' with bytes.Equal 2016-07-22 16:34:13 -07:00
8f4a36fd32 doc: update platform.md 2016-07-22 11:24:19 -07:00
ec5c5d9ddf Merge pull request #6021 from xiang90/gateway_test
e2e: add gateway test
2016-07-21 16:48:04 -07:00
c603b5e6a1 e2e: add gateway test 2016-07-21 16:19:54 -07:00
2bf55e3a15 Merge pull request #6016 from endocode/kayrus/fix_serve_err_return
embed: Fixed serve() err return
2016-07-21 11:17:08 -07:00
fee9e2b183 embed: Fixed serve() err return 2016-07-21 18:06:08 +02:00
de638a5e4d Merge pull request #5991 from gyuho/manual
v2http: client certificate auth via common name
2016-07-21 08:02:17 -07:00
214c1e55b0 Merge pull request #5999 from jlamillan/master
Add support for formatting output of ls command in json or extended fo…
2016-07-21 07:09:52 -07:00
32553c5796 Merge pull request #6006 from dongsupark/dongsu/fix-build-error-go-systemd
etcdmain: correctly check return values from SdNotify()
2016-07-21 07:08:58 -07:00
624187d25f etcdmain: correctly check return values from SdNotify()
SdNotify() now returns 2 values, sent and err. So startEtcdOrProxyV2()
needs to check the 2 return values correctly. As the 2 values are
independent of each other, error checking needs to be slightly updated
too.

SdNotifyNoSocket, which was previously provided by go-systemd, does not
exist any more. In that case (false, nil) will be returned instead.
2016-07-21 09:19:07 +02:00
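The two-value check described in 624187d25f above can be sketched as below. This is a hedged illustration only: `notifyFunc` and `notifyOutcome` are hypothetical names, and `notifyFunc` merely stands in for a call with the (sent, err) return shape that the commit message describes. It does not reproduce the go-systemd signature or the startEtcdOrProxyV2() code.

```go
package main

// notifyFunc stands in for a notification call that returns (sent, err).
// It is a hypothetical placeholder, not the go-systemd API.
type notifyFunc func() (sent bool, err error)

// notifyOutcome checks the two return values independently:
// an error means the notification failed, (false, nil) means there was
// no notification socket to talk to, and (true, nil) means it was sent.
func notifyOutcome(notify notifyFunc) string {
	sent, err := notify()
	switch {
	case err != nil:
		return "failed: " + err.Error()
	case !sent:
		return "not supported"
	default:
		return "sent"
	}
}
```

Checking err before sent keeps the "no socket" case, which the commit says now comes back as (false, nil), separate from a real failure.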
00c9fe4753 vendor: update go-systemd
Godeps.json and vendor need to be updated according to the newest
go-systemd, as SdNotify() in go-systemd has changed its API.
2016-07-21 08:20:52 +02:00
f18d5433cc etcdctl: Add support for formatting output of ls command in json
The ls command will check for and honor json or extended output formats.

Fixes #5993
2016-07-20 18:05:23 -07:00
42db8f55b2 e2e: test auth enabled with CN name cert 2016-07-20 16:55:45 -07:00
e001848270 Merge pull request #5772 from heyitsanthony/integration-proxy
integration: build tag for proxy
2016-07-20 16:28:12 -07:00
5066981cc7 v2http: test with 'ClientCertAuthEnabled' 2016-07-20 16:24:33 -07:00
25aeeb35c3 v2http: set 'ClientCertAuthEnabled' in client.go 2016-07-20 16:24:15 -07:00
68ece954fb v2http: add 'ClientCertAuthEnabled' in handlers 2016-07-20 16:23:41 -07:00
be001c44e8 embed: set 'ClientCertAuthEnabled' 2016-07-20 16:23:24 -07:00
9510bd6036 etcdserver: add 'ClientCertAuthEnabled' option 2016-07-20 16:22:59 -07:00
0f0d32b073 v2http: move 'testdata' from 'etcdhttp' 2016-07-20 16:20:42 -07:00
ff5709bb41 v2http: client cert cn authentication
introduce client certificate authentication using certificate cn.
2016-07-20 16:20:13 -07:00
ab17165352 v2http: refactor http basic auth
refactor http basic auth code to combine basic auth extraction and validation
2016-07-20 16:20:05 -07:00
768ccb8c10 grpcproxy: respect prev_kv flag 2016-07-20 15:58:33 -07:00
becbd9f3d6 test: grpcproxy integration test pass
Run via
PASSES=grpcproxy ./test
2016-07-20 15:58:33 -07:00
7b3d502b96 integration: build tag cluster_proxy for testing backed by proxy 2016-07-20 15:40:33 -07:00
17e0164f57 clientv3: add KV constructor using pb.KVClient 2016-07-20 15:40:33 -07:00
54df540c2c grpcproxy: wrapper from pb.KVServer to pb.KVClient 2016-07-20 15:40:33 -07:00
15aa64eb3c Merge pull request #6009 from heyitsanthony/fix-progress-notify
v3rpc: don't elide next progress notification on progress notification
2016-07-20 13:46:11 -07:00
65d7e7963a Merge pull request #6011 from heyitsanthony/fix-migrate-test
e2e: use a single member cluster in TestCtlV3Migrate
2016-07-20 13:27:17 -07:00
8c8742f43c integration: change timeouts for TestWatchWithProgressNotify
a) 2 * progress interval was passing with dropped notifies
b) waitResponse was waiting so long that it expected a dropped notify
2016-07-20 13:23:44 -07:00
a289bf58e6 e2e: use a single member cluster in TestCtlV3Migrate
Occasionally migrate would fail because a minority node would be missing
v2 keys. Instead, just use a single member cluster.

Fixes #5992
2016-07-20 12:10:09 -07:00
299ebc6137 v3rpc: don't elide next progress notification on progress notification
Fixes #5878
2016-07-20 11:37:20 -07:00
a7b098b26d Merge pull request #6007 from heyitsanthony/fix-watch-test
integration: fix race in TestV3WatchMultipleEventsTxnSynced
2016-07-20 10:34:54 -07:00
82ddeb38b4 integration: fix race in TestV3WatchMultipleEventsTxnSynced
Writes between watcher creation request and reply were being dropped.

Fixes #5789
2016-07-20 09:55:39 -07:00
aba478fb8a Merge pull request #5793 from mitake/auth-revision
auth, etcdserver: introduce revision of authStore for avoiding TOCTOU problem
2016-07-20 09:32:54 -07:00
edcfcae332 Merge pull request #5995 from heyitsanthony/clientv3-retry-stopped
rpctypes, clientv3: retry RPC on EtcdStopped
2016-07-20 08:54:14 -07:00
ef6b74411c auth, etcdserver: introduce revision of authStore for avoiding TOCTOU problem
This commit introduces revision of authStore. The revision number
represents a version of authStore that is incremented by updating auth
related information.

The revision is required for avoiding TOCTOU problems. Currently there
are two types of the TOCTOU problems in v3 auth.

The first one is in ordinary linearizable requests, with a sequence like
the one below:
1. Request from client CA is processed in follower FA. FA looks up the
   username (call it U) for the request from a token of the request. At
   this time, the request is authorized correctly.
2. Another request from client CB is processed in follower FB. CB
   is for changing U's password.
3. FB forwards the request from CB to the leader before FA. Now U's
   password is updated and the request from CA should be rejected.
4. However, the request from CA is processed by the leader because
   authentication is already done in FA.

For avoiding the above sequence, this commit lets
etcdserverpb.RequestHeader have a member revision. The member is
initialized during authentication by followers and checked in a
leader. If the revision in RequestHeader is lower than the leader's
authStore revision, it means a sequence like above happened. In such a
case, the state machine returns auth.ErrAuthRevisionObsolete. The
error code lets nodes retry their requests.

The second one, a case of serializable range and txn, is more
subtle, because these requests are processed in the follower directly. The
TOCTOU problem can be caused by a sequence like the one below:
1. Serializable request from client CA is processed in follower FA. At
   first, FA looks up the username (call it U) and its permission
   before actual access to KV.
2. Another request from client CB is processed in follower FB and
   forwarded to the leader. The cluster including FA now commits a log
   entry of the request from CB. Assume the request changed the
   permission or password of U.
3. Now the serializable request from CA accesses the KV. Even if
   the access was allowed at the point of 1, it can now be invalid
   because of the change introduced in 2.

For avoiding the above sequence, this commit lets the functions of
serializable requests (EtcdServer.Range() and EtcdServer.Txn())
compare the revision in the request header with the latest revision of
authStore after the actual access. If the saved revision is lower than
the latest one, it means the permission may have changed. Although it
would introduce false positives (e.g. changing other user's password),
it prevents the TOCTOU problem. This idea is an implementation of
Anthony's comment:
https://github.com/coreos/etcd/pull/5739#issuecomment-228128254
2016-07-20 14:39:04 +09:00
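The revision check described in the commit above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual etcd code: `authStore`, `requestHeader`, `checkRevision`, and `serializableRange` are hypothetical, simplified stand-ins, and `errAuthRevisionObsolete` only stands in for auth.ErrAuthRevisionObsolete.

```go
package main

import (
	"errors"
	"fmt"
)

// errAuthRevisionObsolete is a stand-in for auth.ErrAuthRevisionObsolete.
var errAuthRevisionObsolete = errors.New("auth: revision in header is old")

// authStore is a hypothetical, simplified store. Its revision is assumed
// to grow whenever users, roles, or permissions change.
type authStore struct {
	revision uint64
}

// requestHeader models only the revision member described in the commit.
type requestHeader struct {
	authRevision uint64
}

// checkRevision returns an error when the header was stamped under an
// older auth revision than the store currently holds.
func checkRevision(as *authStore, h requestHeader) error {
	if h.authRevision < as.revision {
		return errAuthRevisionObsolete
	}
	return nil
}

// serializableRange sketches the second case: perform the read first,
// then re-check the revision after the actual access.
func serializableRange(as *authStore, h requestHeader, read func() string) (string, error) {
	v := read()
	if err := checkRevision(as, h); err != nil {
		return "", err
	}
	return v, nil
}

func main() {
	as := &authStore{revision: 3}
	// The follower stamps the header at authentication time.
	h := requestHeader{authRevision: as.revision}
	fmt.Println(checkRevision(as, h))

	// A password or permission change is committed in between.
	as.revision++
	fmt.Println(checkRevision(as, h))
}
```

In this sketch any change to the store's revision invalidates in-flight requests, which matches the false positives the commit message accepts in exchange for closing the TOCTOU window.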
8abae076d1 rpctypes, clientv3: retry RPC on EtcdStopped
Fixes #5983
2016-07-19 18:29:12 -07:00
6e290abee2 Merge pull request #5998 from heyitsanthony/tls-timeout-conn
transport: wrap timeout listener with tls listener
2016-07-19 17:42:05 -07:00
99e0655c2f transport: wrap timeout listener with tls listener
Otherwise the listener will return timeoutConns, causing a type
assertion to tls.Conn in net/http to fail, so http.Request.TLS is never set.
2016-07-19 16:47:14 -07:00
80c2e4098d Merge pull request #5997 from xiang90/l_r
raft: fix readindex
2016-07-19 15:25:53 -07:00
1c5754f02d raft: fix readindex 2016-07-19 15:00:58 -07:00
e5f0cdcc69 Merge pull request #5984 from xiang90/p_b
grpcproxy: do not send duplicate events to watchers
2016-07-19 12:47:23 -07:00
783675f91c grpcproxy: do not send duplicate events to watchers 2016-07-19 10:14:57 -07:00
d3d954d659 Merge pull request #5990 from xiang90/wcr
clientv3/integration: fix race in TestWatchCompactRevision
2016-07-19 10:08:28 -07:00
e177d9eda2 clientv3/integration: fix race in TestWatchCompactRevision 2016-07-19 09:31:44 -07:00
1bf78476cf Merge pull request #5980 from xiang90/gateway
etcdmain: gateway supports dns srv discovery
2016-07-18 22:10:54 -07:00
c7c5cd324b etcdmain: gateway supports dns srv discovery 2016-07-18 21:53:24 -07:00
fcc96c9ebd Merge pull request #5976 from heyitsanthony/fix-kadc
integration: drain keepalives in TestLeaseKeepAliveCloseAfterDisconnectRevoke
2016-07-18 20:21:44 -07:00
d914502090 Merge pull request #5978 from heyitsanthony/fix-compactor
compactor: make event ordering well-defined in TestPeriodicPause
2016-07-18 20:06:57 -07:00
27a30768e1 integration: drain keepalives in TestLeaseKeepAliveCloseAfterDisconnectRevoke
Fixes #5900
2016-07-18 19:45:59 -07:00
a1d823c2aa compactor: make event ordering well-defined in TestPeriodicPause
Fixes #5847
2016-07-18 19:45:31 -07:00
a61862acc7 Merge pull request #5977 from xiang90/b_proxy
grpcproxy: return interface
2016-07-18 19:12:43 -07:00
5cccb49498 Merge pull request #5979 from heyitsanthony/unix-embed
embed: support unix sockets
2016-07-18 17:05:58 -07:00
5271cf0160 grpcproxy: return interface 2016-07-18 16:47:58 -07:00
8d897fd51f integration: use unix sockets in TestEmbedEtcd
Was getting tcp port conflicts in semaphore even after assigning unique ports.

Fixes #5953
2016-07-18 16:42:08 -07:00
e177f391f2 embed: support unix peers 2016-07-18 16:41:41 -07:00
32ed0aa0b3 Merge pull request #5626 from gyuho/stresser
etcd-tester: stress with range, delete
2016-07-18 15:26:34 -07:00
969bcd282b etcd-tester: stress with range, delete 2016-07-18 15:17:08 -07:00
7fbc1e39a6 Merge pull request #5973 from heyitsanthony/purge-test
fileutil: rework purge tests so they don't poll
2016-07-18 14:59:19 -07:00
7bfe75cbf3 Merge pull request #5963 from xiang90/p_filter
grpcproxy: add filter to watcher
2016-07-18 14:56:10 -07:00
3a5e418ff9 Merge pull request #5974 from xiang90/a_proxy
grpcproxy: add auth
2016-07-18 14:55:13 -07:00
cae56f583e Merge pull request #5975 from bts/restrict-channel-types-in-demo
contrib/raftexample: Restrict commit/error channel types in raftNode
2016-07-18 14:54:56 -07:00
e1892e264d grpcproxy: add auth 2016-07-18 14:26:22 -07:00
851d69181d Merge pull request #5972 from xiang90/m_proxy
grpcproxy: add maintenance proxy
2016-07-18 14:24:22 -07:00
b86e723107 contrib/raftexample: Restrict channel types 2016-07-18 17:19:54 -04:00
c920ce0453 fileutil: rework purge tests so they don't poll
Fixes #5966
2016-07-18 14:19:09 -07:00
fd24340903 grpcproxy: add maintenance proxy 2016-07-18 13:31:03 -07:00
58aa3483c3 grpcproxy: add filter to watcher 2016-07-18 13:02:34 -07:00
6dbdf6e55f Merge pull request #5958 from xiang90/lease_proxy
*: add lease proxy
2016-07-18 12:57:14 -07:00
3f74e9db0d *: add lease proxy 2016-07-18 12:06:59 -07:00
b61f882635 Merge pull request #5962 from xiang90/c_p
*: add cluster proxy
2016-07-18 11:55:35 -07:00
1c8b30dbdb Merge pull request #5957 from heyitsanthony/wait-panic
testutil, clientv3: wait for panics in txn tests to complete
2016-07-18 11:18:23 -07:00
dc80ae86d9 Merge pull request #5969 from gyuho/vendor-fix
*: fix 'gogo/protobuf' compatibility issue
2016-07-18 10:56:34 -07:00
8893ab0198 Merge pull request #5965 from endocode/kayrus/build_env
build: allow to build outside the etcd directory
2016-07-18 10:36:27 -07:00
984badeb03 testutil, clientv3: wait for panics in txn tests to complete
Fixes #5901
2016-07-18 09:37:33 -07:00
50be793f09 *: regenerate proto 2016-07-18 09:33:32 -07:00
e7c1594c82 vendor: update 'gogo/protobuf' 2016-07-18 09:33:09 -07:00
6e53f75092 scripts: update gogo/protobuf, use 'gofast' plugin
- Fix https://github.com/coreos/etcd/issues/5942
- Partial fix for https://github.com/coreos/etcd/issues/5865
2016-07-18 09:31:27 -07:00
cab2e45319 build: allow to build outside the etcd directory
Also adds a GOPATH hack that allows building without setting any GOPATH
env. Just run the build script once golang is installed.
2016-07-18 17:40:08 +02:00
336e4f2f28 Merge pull request #5960 from xiang90/a_i
etcdserver: set applied index correctly
2016-07-16 19:56:27 -07:00
d9e939d5d1 Merge pull request #5961 from heyitsanthony/test-e2e-unsupported
e2e: run e2e tests on unsupported architectures
2016-07-16 17:12:04 -07:00
52764f1e5a Merge pull request #5959 from heyitsanthony/build-same-place-xarch
build: build cross-compiled binaries in bin/ by default
2016-07-16 17:11:57 -07:00
bdfbd26e94 *: add cluster proxy 2016-07-16 12:15:32 -07:00
2d761d64a4 etcdserver: set applied index correctly 2016-07-16 11:44:18 -07:00
884452c403 e2e: run e2e tests on unsupported architectures 2016-07-16 10:30:19 -07:00
cb9ee7320b build: build cross-compiled binaries in bin/ by default
Otherwise GOARCH=386 PASSES="build integration" ./test fails on amd64
because the e2e tests can't find the binaries. Added a BINDIR option
for writing the build output to somewhere else, in case it's needed.
2016-07-16 10:21:25 -07:00
331ec82400 Merge pull request #5955 from gyuho/port
integration: new ports for embed test
2016-07-15 20:19:33 -07:00
4a5795b55f integration: new ports for embed test 2016-07-15 16:47:32 -07:00
04155423f5 Merge pull request #5956 from xiang90/fix_renew
*: fix issue found in fast lease renew
2016-07-15 15:53:59 -07:00
4835322aa1 Merge pull request #5954 from heyitsanthony/fix-embed-cfg-validate
embed: fix nil dereference on error to set up initial cluster
2016-07-15 15:41:40 -07:00
b26f1bb2b6 Merge pull request #5951 from gyuho/vendor
*: update grpc-gateway and its import paths
2016-07-15 15:33:18 -07:00
93e3112471 Merge pull request #5910 from xiang90/grpc_proxy
grpcproxy: initial watch proxy
2016-07-15 15:12:21 -07:00
3839a55910 *: fix issue found in fast lease renew 2016-07-15 15:07:15 -07:00
34602b87ec embed: fix nil dereference on error to set up initial cluster 2016-07-15 14:43:00 -07:00
5f3aa43899 grpcproxy: initial watch proxy 2016-07-15 14:30:45 -07:00
ecebe7b979 vendor: change to 'grpc-ecosystem' from 'gengo' 2016-07-15 13:29:05 -07:00
5b92e17e86 *: regenerate proto files 2016-07-15 13:24:19 -07:00
4a7b730e69 scripts: update genproto with grpc-ecosystem 2016-07-15 13:21:41 -07:00
4ec94989cf Documentation: change to grpc-ecosystem 2016-07-15 12:11:30 -07:00
b2b98399fb embed: change import path to 'grpc-ecosystem' 2016-07-15 12:10:38 -07:00
1ba7bb237f Merge pull request #5939 from heyitsanthony/x86-unit-test
travis: unit test on 386
2016-07-15 09:43:31 -07:00
38d38f2635 travis: unit test on 386 2016-07-14 20:23:35 -07:00
1dfafd8fe0 test: separate phases of tests into configurable passes 2016-07-14 20:23:35 -07:00
b50d2395fd Merge pull request #5949 from heyitsanthony/fix-functester-failfast
etcd-tester: add FailFast(false) to grpc calls
2016-07-14 19:50:21 -07:00
0419d3ecf7 etcd-tester: add FailFast(false) to grpc calls 2016-07-14 19:16:41 -07:00
3e21d9f023 Merge pull request #5945 from Infra-Red/patch-bench
hack/benchmark: remove deprecated boom parameter
2016-07-14 18:44:48 -07:00
bf0be0fe5e Merge pull request #5948 from heyitsanthony/upgrade-grpc-cred-clobber
vendor: update grpc
2016-07-14 18:44:35 -07:00
b3f8490660 integration: add FailFast(false) to failing tests 2016-07-14 17:58:58 -07:00
d8f0ef0e80 clientv3: use grpc.FailFast(false) for all calls 2016-07-14 17:58:58 -07:00
d9a8a326df vendor: update grpc
Fixes #5871
2016-07-14 17:58:58 -07:00
07ed4da2ff integration: test grpc error equivalence with Error() 2016-07-14 17:58:49 -07:00
51c5c307fa rpctypes: test error equivalence with Error()
grpc.Errorf() now returns *rpcError, which makes comparisons shallow.
2016-07-14 15:59:06 -07:00
ee78f590ba hack/benchmark: remove deprecated boom parameter
Benchmark script will fail to run with -readall flag provided
2016-07-14 23:49:53 +03:00
575682f593 Merge pull request #5944 from heyitsanthony/mvcc-failpoints
build, backend: add backend commit failpoints
2016-07-14 13:08:25 -07:00
14d7dc940d Merge pull request #5943 from gyuho/pause-before-compaction-2
etcd-tester: pause before compaction, fix races, cleanups
2016-07-14 13:02:53 -07:00
ba2725c2d0 build, backend: add backend commit failpoints 2016-07-14 12:26:35 -07:00
ceb9fe4822 etcd-tester: stop stress before compact, fix races
fix race condition between stresser cancel, start
2016-07-14 12:16:42 -07:00
b0f2e5e64a Merge pull request #5927 from xiang90/pacing
*: deny proposals when there is a huge gap between apply/commit
2016-07-14 11:47:53 -07:00
8e59fb749c etcd-tester: increase default qps, fix cleanup 2016-07-14 11:20:16 -07:00
c0cc161ba8 Merge pull request #5937 from coreos/revert-5876-manual
Revert "Dockerfile: use 'ENTRYPOINT' instead of 'CMD'"
2016-07-14 10:35:21 -07:00
27b03f0ed5 *: deny proposals when there is a huge gap between apply/commit 2016-07-14 10:02:55 -07:00
35d379b052 Merge pull request #5934 from heyitsanthony/fix-publish-race
e2e: wait for every etcd server to publish to cluster
2016-07-13 19:22:08 -07:00
2f7da66d43 Revert "Dockerfile: use 'ENTRYPOINT' instead of 'CMD'" 2016-07-13 19:06:20 -07:00
6b487fb199 e2e: wait for every etcd server to publish to cluster
If etcdctl accesses the cluster before all members are published, it
will get an "unsupported protocol scheme" error. To fix, wait for both
the capabilities and published message.

Fixes #5824
2016-07-13 17:01:43 -07:00
3d109be3b4 Merge pull request #3621 from yichengq/usage-stderr
etcdmain: print usage in stderr when flag.Parse fail
2016-07-13 16:56:26 -07:00
071eac3838 Merge pull request #5918 from xiang90/init
etcdmain: only get initial cluster setting if the member is not initi…
2016-07-13 16:28:32 -07:00
c7881fddc2 Merge pull request #5933 from xiang90/doc
doc: better link for embed etcd
2016-07-13 16:27:32 -07:00
9bcf5a83fb doc: better link for embed etcd 2016-07-13 16:24:56 -07:00
c32dd164fe Merge pull request #5932 from heyitsanthony/nuke-etcdctl-v0.4
etcdctl: remove v0.4 support
2016-07-13 16:18:33 -07:00
8368e6a992 embed: only get initial cluster setting if the member is not init 2016-07-13 16:03:27 -07:00
97ff1abb3e etcdctl: remove 0.4 import command 2016-07-13 15:30:37 -07:00
439b96f090 etcdctl: remove 0.4 peer syncing 2016-07-13 15:26:25 -07:00
06fd46f835 Merge pull request #5928 from xiang90/err_code
rpctypes: use permission deny code for permission deny error
2016-07-13 15:00:02 -07:00
41a98dbd66 Merge pull request #5925 from heyitsanthony/embed-etcdmain
embeddable etcdmain
2016-07-13 13:51:19 -07:00
f6ef6157cc Documentation: link embedding etcd into docs 2016-07-13 13:28:11 -07:00
c0299ca6f4 integration: test embedded etcd 2016-07-13 10:40:03 -07:00
f4f33ea767 etcdmain, embed: export Config and StartEtcd into embed/
Lets programs embed etcd.

Fixes #5430
2016-07-13 10:40:03 -07:00
81d5ae3ce1 rpctypes: use permission deny code for permission deny error 2016-07-13 10:32:10 -07:00
7114a27345 Merge pull request #5922 from xiang90/l_b
tools/benchmark: add benchmark for lease keepalive
2016-07-12 10:48:04 -07:00
8273e1c07e tools/benchmark: add benchmark for lease keepalive 2016-07-12 10:40:56 -07:00
a243064e76 Merge pull request #5924 from xiang90/rm_04
integration: remove upgrade test for etcd0.4
2016-07-12 10:36:16 -07:00
6392ef5c44 integration: remove upgrade test for etcd0.4 2016-07-12 10:13:03 -07:00
7432e9fbe9 Merge pull request #5809 from swingbach/master
raft: make leader transferring workable when quorum check is on
2016-07-12 09:46:18 -07:00
b9f6de9277 Merge pull request #5895 from smallfish/master
etcdserver/api/v2http, Documentation: fix debug pprof index miss / in end
2016-07-12 07:10:53 -07:00
b2c1112288 Merge pull request #5921 from xiang90/r
raft: do not change RecentActive when resetState for progress
2016-07-12 06:54:14 -07:00
c36a40ca15 raft: introduce top-level context in message struct 2016-07-12 16:14:06 +08:00
eb08f2274e raft: do not change RecentActive when resetState for progress 2016-07-11 21:12:14 -07:00
cc26f2c889 Merge pull request #5913 from rboyer/correct-sample-peer-config-file
Correct security configuration for peers in sample config file.
2016-07-11 19:41:41 -07:00
4bc29e2b9c Merge pull request #5902 from mitake/bench-auth
tools: add --user for auth in benchmarks
2016-07-11 18:38:07 -07:00
8a21be721f Merge pull request #5919 from gyuho/raft-lead
raft: set leader id in stepFollower
2016-07-11 18:34:07 -07:00
7edb6bcbe1 etcd: correct security configuration for peers in sample config file 2016-07-11 20:19:27 -05:00
6f3a40cb53 raft: set leader id in stepFollower
Follower has already set its leader ID from
previous append messages from the leader, but
to be consistent, this adds a line to set its
leader id from leader snapshot message.
2016-07-11 16:37:31 -07:00
ea0a569c4d Merge pull request #5917 from xiang90/rm
*: remove unnecessary data upgrade code
2016-07-11 15:38:07 -07:00
f65e75e4b3 *: remove unnecessary data upgrade code 2016-07-11 15:11:56 -07:00
c0f292e6b8 Merge pull request #5916 from xiang90/ctl
etcdctl: only takes 127.0.0.1:2379 as default endpoint
2016-07-11 13:34:55 -07:00
55ca788efe etcdctl: only takes 127.0.0.1:2379 as default endpoint 2016-07-11 13:28:02 -07:00
2b6f04a58e Merge pull request #5906 from gyuho/release-test
e2e: add basic upgrade tests
2016-07-11 12:42:54 -07:00
a3347e3e68 Merge pull request #5915 from heyitsanthony/doc-new-platform
Documentation: clarify support policy
2016-07-11 12:41:38 -07:00
5b0d52f8c3 Documentation: clarify support policy 2016-07-11 12:10:17 -07:00
e8e561e8f5 e2e: add basic upgrade tests 2016-07-11 11:28:04 -07:00
e5b5cf02d3 test: add upgrade test flag 2016-07-11 11:10:24 -07:00
0d9b6ba0ab raft: fix a few problems 2016-07-11 14:59:53 +08:00
da44e17b58 Merge pull request #5908 from gyuho/raft-cleanup
raft: remove unnecessary type-cast, else-clause
2016-07-10 16:42:42 -07:00
c396b6aaaa raft: remove unnecessary type-cast, else-clause 2016-07-09 22:01:19 -07:00
c47689d98f Merge pull request #5689 from mitake/skip-apply
RFC: etcdserver, pkg: skip needless log entry applying
2016-07-10 01:23:35 +09:00
474eb1b44b Merge pull request #5890 from jaredeh/32bit
Easy 32bit architecture fixes
2016-07-08 13:36:52 -07:00
f78d4713ea etcdserver: atomic access alignment
Most fields accessed with sync/atomic functions are 64bit aligned, but a couple
are not.  This makes comments out of date and therefore misleading.

Affected fields reordered, comments scrubbed and updated.
2016-07-08 11:20:47 -07:00
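The alignment concern behind the commit above can be illustrated with a small sketch. On 32-bit platforms, Go's sync/atomic requires the caller to keep 64-bit words that are accessed atomically 64-bit aligned, and the first word of an allocated struct can be relied upon to be aligned. The `server` struct and the `fieldOffsets` helper below are hypothetical and are not etcd's types.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// server is a hypothetical struct used only for illustration.
type server struct {
	// 64-bit fields accessed with sync/atomic are placed first so that
	// they stay 64-bit aligned on 32-bit platforms. Keep them at the top.
	appliedIndex   uint64
	committedIndex uint64

	id   uint32
	name string
}

// fieldOffsets returns the byte offsets of the two atomic fields.
func fieldOffsets() (uintptr, uintptr) {
	var s server
	return unsafe.Offsetof(s.appliedIndex), unsafe.Offsetof(s.committedIndex)
}

func main() {
	s := &server{}
	atomic.AddUint64(&s.appliedIndex, 1)
	atomic.StoreUint64(&s.committedIndex, 2)

	a, c := fieldOffsets()
	// Offsets that are multiples of 8 keep the fields aligned whenever
	// the struct itself starts on a 64-bit boundary.
	fmt.Println(a%8 == 0, c%8 == 0)
}
```

Putting a 32-bit field such as `id` ahead of the counters would shift them to a 4-byte offset on 32-bit builds, which is the kind of misalignment the field reordering avoids.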
90889ebc0f raftpb: atomic access alignment
The Entry struct has misaligned fields that are accessed atomically.  The
misalignment is caused by the EntryType enum which the Protocol Buffers
spec forces to be a 32bit int.

Moving the order of the fields without renumbering them in the .proto file
seems to align the go structure without changing the wire format.
2016-07-08 11:13:53 -07:00
df94f58462 raft: atomic access alignment
The relevant structures are properly aligned, however, there is no comment
highlighting the need to keep it aligned as is present elsewhere in the
codebase.

Adding note to keep alignment, in line with similar comments in the codebase.
2016-07-08 11:05:41 -07:00
eded9f5f84 Merge pull request #5887 from gyuho/rate-limiting-stresser
etcd-tester: add rate limits to stresser
2016-07-08 09:32:25 -07:00
a153448b84 tools: add --user for auth in benchmarks
This commit adds --user for auth in benchmarks. Its purpose is to
measure the overhead of authentication in the v3 API. The given
user must be granted permission on the target keys before benchmarking.

Example of a case with no authentication:
% ./benchmark range k1
bench with linearizable range
 10000 / 10000 Booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00%2m10s

Summary:
  Total:        130.1850 secs.
  Slowest:      0.4071 secs.
  Fastest:      0.0064 secs.
  Average:      0.0130 secs.
  Stddev:       0.0079 secs.
  Requests/sec: 76.8138

Response time histogram:
  0.006 [1]     |
  0.046 [9990]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.087 [3]     |
  0.127 [0]     |
  0.167 [3]     |
  0.207 [2]     |
  0.247 [0]     |
  0.287 [0]     |
  0.327 [0]     |
  0.367 [0]     |
  0.407 [1]     |

Latency distribution:
  10% in 0.0076 secs.
  25% in 0.0086 secs.
  50% in 0.0113 secs.
  75% in 0.0146 secs.
  90% in 0.0209 secs.
  95% in 0.0272 secs.
  99% in 0.0344 secs.

Example of a case with authentication:
% ./benchmark --user=u1:p range k1
bench with linearizable range
 10000 / 10000 Booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00%2m11s

Summary:
  Total:        131.4923 secs.
  Slowest:      0.1637 secs.
  Fastest:      0.0065 secs.
  Average:      0.0131 secs.
  Stddev:       0.0070 secs.
  Requests/sec: 76.0501

Response time histogram:
  0.006 [1]     |
  0.022 [9075]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.038 [875]   |∎∎∎
  0.054 [36]    |
  0.069 [5]     |
  0.085 [1]     |
  0.101 [1]     |
  0.117 [0]     |
  0.132 [0]     |
  0.148 [5]     |
  0.164 [1]     |

Latency distribution:
  10% in 0.0076 secs.
  25% in 0.0087 secs.
  50% in 0.0114 secs.
  75% in 0.0150 secs.
  90% in 0.0215 secs.
  95% in 0.0272 secs.
  99% in 0.0347 secs.

It seems that the current auth mechanism does not introduce visible overhead.
2016-07-08 16:53:05 +09:00
abb20ec51f etcdserver, pkg: skip needless log entry applying
This commit lets etcdserver skip needless log entry applying. If the
result of log applying isn't required by the node (the client that
issued the request isn't talking with the node) and the operation has
no side effects, applying can be skipped.

It would help reduce disk I/O on followers and be useful for
a cluster that processes many serializable gets.
2016-07-08 15:16:45 +09:00
7c39f41e7c etcd-tester: add rate limiter to stresser 2016-07-07 21:55:12 -07:00
b970e03e19 Merge pull request #5446 from gyuho/gateway_log
tcpproxy: log proxy start
2016-07-07 21:38:01 -07:00
ce8900e3b4 Merge pull request #5899 from heyitsanthony/qos-tuning
Documentation: tuning advice for peer prioritization
2016-07-07 19:48:07 -07:00
e6d15b966c etcdserver/api/v2http, Documentation: fix debug pprof index miss / in end 2016-07-08 10:21:05 +08:00
6f0a67603a Documentation: tuning advice for peer prioritization 2016-07-07 19:14:31 -07:00
a2760c9f49 Merge pull request #5888 from heyitsanthony/v2-one-shot
client: make set/delete one shot operations
2016-07-07 16:41:01 -07:00
c30f89f1d0 client/integration: test v2 client one shot operations 2016-07-07 15:58:58 -07:00
946b3cce1d client: make set/delete one shot operations
Old behavior would retry set and delete even if there's an error. This
can lead to the client returning an error for deleting twice, instead
of returning an error for an indeterminate state.

Fixes #5832
2016-07-07 15:51:08 -07:00
4f2da16d82 Merge pull request #5897 from xiang90/lock
v3rpc: lock progress and prevKV map correctly
2016-07-07 15:20:37 -07:00
427496ebb8 v3rpc: lock progress and prevKV map correctly 2016-07-07 15:01:05 -07:00
dc2dced129 Merge pull request #5892 from heyitsanthony/auth-cheap-bcrypt
auth: cheap bcrypt for tests
2016-07-07 09:04:57 -07:00
b6a497214e Merge pull request #5883 from westhood/master
clientv3: fix sync base
2016-07-07 07:09:55 -07:00
0b0cbaac09 clientv3: use cheap bcrypt for ExampleAuth and use embedded auth api
Fixes #5783
2016-07-06 23:35:14 -07:00
d4e0e419dc auth: set bcrypt cost to minimum for test cases
DefaultCost makes auth tests 10x more expensive than MinCost.

Fixes #5851
2016-07-06 23:35:06 -07:00
16b0c1d1e1 clientv3: fix sync base
It is not correct to use WithPrefix. Range end will change in every
internal batch.
2016-07-07 12:02:53 +08:00
88a9cf2cea clientv3: add public function to get prefix range end 2016-07-07 10:35:40 +08:00
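For context on the two clientv3 commits above, a prefix range end is the smallest key that sorts after every key sharing a given prefix. The sketch below shows one common way to compute it; `prefixRangeEnd` is a hypothetical name, and returning nil for an all-0xff prefix is an assumption of this sketch rather than the behavior of the function the commit adds.

```go
package main

import "fmt"

// prefixRangeEnd returns the smallest key that is greater than every
// key beginning with prefix. It increments the last byte that is below
// 0xff and drops everything after it.
func prefixRangeEnd(prefix []byte) []byte {
	end := make([]byte, len(prefix))
	copy(end, prefix)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return end[:i+1]
		}
	}
	// Every byte is 0xff, or the prefix is empty: no finite end exists.
	// This sketch signals that with nil (an assumption).
	return nil
}

func main() {
	fmt.Printf("%q\n", prefixRangeEnd([]byte("foo")))
	fmt.Printf("%q\n", prefixRangeEnd([]byte("a\xff")))
}
```

Computing the end once from the original prefix keeps it fixed across batches, whereas recomputing it from each batch's start key would change the range on every request.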
c4a280e511 Merge pull request #5881 from goby/master
hack: fix etcd execute path in k8s example
2016-07-06 17:04:16 -07:00
244b1d7d20 tcpproxy: add start logging line 2016-07-06 14:21:26 -07:00
1c9e0a0e33 Merge pull request #5886 from heyitsanthony/health-check-str
rafthttp: make health check meaning clearer
2016-07-06 11:32:27 -07:00
4db8f018cb Merge pull request #5885 from xiang90/fix_snap_test
etcdserver: fix TestSnap
2016-07-06 11:21:13 -07:00
3a080143a7 rafthttp: make health check meaning clearer 2016-07-06 10:31:13 -07:00
3451623c71 etcdserver: fix TestSnap 2016-07-06 10:30:15 -07:00
8c4df9a96f hack: fix etcd execute path in k8s example
Change /etcd to /usr/local/bin/etcd
2016-07-06 15:07:00 +08:00
234c30c061 Merge pull request #5880 from xiang90/put_prev
add options to return prev_kv
2016-07-05 21:03:56 -07:00
7ec822107a *: add put prevkv 2016-07-05 20:45:01 -07:00
12bf1a3382 *: rename preserveKVs to prevKv 2016-07-05 20:45:01 -07:00
a78cdeae81 Merge pull request #5877 from heyitsanthony/rsa-fixtures
test: certificate fixes for fedora
2016-07-05 19:34:23 -07:00
929d6ab62c Merge pull request #5850 from xiang90/get_o_kv
*: support get-old-kv in watch
2016-07-05 16:37:24 -07:00
c853704ac9 *: support get-old-kv in watch 2016-07-05 16:17:09 -07:00
c642430fae integration: use RSA certs for testing
Some systems don't support EC due to patent issues, but the tests
should still work.

Fixes #5744
2016-07-05 13:21:21 -07:00
066afd6abd Merge pull request #5876 from gyuho/manual
Dockerfile: use 'ENTRYPOINT' instead of 'CMD'
2016-07-05 11:40:02 -07:00
f19cef960e Dockerfile: use 'ENTRYPOINT' instead of 'CMD'
use entrypoint, so people can specify flags to etcd
without providing the binary.

Signed-off-by: Secret <haichuang221@163.com>
2016-07-05 11:28:19 -07:00
beab76c7a9 Merge pull request #5872 from gyuho/build_doc
Documentation: add instruction on vendoring, build
2016-07-05 10:43:28 -07:00
ff5ddd0909 Documentation: add instruction on vendoring, build
Addressing https://github.com/coreos/etcd/issues/5857#issuecomment-230174840.
2016-07-05 09:55:44 -07:00
660f0fcc3d Merge pull request #5873 from gyuho/raft_updates
raft: fix minor grammar, remove TODO
2016-07-05 09:49:08 -07:00
8c71eb71df Merge pull request #5867 from vmatekole/master
Documentation: Example config amendment
2016-07-05 09:17:22 -07:00
c52bf1ac5d Documentation: Example config amendment 2016-07-05 16:27:47 +02:00
9e0de02fde raft: fix minor grammar, remove TODO
- test 'Term' panic cases (remove TODO)
- fix minor grammar in 'Node' godoc
2016-07-05 07:21:52 -07:00
c7dd74d8d3 Merge pull request #5869 from gyuho/raft_log_test
raft: minor updates and clean up in log.go
2016-07-04 21:51:13 -07:00
881a120453 raft: minor updates and clean up in log.go
- remove redundant test case in log_test.go
- fix test case comment ('equal or larger')
- lastnewi after matching index and term
2016-07-04 16:52:17 -07:00
b566ca225c Merge pull request #5855 from heyitsanthony/fix-windows-wal-init
wal: release wal locks before renaming directory on init
2016-07-03 19:21:23 -07:00
8d99a666f9 Merge pull request #5854 from xiang90/r_f
raft: add features section to readme file
2016-07-03 18:00:31 -07:00
c76dcc5190 raft: add features section to readme file 2016-07-03 17:59:59 -07:00
df61322e5b Merge pull request #5862 from xiang90/fix_sn
etcdserver: commit before sending snapshot
2016-07-03 15:30:20 -07:00
7cb61af245 Merge pull request #5864 from gyuho/raft_cleanup
raft: remove unnecessary reflect.DeepEqual in test
2016-07-03 14:03:08 -07:00
70bf768005 Merge pull request #5861 from xiang90/fix_watch
v3rpc: do not panic on user error for watch
2016-07-03 13:56:33 -07:00
8a8a8253fa etcdserver: commit before sending snapshot 2016-07-03 13:54:05 -07:00
9b5e99efe0 raft: remove unnecessary reflect.DeepEqual in test 2016-07-03 13:42:26 -07:00
13a4056327 v3rpc: do not panic on user error for watch 2016-07-03 08:57:48 -07:00
5991209c2d wal: release wal locks before renaming directory on init
Fixes #5852
2016-07-02 12:14:37 -07:00
7cc4596ebd Merge pull request #5849 from heyitsanthony/fix-compactor-test-races
compactor: make tests deterministic
2016-07-01 23:07:36 -07:00
9405583745 Merge pull request #5830 from heyitsanthony/functest-failpoints
functional-tester: failpoint support
2016-07-01 16:58:36 -07:00
1af7c400d1 compactor: make tests deterministic
Fixes #5847
2016-07-01 16:50:05 -07:00
a5f043c85b etcd-tester: add failpoint cases
Fixes #5754
2016-07-01 15:31:49 -07:00
c6a3048e81 Merge pull request #5848 from gyuho/cluster_version
etcdserver/api: print only major.minor version API
2016-07-01 15:19:08 -07:00
ba023e539a etcdserver/api: print only major.minor version API
Before

2016-07-01 14:57:50.927170 I | api: enabled capabilities for version 3.0.0

After

2016-07-01 14:57:50.927170 I | api: enabled capabilities for version 3.0
2016-07-01 14:58:06 -07:00
c8c5f41a01 Merge pull request #5836 from xiang90/better_d_prev
*: support return prev deleted kv
2016-07-01 14:43:33 -07:00
8d4701bb1d etcd-agent: enable GOFAIL_HTTP endpoint 2016-07-01 14:39:48 -07:00
40c4a7894d *: support return prev deleted kv 2016-07-01 14:01:48 -07:00
ab6f49dc67 Merge pull request #5844 from gyuho/go_version
*: test, docs with go1.6+
2016-07-01 11:27:51 -07:00
a53f538f27 *: test, docs with go1.6+
etcd v3 uses http/2, which doesn't work well with go1.5
2016-07-01 11:16:38 -07:00
d163aefc1a Merge pull request #5823 from davygeek/configcheck
*: fixed some warning
2016-07-01 10:28:59 -07:00
bf0ab6a2df Merge pull request #5843 from gyuho/manual
Documentation: fix typo in api_grpc_gateway.md
2016-07-01 10:21:45 -07:00
c7a0830a62 Merge pull request #5841 from heyitsanthony/fix-be-semver
etcdserver: exit on missing backend only if semver is >= 3.0.0
2016-07-01 10:14:05 -07:00
b3464a918b Documentation: fix typo in api_grpc_gateway.md 2016-07-01 10:07:14 -07:00
b7f5f8fc99 etcdserver: exit on missing backend only if semver is >= 3.0.0 2016-07-01 09:10:01 -07:00
581f847e06 Merge pull request #5829 from gyuho/ftest
etcd-tester: fix slow leader with injectLatency
2016-06-30 13:46:13 -07:00
0d44947c11 etcd-tester: fix slow leader with injectLatency 2016-06-30 13:41:27 -07:00
78b143b800 Merge pull request #5828 from gyuho/docker
release: fix Dockerfile etcd binary paths
2016-06-30 12:25:51 -07:00
a2f6ec3128 release: fix Dockerfile etcd binary paths
release script uses binary files in 'release/image-docker',
not the ones in "bin/". Tested with v3.0.0 release.
2016-06-30 11:47:33 -07:00
c68d60c99f Merge pull request #5827 from gyuho/version
*: clean up beta in docs, bump to 3.0.0+git
2016-06-30 10:03:28 -07:00
4cd834910e version: bump to v3.0.0+git 2016-06-30 09:43:10 -07:00
cb1a1426b1 *: remove beta from docs 2016-06-30 09:39:52 -07:00
04a9141e45 Merge pull request #5825 from sofuture/jeff/tls-setup-fixes
hack/tls-setup minor fixes
2016-06-30 09:29:41 -07:00
548360b140 Merge pull request #5826 from gyuho/back
Doc: fix typo in dev-guide.md
2016-06-30 09:20:07 -07:00
8ce7481a7f Doc: fix typo in dev-guide.md 2016-06-30 09:14:03 -07:00
74d75a96eb hack: install goreman in tls-setup example 2016-06-30 10:11:47 -06:00
0938c861f0 hack: add tls-setup example generated certs to gitignore 2016-06-30 10:11:28 -06:00
8c96d2573f *: fixed some warning 2016-06-30 23:13:46 +08:00
ad556b7e7d Merge pull request #5821 from davygeek/discovery
discovery: Uniform code style
2016-06-30 07:19:08 -07:00
ea0eab84a4 discovery: Uniform code style 2016-06-30 22:00:01 +08:00
5f4d1c8891 Merge pull request #5819 from gyuho/tester_fix
etcd-tester: handle error in RevHash
2016-06-29 19:36:08 -07:00
dc49016987 etcd-tester: handle error in RevHash 2016-06-29 19:31:45 -07:00
0e137e21bc Merge pull request #5817 from gyuho/ctl_consistent
ctlv3: make flags, commands formats consistent
2016-06-29 16:15:54 -07:00
9b47ca5972 ctlv3: make flags, commands formats consistent
1. Capitalize first letter
2. Remove period at the end

(followed the pattern in linux coreutil man page)
2016-06-29 15:52:06 -07:00
3b80df7f4e Merge pull request #5814 from heyitsanthony/functest-refactor
etcd-tester: refactor cluster and failure code
2016-06-29 15:25:55 -07:00
b7d0497c47 Merge pull request #5807 from xiang90/gproxy
*: initial implementation of grpc-proxy
2016-06-29 13:28:57 -07:00
150321f5ac Merge pull request #5815 from gyuho/raft_test_fix
raft: give correct offset in unstable test
2016-06-29 13:27:14 -07:00
2cc2372165 raft: give correct offset in unstable test
`unstable.entries[i] has raft log position i+unstable.offset`

So, this fixes some test cases by giving them correct
offsets.
2016-06-29 12:29:36 -07:00
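The invariant quoted in the commit above, that `unstable.entries[i]` has raft log position `i+unstable.offset`, can be sketched as follows. The `entry` and `unstable` types here are simplified, hypothetical stand-ins for the raft package's types, and `termAt` is an illustrative helper.

```go
package main

import "fmt"

// entry is a simplified stand-in for a raft log entry.
type entry struct {
	Index uint64
	Term  uint64
}

// unstable holds entries not yet written to stable storage.
// entries[i] has raft log position i+offset.
type unstable struct {
	entries []entry
	offset  uint64
}

// termAt returns the term of the entry at raft log position i, or
// false if this unstable log does not hold that position.
func (u *unstable) termAt(i uint64) (uint64, bool) {
	if i < u.offset || i >= u.offset+uint64(len(u.entries)) {
		return 0, false
	}
	return u.entries[i-u.offset].Term, true
}

func main() {
	// The offset must equal the index of the first entry, otherwise
	// lookups by log position land on the wrong element.
	u := &unstable{
		entries: []entry{{Index: 5, Term: 1}, {Index: 6, Term: 2}},
		offset:  5,
	}
	fmt.Println(u.termAt(6))
	fmt.Println(u.termAt(4))
}
```

A test fixture that sets `offset` to a value other than the first entry's index breaks this mapping, which is the kind of mismatch the commit corrects.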
6d8c647db8 *: initial implementation of grpc-proxy 2016-06-29 12:06:04 -07:00
5f459a64ce etcd-tester: refactor cluster member handling 2016-06-29 11:25:33 -07:00
402df5bd03 etcd-tester: refactor failure code to reduce code duplication 2016-06-29 11:03:34 -07:00
63f78bf7c8 etcd-tester: refactor round loop 2016-06-29 11:03:34 -07:00
66d195ff75 Merge pull request #5813 from nekto0n/encoder-pointer
rafthttp: use pointers to avoid extra copies upon message encoding
2016-06-29 10:01:27 -07:00
ff908b4ba8 Merge pull request #5812 from heyitsanthony/test-merge-base
test: use merge-base for commit title checking
2016-06-29 09:50:26 -07:00
dc218fb41d test: use merge-base for commit title checking
Otherwise, will compare branch with forked master against upstream master.
2016-06-29 09:28:53 -07:00
fd5bc21522 rafthttp: use pointers to avoid extra copies upon message encoding 2016-06-29 21:17:18 +05:00
7f3b2e23a4 Merge pull request #5811 from davygeek/golintnotice
client: follow golint notice change errors.New to fmt.Errorf
2016-06-29 09:12:49 -07:00
e020b2a228 raft: make leader transferring workable when quorum check is on 2016-06-29 18:24:58 +08:00
8e9097d0c0 Merge pull request #5748 from mitake/auth-disable
disabling auth in v3 API
2016-06-28 22:32:44 -07:00
3b91648070 client: follow golint notice change errors.New to fmt.Errorf and use 'var eps []string' instead of 'var make([]string, 0)' 2016-06-29 13:21:49 +08:00
a4667cb863 Merge pull request #5700 from mqliang/proxy-compact
proxy: implement compaction function
2016-06-28 20:49:01 -07:00
2e2f405b1e proxy: replace c with client to improve readability 2016-06-29 11:30:03 +08:00
f28a87d835 proxy: implement compaction 2016-06-29 11:28:10 +08:00
83d9ce3d7c Merge pull request #5803 from gyuho/manual
raft: simplify truncateAndAppend
2016-06-28 20:07:35 -07:00
8f8ff4d519 Merge pull request #5805 from gyuho/ftest
etcd-tester: match ErrTimeout in stresser
2016-06-28 19:51:21 -07:00
745e1e2cf9 e2e: enhance the test case of auth disabling 2016-06-29 11:31:42 +09:00
15f2fd0726 etcd-tester: match ErrTimeout in stresser
Fix https://github.com/coreos/etcd/issues/5804.
2016-06-28 19:20:28 -07:00
df31eab136 raft: simplify truncateAndAppend
truncateAndAppend does not need to subtract one from the value of 'after'
2016-06-28 18:53:12 -07:00
66107b8653 auth: invalidate every token in disabling auth 2016-06-29 10:31:46 +09:00
8e825de35f Merge pull request #5513 from vikstrous/clustererror
improve error message for ClusterError
2016-06-28 18:15:35 -07:00
8216fdc59f Merge pull request #5799 from xiang90/grpcnaming
clientv3: add grpc naming resolver
2016-06-28 17:51:23 -07:00
4f57bb313f clientv3: add grpc naming resolver 2016-06-28 17:06:58 -07:00
1b2f025414 Merge pull request #5801 from heyitsanthony/fix-watch-closeerr-race
clientv3: only use return closeErr when donec is closed
2016-06-28 17:05:48 -07:00
1db4ee8c61 clientv3: only use closeErr on watch when donec is closed
Fixes #5800
2016-06-28 16:14:09 -07:00
81322b498e Merge pull request #5798 from gyuho/tests_more
Tests other projects to ensure it compiles
2016-06-28 15:22:32 -07:00
ede0b584b8 test: test builds on other projects 2016-06-28 15:03:19 -07:00
da180e0790 Merge pull request #5796 from gyuho/bench
benchmark: fix Compact request
2016-06-28 14:14:07 -07:00
bc6d7659af Merge pull request #5795 from xiang90/filter
*: support watch with filters
2016-06-28 14:07:12 -07:00
ae057ec508 benchmark: fix Compact request 2016-06-28 13:58:28 -07:00
dced92f8bd *: support watch with filters
Users can now filter events by type. The API is also extensible.
It might make sense for the proxy to filter out events based on
more expensive or customized filters.
2016-06-28 13:46:57 -07:00
5f1c763993 Merge pull request #5553 from swingbach/master
raft: implemented read-only query when quorum check is on
2016-06-28 12:47:43 -07:00
ddffdc3e37 Merge pull request #5725 from mitake/auth-not-enabled
auth, etcdserver: let Authenticate() fail if auth isn't enabled
2016-06-28 12:34:54 -07:00
ec232ec9d8 Merge pull request #5787 from heyitsanthony/compact-resp
clientv3, ctl3, clientv3/integration: add compact response to compact
2016-06-28 12:27:42 -07:00
38035c8c13 Merge pull request #5794 from xiang90/fix_c
mvcc: do not hash consistent index
2016-06-28 12:25:32 -07:00
ef9754910e mvcc: do not hash consistent index 2016-06-28 09:36:26 -07:00
1c25aa6c48 clientv3, ctl3, clientv3/integration: add compact response to compact 2016-06-28 09:32:31 -07:00
0cd5c658aa Merge pull request #5788 from gyuho/retry_tester
etcd-tester: match ErrTimeoutDueToLeaderFail
2016-06-27 21:16:07 -07:00
ac68f70843 etcd-tester: match ErrTimeoutDueToLeaderFail
Stressers on followers should retry when a failure is injected into
their leader.
2016-06-27 20:48:06 -07:00
0faae33ace raft: implemented read-only query when quorum check is on 2016-06-28 10:52:53 +08:00
c00e97ea49 Merge pull request #5785 from gyuho/doc_update
Documentation/upgrades: upgrade 3.0 doc
2016-06-27 15:46:53 -07:00
c8a7d281ee Documentation/upgrades: upgrade 3.0 doc 2016-06-27 15:45:44 -07:00
a5c2cd2708 Merge pull request #5784 from heyitsanthony/doc-todos
Documentation: clear out some TODOs
2016-06-27 15:20:30 -07:00
11fdf2dd18 Documentation: clear out some TODOs 2016-06-27 15:00:18 -07:00
3b300f42e9 Merge pull request #5781 from gyuho/compact_client
*: compact with physical in client side
2016-06-27 14:46:54 -07:00
b4162f8a45 Merge pull request #5782 from heyitsanthony/doc-lint
Documentation: conform to header style
2016-06-27 12:20:31 -07:00
f63e6875bd e2e: test 'physical' flag in compact cmd 2016-06-27 12:07:49 -07:00
76e2bf03b8 etcdctl: v3 compact with physical flag 2016-06-27 12:07:46 -07:00
859e336d68 clientv3: configurable physical in compact 2016-06-27 12:04:04 -07:00
35229eb2d3 Documentation: conform to header style 2016-06-27 12:00:24 -07:00
bbed3ecc8d Merge pull request #5780 from xiang90/check_i
etcdserver: check index of the kv when restarting
2016-06-27 11:44:10 -07:00
9614dc6e71 etcdserver: check index of the kv when restarting 2016-06-27 10:27:27 -07:00
8df37d53d6 auth, etcdserver: let Authenticate() fail if auth isn't enabled
A successful Authenticate() would be confusing and make troubleshooting
harder if auth isn't enabled in a cluster.
2016-06-26 22:49:23 -07:00
aab905f7cc Merge pull request #5776 from mitake/commit-title
test: more accurate checking of commit title
2016-06-27 14:48:10 +09:00
cfc171d5f7 Merge pull request #5777 from xiang90/c_be
etcdserver: refuse to restart if backend file is missing
2016-06-26 22:34:08 -07:00
891ddcba6e etcdserver: refuse to restart if backend file is missing 2016-06-26 21:16:51 -07:00
555028f3d1 test: more accurate checking of commit title 2016-06-26 21:05:47 -07:00
efcf03f0b1 Merge pull request #5773 from heyitsanthony/integration-unixsock
integration: use unix sockets for all connections
2016-06-25 16:17:20 -07:00
ae6e879812 Merge pull request #5774 from gyuho/raft_minor_fix
raft: len(entries) before Lock, use firstIndex
2016-06-25 11:13:16 -07:00
6a48961895 raft: len(entries) before Lock, use firstIndex
- To avoid unnecessary locking in case len(entries) == 0
- use firstIndex method
2016-06-24 23:50:00 -07:00
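The "len(entries) before Lock" change above can be pictured with a minimal sketch. The `entry` and `memStorage` types here are hypothetical stand-ins written for illustration, not the raft package's own types; the only point is that an empty batch returns before the mutex is touched.

```go
package main

import (
	"fmt"
	"sync"
)

// entry and memStorage are hypothetical stand-ins for illustration only.
type entry struct{ Index uint64 }

type memStorage struct {
	mu   sync.Mutex
	ents []entry
}

// Append adds entries to the store. The length check runs before Lock,
// so an empty batch never contends for the mutex.
func (s *memStorage) Append(entries []entry) {
	if len(entries) == 0 {
		return
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.ents = append(s.ents, entries...)
}

// Len reports how many entries are stored.
func (s *memStorage) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.ents)
}

func main() {
	s := &memStorage{}
	s.Append(nil) // returns early without locking
	s.Append([]entry{{Index: 1}, {Index: 2}})
	fmt.Println("stored entries:", s.Len())
}
```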
13d0ea7f54 integration: use unix domain sockets for all connections 2016-06-24 21:18:19 -07:00
bbb84ff709 discovery: use pkg/transport to create http transport 2016-06-24 21:04:39 -07:00
54d56e2531 pkg/types: accept unix and unixs schemes 2016-06-24 21:04:39 -07:00
fc1a226d15 pkg/transport: unix domain socket listener and transport 2016-06-24 21:04:31 -07:00
0c820dc7ba Merge pull request #5769 from xiang90/unstable
doc: add unstable
2016-06-24 13:38:54 -07:00
40f62ab4a5 Merge pull request #5771 from gyuho/docker
*: separate Dockerfile for quay build trigger
2016-06-24 13:26:05 -07:00
0da05a896f doc: add experimental apis 2016-06-24 13:05:53 -07:00
8a71f749d7 *: separate Dockerfile for quay build trigger
Fix https://quay.io/repository/coreos/etcd-git/build/d75d80b1-7d8d-42bd-af07-645b7da3a118.
2016-06-24 12:55:10 -07:00
3424f95b03 Merge pull request #5770 from gyuho/op_guide
*: move 'Project detail' to op-guide
2016-06-24 10:50:03 -07:00
862b3fe2be *: move 'Project detail' to op-guide 2016-06-24 10:47:12 -07:00
aeb5b3c82b Merge pull request #5766 from heyitsanthony/eschew-you
doc: remove you/your from current docs
2016-06-24 09:29:26 -07:00
e1b9ccb1d7 doc: eschew "you" for current docs 2016-06-24 09:28:12 -07:00
d284a45a4b Merge pull request #5765 from heyitsanthony/autotls-security
doc: auto-tls example in security guide
2016-06-24 09:17:38 -07:00
9bde740cf9 doc: auto-tls example in security guide 2016-06-24 09:15:46 -07:00
c1d2149a0f Merge pull request #5767 from mitake/build
build: remove needless output
2016-06-24 09:03:14 -07:00
15b267fbfd Merge pull request #5768 from gyuho/raft_comment
raft: fix comment, method name in progress
2016-06-24 08:38:02 -07:00
33f7e7583b raft: fix comment, method name to needSnapshotAbort
'maybeSnapshotAbort' does not 'unset' the pendingSnapshot.
'resetState', which is called after this method, is the one that
unsets pendingSnapshot. So this changes the method name.
2016-06-24 07:54:10 -07:00
abc1cb945b build: remove needless output
The current build script prints its own name to stdout because of the
way it checks argv[0].

$ ./build
./build

The line is a little bit mysterious so this commit removes it.
2016-06-24 13:54:53 +09:00
a7189ef073 Merge pull request #5762 from gyuho/member_auth
Documentation/demo: add member, auth example
2016-06-23 16:10:57 -07:00
78d9ae1820 Merge pull request #5763 from heyitsanthony/local-tester-fp
local-tester: support failpoints
2016-06-23 13:01:26 -07:00
9b4dc92fdc Merge pull request #5761 from xiang90/proxy_v2
*: make it clear that proxy only supports v2 api now
2016-06-23 12:35:04 -07:00
755d192ff7 *: make it clear that proxy only supports v2 api now 2016-06-23 12:06:42 -07:00
244266708b local-tester: support failpoints 2016-06-23 12:04:11 -07:00
b2a8acdf10 Documentation/demo: add member, auth example 2016-06-23 11:50:37 -07:00
9664df1b5e Merge pull request #5760 from gyuho/peer-urls
*: change ctlv3 flag peerURLs to 'peer-urls'
2016-06-23 10:12:28 -07:00
f9d250ad1b e2e: update flag to 'peer-urls' 2016-06-23 09:53:30 -07:00
fa74a0d3bb etcdctl: change peerURLs flag to 'peer-urls' 2016-06-23 09:52:25 -07:00
c949811752 Merge pull request #5758 from dannysauer/master
index is incremented in Watcher; remove double-increment
2016-06-23 07:57:21 -07:00
5247702d8d Merge pull request #5755 from nekto0n/reuse-timer
Reuse timer in backend.run.
2016-06-23 07:28:09 -07:00
a998fb4af1 etcdctl: index is incremented in Watcher; remove double-increment 2016-06-23 08:54:34 -05:00
dbc7c2cf4e backend: reuse timer in run().
Benchmarks:

```
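// Package clause added so the snippet compiles as a _test.go file;
// the package name is illustrative.
package backend

import (
	"testing"
	"time"
)

// BenchmarkTimeAfter allocates a fresh timer and channel on every iteration.
func BenchmarkTimeAfter(b *testing.B) {
	b.ReportAllocs()
	for n := 0; n < b.N; n++ {
		select {
		case <-time.After(1 * time.Millisecond):
		}
	}
}

// BenchmarkTimerReset reuses one timer and re-arms it after each fire.
func BenchmarkTimerReset(b *testing.B) {
	b.ReportAllocs()
	t := time.NewTimer(1 * time.Millisecond)
	for n := 0; n < b.N; n++ {
		select {
		case <-t.C:
		}
		t.Reset(1 * time.Millisecond)
	}
}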
```

Running reveals that each time.After loop iteration results in 3 allocs, while resetting the timer results in none:

```
BenchmarkTimeAfter-4 	    2000	   1112134 ns/op	     192 B/op	       3 allocs/op
BenchmarkTimerReset-4	    2000	   1109774 ns/op	       0 B/op	       0 allocs/op
```
2016-06-23 18:49:41 +05:00
b945a3fcc8 Merge pull request #5753 from gyuho/example
clientv3: add auth example
2016-06-22 20:27:30 -07:00
2da5bdd4df clientv3: add auth example 2016-06-22 20:06:13 -07:00
e4ab1540c8 Merge pull request #5752 from gyuho/mkdir
Make mkdir consistent
2016-06-22 16:16:38 -07:00
4a0f922a6c pkg/transport: use TouchDirAll 2016-06-22 15:57:55 -07:00
6cfc03a5f9 wal: use CreateDirAll 2016-06-22 15:57:55 -07:00
c363fd288b etcdserver: use CreateDirAll 2016-06-22 15:57:47 -07:00
5720fe812e etcdctl: use CreateDirAll 2016-06-22 15:55:56 -07:00
187faba3e0 pkg/fileutil: fix TouchDirAll, add CreateDirAll
os.MkdirAll never returns os.ErrExist.
Also add another function, CreateDirAll, to ensure the deepest
directory is empty.
2016-06-22 15:54:17 -07:00
df9a52e53f Merge pull request #5702 from gyuho/vet
*: go vet, go lint fixes
2016-06-22 14:52:34 -07:00
6fbf8be3ac Merge pull request #5751 from heyitsanthony/fail-bad-commit-msg
test: check commit titles
2016-06-22 14:03:15 -07:00
b7253992d4 test: check commit titles 2016-06-22 13:30:22 -07:00
da85108ca2 client: improve error message for ClusterError 2016-06-22 13:13:12 -07:00
c1e3601776 raftexample: fixes from go vet, go lint 2016-06-22 12:04:15 -07:00
e221699fd8 rafthttp: fix from go vet, go lint 2016-06-22 12:04:15 -07:00
725ded40f7 etcdserver: fix from go vet, go lint 2016-06-22 12:04:15 -07:00
e2138179e3 client: fix from go vet, go lint 2016-06-22 12:04:15 -07:00
6557ef7cd8 *: copy all exported members in tls.Config
Without this, go vet complains

assignment copies lock value to n: crypto/tls.Config contains sync.Once
contains sync.Mutex
2016-06-22 12:04:08 -07:00
84c416491e Merge pull request #5739 from heyitsanthony/serialize-txn
etcdserver: make serialized txns auth-aware
2016-06-22 11:49:56 -07:00
caffcb7fbb *: go vet fix in go tip 2016-06-22 11:10:59 -07:00
30cfa30490 etcdserver: make serialized txns auth-aware 2016-06-22 10:51:42 -07:00
aafb2e9430 etcdserver: add lock to authApplier so serialized requests don't race 2016-06-22 10:51:42 -07:00
27ef4baa9c Merge pull request #5749 from gyuho/manual
*: misc typos and go vet fixes
2016-06-22 10:45:02 -07:00
6480066054 *: misc typos and go vet fixes 2016-06-22 10:32:13 -07:00
8d259d3cf1 Merge pull request #5745 from xiang90/count_client
clientv3: add withCount support
2016-06-22 10:04:06 -07:00
82991074bf Merge pull request #5733 from mitake/user-detail
etcdctl: a flag for getting detailed information of a user
2016-06-22 09:26:00 -07:00
0e7690780f etcdctl: a flag for getting detailed information of a user
This commit adds a new flag --detail to the etcdctl user get command.
The flag enables printing detailed permission information for the
user, as in the example below:

$ ETCDCTL_API=3 bin/etcdctl --user root:p user get u1
User: u1
Roles: r1 r2
$ ETCDCTL_API=3 bin/etcdctl --user root:p user get u1 --detail
User: u1

Role r1
KV Read:
        [k1, k5)
KV Write:
        [k1, k5)

Role r2
KV Read:
        a
        b
        [k8, k9)
KV Write:
        a
        b
        [k8, k9)
2016-06-22 13:29:48 +09:00
6496ae005d clientv3: add withCount support 2016-06-21 21:17:35 -07:00
0b5ea3ec94 Merge pull request #5742 from xiang90/count
*: support count in range query
2016-06-21 19:42:08 -07:00
def21f11a9 *: support count in range query 2016-06-21 16:20:55 -07:00
5a6ad1ea76 Merge pull request #5738 from heyitsanthony/fp
build with failpoints
2016-06-21 15:02:45 -07:00
de68818f03 etcdserver: add some failpoints 2016-06-21 14:43:20 -07:00
7f8ffd7dbe test, build: support failpoints 2016-06-21 14:43:20 -07:00
6009e88077 test, build: make build script source-able without doing a build 2016-06-21 14:35:20 -07:00
99957e9831 Merge pull request #5736 from gyuho/cleanup
etcdctl/ctlv3: minor clean ups
2016-06-21 13:31:20 -07:00
80aa5978ca etcdctl/ctlv3: minor clean ups
- Fix typo
- Improve command ordering (elect should be below lock)
- Update migrate command description
2016-06-21 13:12:01 -07:00
c01c36bcfd Merge pull request #5735 from gyuho/auth_doc
etcdctl/ctlv3: document auth,user,role
2016-06-21 12:49:31 -07:00
e5d9ca5180 etcdctl/ctlv3: document auth,user,role 2016-06-21 12:46:42 -07:00
22bae02fe5 Merge pull request #5734 from xiang90/learning
doc: move docs to learning
2016-06-21 11:02:04 -07:00
7c12949b41 doc: move docs to learning 2016-06-21 10:49:46 -07:00
1b8e83ae60 Merge pull request #5732 from mitake/e2e-user-role-dyn-update
e2e: add test cases for updating user and role during operations
2016-06-21 09:54:18 -07:00
4106e56d91 e2e: check role revoking during operations 2016-06-21 15:52:36 +09:00
68bcbdc84e e2e: check user deletion during operations 2016-06-21 15:03:04 +09:00
d017814eaa Merge pull request #5722 from mitake/auth-v3-check-test
e2e: check runtime permission changing
2016-06-20 22:42:43 -07:00
8920e7c4d5 Merge pull request #5731 from gyuho/grpc_log
*: use capnslog for grpclog
2016-06-20 20:35:28 -07:00
a1c7a7df5e *: use capnslog for grpclog 2016-06-20 20:35:03 -07:00
6fe4d9d30a e2e: check runtime permission changing
This commit extends the test for checking runtime permission
grant/revoke.
2016-06-21 11:55:09 +09:00
6d81601df3 vendor: update capnslog 2016-06-20 19:39:15 -07:00
0cc59f3976 Merge pull request #5730 from gyuho/cli_dep
*: codegangsta/cli to urfave/cli
2016-06-20 16:54:15 -07:00
bdca594495 etcdctl/ctlv2: use latest Action interface 2016-06-20 16:34:28 -07:00
1e0ff8555e Merge pull request #5729 from xiang90/fix_bench
benchmark: fix watch bench
2016-06-20 16:02:09 -07:00
0ae9d444f9 ctlv2: use urfave/cli in ctlv2 2016-06-20 15:17:03 -07:00
c4df15ff3e vendor: codegangsta/cli to urfave/cli
For https://github.com/coreos/etcd/issues/3901.
2016-06-20 15:06:20 -07:00
ce180bbaf1 Merge pull request #5685 from heyitsanthony/multictx-watcher
clientv3: watch with arbitrary ctx values
2016-06-20 14:52:40 -07:00
d5696cb6ef Merge pull request #5712 from gyuho/curl_v3
e2e: grpc-gateway cURL tests
2016-06-20 14:48:29 -07:00
b4f0a8853b e2e: grpc-gateway cURL tests 2016-06-20 14:29:10 -07:00
1097d63ff7 clientv3/integration: test WithRequireLeader on Watch 2016-06-20 14:26:16 -07:00
2bd5d66596 benchmark: fix watch bench 2016-06-20 14:00:46 -07:00
a01f5a2786 Merge pull request #5728 from gyuho/log_dir
etcd-agent: set up directory for etcd logs
2016-06-20 13:25:25 -07:00
722f5b2a8c clientv3: watch with arbitrary ctx values
Sets up a new watch stream for every unique set of ctx values.

Fixes #5354
2016-06-20 12:44:51 -07:00
e5583b26eb Merge pull request #5711 from xiang90/client_bytes
*: add client network metrics
2016-06-20 12:03:18 -07:00
50f2f984e4 etcd-agent: set up directory for etcd logs 2016-06-20 11:32:14 -07:00
35fd81e465 *: add client network metrics 2016-06-20 11:18:06 -07:00
fb1f1ce1fd Merge pull request #5727 from xiang90/fix_watch_bench
benchmark: correctly count number of watchers
2016-06-20 11:00:18 -07:00
2a2dd1075f benchmark: correctly count number of watchers 2016-06-20 10:37:17 -07:00
729f5b45fd Merge pull request #5720 from xiang90/report_recv
*: fix pending events metrics
2016-06-20 06:44:16 -07:00
6e717775a8 Merge pull request #5723 from mitake/etcdctl-misc
etcdctl: slightly enhance output of role revoke-permission
2016-06-20 06:14:28 -07:00
0173564122 etcdctl: slightly enhance output of role revoke-permission 2016-06-20 16:57:50 +09:00
6f28b43806 *: fix pending events metrics 2016-06-19 23:00:39 -07:00
8111e0f7dc Merge pull request #5716 from ajityagaty/get_filtering
v3api: Add a flag to RangeRequest to return only the keys.
2016-06-19 14:50:15 -07:00
ad5d55dd4c v3api: Add a flag to RangeRequest to return only the keys.
Currently the user can't list only the keys in a prefix search. In
order to support such operations the filtering will be done on the
server side to reduce the encoding and network transfer costs.
2016-06-19 14:18:39 -07:00
23621387fc Merge pull request #5714 from gyuho/wal_dir
*: use fileutil.TouchDirAll
2016-06-19 12:02:46 -07:00
d37e564eaa etcdserver: use TouchDirAll 2016-06-19 11:26:52 -07:00
ce50ee14d8 Merge pull request #5710 from xiang90/rm_la
*: remove old flag support
2016-06-19 06:57:28 -07:00
eaa72dfa0b Merge pull request #5709 from gyuho/docker
update: Dockerfile, documentation
2016-06-18 19:57:34 -07:00
b03c832bed Merge pull request #5698 from gyuho/documentation
Documentation: grpc-gateway
2016-06-17 15:33:18 -07:00
3ddfa16c46 Documentation: update container.md 2016-06-17 15:22:13 -07:00
eec706b9ae etcdserverpb: generate Swagger API JSON 2016-06-17 15:19:32 -07:00
09e5db5a46 Documentation: add grpc-gateway doc 2016-06-17 15:19:28 -07:00
8ea6be38ba *: remove old flag support
Support for these legacy flags exists only because we did not want
CoreOS updates to break users.

Users will now be aware that they are switching to etcd3, so there is
no need to support 0.x flags any more.
2016-06-17 14:51:45 -07:00
c25ff426af Dockerfile: build image with alpine 2016-06-17 14:42:40 -07:00
6dcd020d7d Merge pull request #5707 from heyitsanthony/test-all
test: don't hardcode packages for testing
2016-06-17 14:08:46 -07:00
3f6619ada9 Merge pull request #5708 from xiang90/pending
*: add pending/failed proposal metrics
2016-06-17 13:48:39 -07:00
9feb3d0e51 etcd-tester: fix goword warnings 2016-06-17 13:37:35 -07:00
f7b84d69a4 etcd-agent/client: fixup godocs 2016-06-17 13:37:35 -07:00
ea21b8ee1f lessor: fix go vet, goword warnings, and unreliable test 2016-06-17 13:37:25 -07:00
016be1ef31 contrib/recipes: fix govet and goword warnings 2016-06-17 13:13:09 -07:00
598fa7a10e *: add pending/failed proposal metrics 2016-06-17 13:09:38 -07:00
aa503f84d5 Merge pull request #5705 from xiang90/metrics_peer
*: add peer prefix for network metrics between peers
2016-06-17 12:38:06 -07:00
bd8627c8ab Merge pull request #5706 from xiang90/app_metrics
etcdserver: add applied metrics
2016-06-17 12:26:23 -07:00
6af0917812 *: add peer prefix for network metrics between peers 2016-06-17 11:59:49 -07:00
57474697af etcdserver: add applied metrics 2016-06-17 11:52:50 -07:00
74b13aab61 grpcproxy: fix go vet warnings 2016-06-17 11:41:49 -07:00
6c0882145a test: don't use hardcoded package lists for testing 2016-06-17 11:41:49 -07:00
e4f56c4eb6 Merge pull request #5701 from xiang90/rm_exp
*: make auto-compaction-retention non-experimental
2016-06-17 11:02:43 -07:00
8bb0ce54e6 Merge pull request #5704 from gyuho/agent_fix
etcd-agent: fix test
2016-06-17 10:58:14 -07:00
61659302db Merge pull request #5703 from gyuho/grpc_proto
Update gRPC, gogo/protobuf
2016-06-17 10:56:38 -07:00
63c13e8b98 etcd-agent: fix test 2016-06-17 10:47:15 -07:00
63901be674 *: regenerate proto 2016-06-17 10:22:28 -07:00
d03a3d141e vendor: update gRPC dependency 2016-06-17 10:22:16 -07:00
b0d7455fb1 scripts: use latest gogo/protobuf for proto files
For https://github.com/coreos/etcd/issues/5671.
2016-06-17 10:21:18 -07:00
d68664841c *: make auto-compaction-retention non-experimental 2016-06-17 10:04:31 -07:00
3488555bc3 Merge pull request #5674 from mitake/auth-v3-get-users-roles
*: support getting all users and roles in auth v3
2016-06-17 06:51:47 -07:00
18253e2723 *: support getting all users and roles in auth v3
This commit expands the RPCs for getting a user and a role to support
listing all users and roles. etcdctl v3 now supports getting all users
and roles with the newly added option --all, e.g. etcdctl user get --all
2016-06-17 16:22:41 +09:00
cc4f35887c Merge pull request #5699 from gyuho/readme
README: more demos, links
2016-06-16 22:57:31 -07:00
dde2aea214 Documentation: add 'migrate' command example 2016-06-16 19:47:57 -07:00
1066c9b806 README: add dash, play.etcd.io, animated demo link 2016-06-16 19:47:25 -07:00
2d08e093c1 Merge pull request #5696 from xiang90/fix_panic
etcdserver: fix panic when getting header of raft request
2016-06-16 13:58:50 -07:00
adff458895 etcdserver: fix panic when getting header of raft request 2016-06-16 13:42:10 -07:00
b3558894f2 Merge pull request #5695 from gyuho/proto
*: use latest protodoc, regenerate
2016-06-16 12:35:44 -07:00
c98ca2db43 Merge pull request #5493 from mqliang/cache
proxy: cache range request in proxy
2016-06-16 12:30:13 -07:00
5f5c3c8f82 Merge pull request #5694 from xiang90/comp
etcdserver: only pause compaction when sending snapshot
2016-06-16 12:26:55 -07:00
7f9adfd5b8 Merge pull request #5692 from xiang90/fix_live
raft: make tick unblock and fix potential live lock
2016-06-16 12:26:31 -07:00
0bae7b635c *: regenerate proto, doc 2016-06-16 11:57:46 -07:00
d26d006fd6 scripts: use latest protodoc to skip grpc-gateway
protodoc now skips grpc-gateway options
2016-06-16 11:57:05 -07:00
1c6070ccc7 Merge pull request #5693 from Jiaweizdev/update-port-number-in-proxy-doc
doc: update port number in proxy doc
2016-06-16 08:59:39 -07:00
699e76b631 etcdserver: only pause compaction when sending snapshot 2016-06-16 08:57:02 -07:00
fb165fcc58 doc: update port number 2016-06-16 17:13:52 +02:00
848f539536 raft: make tick unblock and fix potential live lock 2016-06-16 08:01:06 -07:00
49266dca2d Merge pull request #5690 from xiang90/fix_s
etcdserver: save state before save snapshot
2016-06-15 22:36:30 -07:00
9c78cda088 etcdserver: save state before save snapshot 2016-06-15 22:00:33 -07:00
b07fbbf27c Merge pull request #5687 from mitake/auth-v3-txn-2
etcdserver: permission checking of Txn() in authApplierV3
2016-06-16 12:51:10 +09:00
cdf1a2ee2c etcdserver: permission checking of Txn() in authApplierV3 2016-06-15 20:10:16 -07:00
5385ca0a43 Merge pull request #5659 from heyitsanthony/bridge-more-errors
bridge: packet corruption and reordering
2016-06-15 19:23:22 -07:00
11869905ae bridge: packet corruption and reordering
With bonus bridge connection code refactor.
2016-06-15 17:08:19 -07:00
555976ea84 Merge pull request #5684 from gyuho/test
etcd-agent: SIGQUIT when cleanup
2016-06-15 16:13:20 -07:00
bc69142940 Merge pull request #5683 from xiang90/fix_refresh
store: copy old value when refresh + cas
2016-06-15 16:11:26 -07:00
bd604a029e etcd-agent: SIGQUIT when cleanup 2016-06-15 16:03:25 -07:00
df56f9d6f9 store: copy old value when refresh + cas 2016-06-15 15:32:58 -07:00
b607b36a6c Merge pull request #5648 from ingvagabund/doc-nits
docs: Clustering.md: Switch "command line" and "environment variables"
2016-06-15 15:24:20 -07:00
c505f03c62 Merge pull request #5682 from cdancy/patch-2
Documentation: add gradle-etcd-rest-plugin to libraries-and-tools.md
2016-06-15 15:02:51 -07:00
f392370f73 Documentation: add gradle-etcd-rest-plugin to libraries-and-tools.md
Add link to the gradle-etcd-rest-plugin client under the 'Gradle plugins' sub-section.

Fixes #5681
2016-06-15 17:59:50 -04:00
7d666ab8b9 Merge pull request #5677 from gyuho/minor_etcdserver_fix
etcdserver: preallocate slices
2016-06-15 13:20:06 -07:00
32d766d749 etcdserver: preallocate slice 2016-06-15 13:03:10 -07:00
b98fa063c8 Merge pull request #5672 from heyitsanthony/applier-auth-layer
auth, etcdserver: separate auth checking apply from core apply
2016-06-15 10:06:34 -07:00
16db9e68a2 auth, etcdserver: separate auth checking apply from core apply 2016-06-15 09:03:27 -07:00
5676c5cf26 proxy: serve range request from proxy cache if set serializable 2016-06-15 14:12:36 +08:00
eca38c109a vendor: add groupcache lru package 2016-06-15 14:12:36 +08:00
16d86fd4f8 Merge pull request #5669 from xiang90/proto-gw
main: add grpc-gateway support
2016-06-14 17:46:00 -07:00
7f569a163c test: go vet should only test the go code in the dir 2016-06-14 17:09:06 -07:00
252adc0caf *: update dependencies 2016-06-14 17:09:06 -07:00
5a7b7f7595 main: add grpc-gateway support
etcd can now serve HTTP JSON requests at /v3alpha/
2016-06-14 17:09:06 -07:00
a6fec46c0e Merge pull request #5652 from gyuho/version
etcdctl/*: print API version
2016-06-14 16:04:24 -07:00
1e38ab1706 etcdctl: print API version (v2, v3 separate) 2016-06-14 15:33:39 -07:00
6958334db2 Merge pull request #5662 from xiang90/auth_delete
*: support deleteRange perm checking
2016-06-13 20:13:43 -07:00
c97107cf81 Merge pull request #5660 from heyitsanthony/fix-watch-test
e2e: don't Put() after watchTest finishes
2016-06-13 19:39:30 -07:00
a571bd0271 Merge pull request #5661 from xiang90/fix_subset
auth: fix remove subset when there are equal ranges
2016-06-13 19:03:10 -07:00
c75fa6fdc9 *: support deleteRange perm checking 2016-06-13 17:49:13 -07:00
e67613830e auth: fix remove subset when there are equal ranges 2016-06-13 17:13:55 -07:00
d78ef8bc72 e2e: don't Put() after watchTest finishes
Fixes #5598
2016-06-13 16:55:02 -07:00
a26ebfb675 Merge pull request #5654 from xiang90/auth_key
auth: add key support in merge func
2016-06-13 16:53:36 -07:00
38546a9d24 auth: use bytes equal when possible 2016-06-13 16:37:21 -07:00
390c89b7f9 auth: remove the special checking case for key auth 2016-06-13 16:37:20 -07:00
9be65414eb auth: add key support in merge func 2016-06-13 16:37:20 -07:00
2a018240e7 Merge pull request #5657 from gyuho/cleanup
etcd-tester: cleanup in compact error, log level
2016-06-13 15:15:52 -07:00
84953365a2 etcd-tester: cleanup in compact error, log level 2016-06-13 14:54:53 -07:00
18851e70b6 Merge pull request #5656 from gyuho/auth_bytes
make auth key, rangeEnd typed like mvcc ([]byte)
2016-06-13 14:41:19 -07:00
5d6af0b51f etcdserver: key, rangeEnd in []byte for auth 2016-06-13 14:21:25 -07:00
e9d2eb2b54 auth: key, range in []byte type
Fix https://github.com/coreos/etcd/issues/5655.
2016-06-13 14:21:22 -07:00
70a2add2b0 Merge pull request #5650 from gyuho/wal_update
wal: use bytes.Equal, other minor updates
2016-06-13 09:05:53 -07:00
b4aa4607cb wal: use bytes.Equal, other minor updates
- Replace reflect.DeepEqual with bytes.Equal where possible
- Remove some TODOs
- Some minor simplifications
2016-06-13 01:33:53 -07:00
2e29bea8fe docs: Clustering.md: Switch command line and environment variables to reflect the order of examples right below 2016-06-13 10:23:21 +02:00
f25b3dbfc8 Merge pull request #5640 from xiang90/permcheck
auth: clean permission checking
2016-06-12 18:26:21 -07:00
667093bbd1 Merge pull request #5645 from gyuho/wal_simple
wal: simplify boolean return
2016-06-11 11:10:59 -07:00
3243795522 wal: simplify boolean return 2016-06-11 10:36:52 -07:00
4aaf7f94cf Merge pull request #5643 from hongchaodeng/doc-fix
v3 docs: ErrCompaction -> ErrCompacted
2016-06-11 01:40:58 -07:00
c11418b56c docs: v3 api, ErrCompaction -> ErrCompacted 2016-06-10 21:53:06 -07:00
bdb5a321d1 Merge pull request #5642 from gyuho/client
vendor: update grpc dependency
2016-06-10 21:09:52 -07:00
5225a4e4bc clientv3: fix client for grpc change
Fix https://github.com/coreos/etcd/issues/5638.
2016-06-10 20:40:46 -07:00
b2a531d5a3 vendor: update grpc dependency
For 59486d9c17
2016-06-10 20:40:06 -07:00
1bbe09eb3c auth: clean permission checking 2016-06-10 19:23:20 -07:00
cff5851956 Merge pull request #5639 from mitake/email
MAINTAINERS: updating email address of Hitoshi Mitake
2016-06-10 18:40:41 -07:00
6b80f0ad7e MAINTAINERS: updating email address of Hitoshi Mitake
I'm mainly using the updated email address for work.
2016-06-10 18:12:39 -07:00
ae366ba4f1 Merge pull request #5637 from xiang90/auth_clean
auth: cleanup get perm func
2016-06-10 18:12:07 -07:00
f99ff5d513 auth: cleanup get perm func 2016-06-10 16:36:51 -07:00
3eab6bef6a Merge pull request #5635 from xiang90/cl
auth: clean up range_perm_cache.go
2016-06-10 16:08:54 -07:00
c802c23e6d Merge pull request #5636 from xiang90/mt
MAINTAINERS: add Hitoshi as a maintainer of auth pkg
2016-06-10 16:07:04 -07:00
43db5515e7 MAINTAINERS: add Hitoshi as a maintainer of auth pkg 2016-06-10 15:55:57 -07:00
c6fae5d566 Merge pull request #5631 from raoofm/patch-8
Doc: Fault tolerance table
2016-06-10 15:49:36 -07:00
175c67a552 Merge pull request #5634 from gyuho/wal
wal: PrivateFileMode/DirMode as in pkg/fileutil
2016-06-10 15:41:43 -07:00
65ff76882b Merge pull request #5624 from xiang90/warn_apply
etcdserver: warn heavy apply
2016-06-10 15:28:27 -07:00
47d5257622 pkg/fileutil: expose PrivateFileMode/DirMode 2016-06-10 15:22:14 -07:00
77efe4cda9 auth: clean up range_perm_cache.go 2016-06-10 15:21:04 -07:00
4570eddc2c wal: PrivateFileMode/DirMode as in pkg/fileutil
To make it consistent with pkg/fileutil
2016-06-10 15:20:57 -07:00
3210bb8181 Merge pull request #5632 from xiang90/auth_store_cleanup
auth: cleanup store.go
2016-06-10 14:49:56 -07:00
a92ea417b4 Merge pull request #5534 from gyuho/readme
README: minor fix in README
2016-06-10 14:46:15 -07:00
64eccd519d etcdserver: warn heavy apply 2016-06-10 14:43:34 -07:00
bb6102c00c Merge pull request #5630 from xiang90/del_user
auth: add del functions for user/role
2016-06-10 14:28:36 -07:00
f8c1a50195 auth: cleanup store.go 2016-06-10 14:19:29 -07:00
2781553a9e Merge pull request #5615 from mitake/auth-v3-consistent-token
auth, etcdserver: make auth tokens consistent for all nodes
2016-06-10 14:19:21 -07:00
37ac90c419 Doc: Fault tolerance table 2016-06-10 17:12:36 -04:00
8776962008 auth: add del functions for user/role 2016-06-10 14:11:00 -07:00
ead5096fa9 auth, etcdserver: make auth tokens consistent for all nodes
Currently auth tokens are generated randomly in the replicated state
machine layer. This means an auth token generated on node A cannot be
used on node B, which is problematic for load balancing and failover.
This commit moves the token generation logic from the state machine to
the API layer (before raft) and lets all nodes share a single token.

The Raft log index is also added to a token to ensure uniqueness of
the token and to detect activation of the token in the cluster (some
nodes can receive the token before generating and installing the token
in their state machines).

This commit also lets authStore hold the simple-token state. This is
required by the unit tests: the simple-token state must be cleaned up
after each test, because a succeeding test can otherwise create a
duplicate token, which causes a panic.
2016-06-10 13:55:37 -07:00
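One way to picture a token that carries the Raft log index is sketched below. Everything here is an assumption made for illustration: the `newSimpleToken` and `tokenIndex` names, the hex encoding, and the "." separator are hypothetical and are not taken from the auth package.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// newSimpleToken is a hypothetical generator: a random prefix created
// outside the state machine, with the log index appended so tokens from
// different proposals cannot collide.
func newSimpleToken(index uint64) (string, error) {
	buf := make([]byte, 8)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf) + "." + strconv.FormatUint(index, 10), nil
}

// tokenIndex extracts the log index suffix, which a node could compare
// with its applied index to tell whether the token is installed yet.
func tokenIndex(token string) (uint64, error) {
	i := strings.LastIndex(token, ".")
	if i < 0 {
		return 0, fmt.Errorf("token %q has no index suffix", token)
	}
	return strconv.ParseUint(token[i+1:], 10, 64)
}

func main() {
	tok, err := newSimpleToken(42)
	if err != nil {
		fmt.Println("generate:", err)
		return
	}
	idx, err := tokenIndex(tok)
	fmt.Println(tok, idx, err)
}
```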
65abcc1a59 Merge pull request #5629 from xiang90/put_role
auth: cleanup
2016-06-10 13:53:34 -07:00
cf99d596f5 auth: cleanup get user and get role usage 2016-06-10 13:34:40 -07:00
0914d65c1f auth: add put role 2016-06-10 13:20:48 -07:00
e854fa1856 Merge pull request #5622 from heyitsanthony/e2e-auth-keys
e2e: auth key put test
2016-06-10 12:17:38 -07:00
cd569d640b Merge pull request #5600 from lucab/to-upstream/armored-sigs
doc: sign release artifacts in armor mode
2016-06-10 12:11:53 -07:00
aa56e47712 Merge pull request #5625 from xiang90/put_user
auth: add put_user
2016-06-10 12:10:21 -07:00
1e22137a9a e2e: test auth is respected for Puts 2016-06-10 11:43:06 -07:00
b3a0b0502c etcdserver: respect auth on serialized Range 2016-06-10 11:43:05 -07:00
ae30ab7897 auth: add put_user 2016-06-10 11:27:42 -07:00
247103c40b Merge pull request #5623 from xiang90/get_role
auth: add getRole
2016-06-10 11:17:59 -07:00
1958598a18 auth: add getRole 2016-06-10 10:59:34 -07:00
c459073c6d Merge pull request #5620 from xiang90/auth_recover
auth: implement recover
2016-06-10 10:35:03 -07:00
05f9d1b716 Merge pull request #5610 from gyuho/handle_timeout_error
etcd-tester: do not exit for compaction timeout
2016-06-10 09:47:54 -07:00
5631acdb8f etcd-tester: do not exit for compact timeout
Temporary fix for https://github.com/coreos/etcd/issues/5606.
2016-06-10 09:44:45 -07:00
ca4e78687e auth: implement recover 2016-06-10 09:37:37 -07:00
bdc7035c10 Merge pull request #5617 from liggitt/preallocation
fileutil: avoid double preallocation
2016-06-09 22:27:17 -07:00
4f7622fb9a fileutil: avoid double preallocation 2016-06-10 00:27:59 -04:00
d4ac09de0f Merge pull request #5612 from gyuho/index_bench
mvcc: add keyIndex, treeIndex Restore benchmark
2016-06-09 16:09:56 -07:00
6e32e8501a Merge pull request #5613 from xiang90/rootrole
*: add admin permission checking
2016-06-09 16:00:37 -07:00
7da1940dce Merge pull request #5607 from xiang90/raft_user
raft: add docker/swarmkit as notable raft users
2016-06-09 15:39:09 -07:00
f1c6fa48f5 *: add admin permission checking 2016-06-09 15:25:09 -07:00
6bbd8b7efb mvcc: add keyIndex benchmark test
Useful later when trying to optimize our restore operations.
2016-06-09 14:13:18 -07:00
a7c5058953 Merge pull request #5608 from heyitsanthony/clientv3-auth-opts
clientv3: use separate dialopts for auth dial
2016-06-09 12:56:59 -07:00
349eaf117a clientv3: use separate dialopts for auth dial
Needs to use a different balancer from the main client connection
because of the way grpc uses the Notify channel.
2016-06-09 10:38:57 -07:00
ab65d2b848 raft: add docker/swarmkit as notable raft users 2016-06-09 10:10:44 -07:00
78c957df41 Merge pull request #5603 from heyitsanthony/clientv3-close-keepalive
clientv3: close keepalive channel if TTL locally exceeded
2016-06-09 09:44:32 -07:00
0554ef9c39 clientv3/integration: tests for closing lease channel 2016-06-09 09:12:59 -07:00
e534532523 clientv3: close keep alive channel if no response within TTL 2016-06-09 09:12:59 -07:00
fb0df211f0 Merge pull request #5586 from xiang90/root
auth: add root user and root role
2016-06-09 00:23:45 -07:00
da2f2a5189 auth: add root user and root role 2016-06-08 19:55:08 -07:00
a548cab828 Merge pull request #5602 from gyuho/get_leader
clientv3/integration: WaitLeader to follower
2016-06-08 17:03:25 -07:00
753073198f clientv3/integration: WaitLeader to follower
Fix https://github.com/coreos/etcd/issues/5601.
2016-06-08 16:45:32 -07:00
77dee97c2f Merge pull request #5578 from mitake/auth-v3-range
auth, etcdserver: permission of range requests
2016-06-08 16:33:25 -07:00
253e313c09 *: support granting and revoking range
This commit adds a feature for granting and revoking range of keys,
not a single key.

Example:
$ ETCDCTL_API=3 bin/etcdctl role grant r1 readwrite k1 k3
Role r1 updated
$ ETCDCTL_API=3 bin/etcdctl role get r1
Role r1
KV Read:
        [a, b)
        [k1, k3)
        [k2, k4)
KV Write:
        [a, b)
        [k1, k3)
        [k2, k4)
$ ETCDCTL_API=3 bin/etcdctl --user u1:p get k1 k4
k1
v1
$ ETCDCTL_API=3 bin/etcdctl --user u1:p get k1 k5
Error:  etcdserver: permission denied
2016-06-08 14:58:25 -07:00
9dad78c68f Merge pull request #5599 from gyuho/e2e_fix
e2e: fix race in ranging test tables
2016-06-08 14:46:02 -07:00
bd5e1ea1c0 e2e: fix race in ranging test tables
Fix https://github.com/coreos/etcd/issues/5598.

Race conditions were detected while iterating over the test table
because the goroutine closure did not receive the 'puts' index
as an argument. This could cause the test to run the wrong put
operations.
2016-06-08 13:44:05 -07:00
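The fix described in bd5e1ea1c0 can be illustrated with a minimal sketch. This is not the e2e test code itself; the function name runPuts and the string table are hypothetical. The point is that the index is passed to the goroutine as an argument instead of being captured from the enclosing loop (before Go 1.22 the loop variable was shared across iterations, so a captured index could change under a running goroutine).

```go
package main

import (
	"fmt"
	"sync"
)

// runPuts is a hypothetical stand-in for a table-driven test that launches
// one goroutine per table entry. The index is passed as an argument, so each
// goroutine operates on its own entry regardless of loop-variable semantics.
func runPuts(puts []string) []string {
	results := make([]string, len(puts))
	var wg sync.WaitGroup
	for i := range puts {
		wg.Add(1)
		go func(i int) { // i is a per-goroutine copy of the table index
			defer wg.Done()
			results[i] = "put " + puts[i]
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(runPuts([]string{"k1", "k2", "k3"}))
}
```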
87d105c036 Merge pull request #5596 from heyitsanthony/wal-warn-slow-fsync
wal: warn if sync exceeds a second
2016-06-08 13:07:13 -07:00
6bb96074da auth, etcdserver: permission of range requests
Currently the auth mechanism doesn't support permissions for range
requests. It only checks for an exact match of key names, even for range
queries. This commit adds a mechanism for setting permissions on range
queries. A range query is allowed if the range of the query is [begin1,
end1), the user has permission to read [begin2, end2), and
[begin1, end1) is a subset of [begin2, end2). Range delete requests
follow the same rule.
2016-06-08 11:57:32 -07:00
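The subset rule from 6bb96074da can be sketched as follows. This is an illustration only, not the auth package's code: the keyRange type and contains method are hypothetical, keys are compared as plain strings, and only a single granted range is checked (the sketch does not attempt to merge several granted ranges).

```go
package main

import "fmt"

// keyRange is a hypothetical half-open key interval [begin, end).
type keyRange struct{ begin, end string }

// contains reports whether the requested range lies entirely inside the
// granted range, i.e. granted.begin <= req.begin and req.end <= granted.end.
func (granted keyRange) contains(req keyRange) bool {
	return granted.begin <= req.begin && req.end <= granted.end
}

func main() {
	granted := keyRange{begin: "k1", end: "k3"}
	fmt.Println(granted.contains(keyRange{begin: "k1", end: "k2"})) // inside the grant
	fmt.Println(granted.contains(keyRange{begin: "k1", end: "k5"})) // extends past the grant
}
```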
35329a1674 Merge pull request #5597 from gyuho/btree_dep
*: update google/btree dependency
2016-06-08 11:39:29 -07:00
0b7e5c70a5 *: update google/btree dependency 2016-06-08 11:23:49 -07:00
39eaa37dcf wal: warn if sync exceeds a second 2016-06-08 11:03:18 -07:00
ff2b24a8ac Merge pull request #5583 from heyitsanthony/grpc-nuke-waitstate
clientv3: use grpc balancer
2016-06-08 09:45:44 -07:00
4a13c9f9b3 clientv3: use grpc balancer 2016-06-08 09:24:13 -07:00
e551aec339 doc: sign release artifacts in armor mode
The artifact-signing steps in the release guide default to binary
signatures while producing .asc files.
This commit changes them to armored signatures, which also matches appc
requirements.

Fixes #5594
2016-06-08 17:51:54 +02:00
66a6ed63cb Merge pull request #5585 from xiang90/token_cleanup
etcdserver: make usernameFromCtx more go style
2016-06-08 08:08:58 -07:00
4d56f54898 Merge pull request #5590 from xiang90/user
auth: add getuser
2016-06-08 08:08:36 -07:00
7abc8f21eb integration: update tests for new grpc reconnection interface 2016-06-08 01:04:59 -07:00
62f8ec25c0 clientv3: use grpc reconnection logic 2016-06-08 01:04:59 -07:00
1823702cc6 integration: bridge connections to grpc server
Tests need to disconnect the network connection for the client to check
reconnection paths but closing a grpc connection closes the logical connection.
To disconnect the client, instead have a bridge between the server and
the client which can monitor and reset connections.
2016-06-08 00:34:53 -07:00
b382c2c86f vendor: update grpc 2016-06-07 22:46:43 -07:00
c6496dcff6 auth: add getuser 2016-06-07 22:43:04 -07:00
3e057129e2 Merge pull request #5588 from purpleidea/fix/test-typo
e2e: tests: fix small typo
2016-06-07 22:25:57 -07:00
0048782d97 e2e: tests: fix small typo
Found when trying to get the e2e tests to run on Fedora which they
don't because of https://github.com/kr/pty/issues/21
2016-06-08 01:14:11 -04:00
2da6fb6616 Merge pull request #5587 from gyuho/function
etcd-tester: retry for 'etcdserver: not capable'
2016-06-07 22:01:07 -07:00
350673f1f8 etcd-tester: retry for 'etcdserver: not capable'
Fix https://github.com/coreos/etcd/issues/5573.

Currently the stresser starts at the same time as the cluster.
If the stresser is launched too early, all stressers
exit with the error 'etcdserver: not capable', which
means the cluster is not ready yet. This adds additional
error checking so the stresser can retry.
2016-06-07 21:56:04 -07:00
cc1155c93b etcdserver: make usernameFromCtx more go style 2016-06-07 21:17:32 -07:00
9a14b796e0 Merge pull request #5582 from gyuho/watch_range_end
etcdctl: support watch with range_end
2016-06-07 17:08:49 -07:00
7eaf73d273 e2e: test watch command with 2 args 2016-06-07 16:52:19 -07:00
624d5eb0cb etcdctl: support range_end for watch command
Fix https://github.com/coreos/etcd/issues/5575.
2016-06-07 16:52:15 -07:00
50ef8f148c Merge pull request #5579 from gyuho/request_union
RequestOp, ResponseOp
2016-06-07 13:54:59 -07:00
1610391449 *: following changes for proto update 2016-06-07 13:33:03 -07:00
1e4d3603db clientv3,ctlv3: following changes for proto change 2016-06-07 13:32:36 -07:00
6e149e3485 etcdserver: following updates for proto change 2016-06-07 13:32:07 -07:00
ca630a0803 etcdserverpb: RequestOp, ResponseOp
Fix https://github.com/coreos/etcd/issues/5504.
2016-06-07 13:31:10 -07:00
0d1133178f Merge pull request #5574 from xiang90/auth
auth: make naming consistent
2016-06-07 11:24:29 -07:00
83ce1051ff auth: make naming consistent 2016-06-07 10:54:50 -07:00
4984d82d27 Merge pull request #5570 from heyitsanthony/rafthttp-snapshot-tests
rafthttp: snapshot testing
2016-06-06 16:02:22 -07:00
7f461b2df9 Merge pull request #5572 from heyitsanthony/fallocate-eintr-fallback
pkg/fileutil: fall back to truncate() if fallocate is interrupted
2016-06-06 15:24:42 -07:00
dc91da50b5 rafthttp: snapshot tests 2016-06-06 11:38:11 -07:00
93f114c76c snap: return errors if Message's snapshot is not entirely read 2016-06-06 11:38:11 -07:00
3aadb25c31 pkg/ioutil: exact readcloser
NewExactReadCloser wraps a ReadCloser so it returns errors if exact number
of bytes are not read.
2016-06-06 11:38:10 -07:00
5be39d2c84 wal: don't preallocate on old tail file
Code is only there to handle an edge case where the tail wasn't preallocated
already (e.g., via old etcd version or a crash). It also triggers tmpfs
corruption, so remove it.
2016-06-06 11:31:25 -07:00
9022137d2b Merge pull request #5567 from gyuho/wal_type
wal: minor fixes
2016-06-06 10:31:03 -07:00
54aac4ab7e pkg/fileutil: fall back to truncate() if fallocate is interrupted
Fixes #5558
2016-06-06 09:52:34 -07:00
008081ffb5 wal: minor fixes
- remove unnecessary type cast
- simplify modulo operations
2016-06-06 09:43:19 -07:00
c63eaf45f9 Merge pull request #5566 from ktateish/fix-single-dash
*: replace '-' with '--' for long options
2016-06-05 21:38:29 -07:00
8b75a33398 *: replace '-' with '--' for long options
Long options should use double dashes (cf. #4595),
and so should the error messages that mention them.
2016-06-06 12:25:45 +09:00
3c2a47ea64 Merge pull request #5565 from gyuho/raft_doc
raft: small fix in doc
2016-06-05 19:16:08 -07:00
843c53192a raft: small fix in doc
'MsgBeat' is an internal type to signal the leader, not the message type
that gets sent to its followers. 'MsgHeartbeat' is the type sent to followers.
2016-06-05 17:47:46 -07:00
2baca91ee2 Merge pull request #5564 from mitake/auth-v3-cleaning
cleaning auth v3
2016-06-05 08:40:06 -07:00
94f22e8a07 *: rename RPCs and structs related to revoking
This commit renames RPCs and structs related to revoking.
1. UserRevoke -> UserRevokeRole
2. RoleRevoke -> RoleRevokePermission
2016-06-05 16:57:23 +09:00
60fc1e4d4e auth, etcdserver: error codes for revoking non existing role and permission
This commit adds error codes for representing revoking non existing
role (from user) and permission (from role).
2016-06-05 16:41:10 +09:00
8bebd8caa9 Merge pull request #5559 from gyuho/docker_guide
Documentation: add docker guide for v3
2016-06-04 19:14:48 -07:00
2f00b1e071 Documentation: add docker guide for v3 2016-06-04 16:43:44 -07:00
429b2eee58 Merge pull request #5548 from mitake/auth-v3-revoke-delete
revoke user, revoke role, and delete role in auth v3
2016-06-03 21:44:37 -07:00
c7a1423d45 *: support deleting a role in auth v3
This commit implements RoleDelete() RPC for supporting deleting a role
in auth v3. It also adds a new subcommand "role delete" to etcdctl.
2016-06-04 13:42:45 +09:00
0cb1343109 *: support revoking a key from a role in auth v3
This commit implements RoleRevoke() RPC for supporting revoking a key
from a role in auth v3. It also adds a new subcommand "role revoke" to
etcdctl.
2016-06-04 13:42:45 +09:00
957b07c408 *: support revoking a role from a user in auth v3
This commit implements UserRevoke() RPC for supporting revoking a role
from a user in auth v3. It also adds a new subcommand "user revoke" to
etcdctl.
2016-06-04 13:39:26 +09:00
3f1af453b9 Merge pull request #5560 from gyuho/lease_test
clientv3/integration: test lease closed connection
2016-06-03 18:23:03 -07:00
0cb4dd4331 clientv3/integration: test lease closed connection
Tests if lease operations return ErrConnClosed when
the client is closed.
2016-06-03 16:41:32 -07:00
6a35833fc3 Merge pull request #5450 from luxas/more_arches
travis: Catch compilation errors in CI for arm and ppc64le
2016-06-03 16:12:28 -07:00
c093234e3a Merge pull request #5557 from heyitsanthony/fix-watcher-cancel
mvcc: don't cancel watcher if stream is already closed
2016-06-03 11:28:54 -07:00
88afb0b0a6 Merge pull request #5543 from heyitsanthony/clientv3-unblock-reconnect
clientv3: don't hold client lock while dialing
2016-06-03 11:28:44 -07:00
6187d812da Merge pull request #5556 from xiang90/r_test
raft: fix TestNodeStepUnblock
2016-06-03 11:13:43 -07:00
f57b4eb46d mvcc: don't cancel watcher if stream is already closed
Close() already cancels all the watchers but doesn't bother to clear out
the bookkeeping maps so Cancel() may try to cancel twice.

Fixes #5533
2016-06-03 11:12:46 -07:00
7dfe7db243 clientv3: panic if ActiveConnection tries to return non-nil connection 2016-06-03 10:25:20 -07:00
267d1cb16f clientv3: fix watch to reconnect on failure
It was spinning before.
2016-06-03 10:25:20 -07:00
5f5a203e27 clientv3: don't hold client lock while dialing
Causes async reconnect to block while the client is dialing.

This was also causing problems with the Close error message, so
now Close() will return the last dial error (if any) instead of
clearing it out with a cancel().

Fixes #5416
2016-06-03 10:25:20 -07:00
500296d0fb raft: fix TestNodeStepUnblock
The test cases have side effects, so we need to stop testing as soon as one
of them fails. The timeout should also be much longer to avoid false positives.
2016-06-03 10:22:11 -07:00
948dc5e425 Merge pull request #5552 from ktateish/fix-wrong-link
Fix wrong links
2016-06-03 10:06:13 -07:00
634b9584ef Merge pull request #5555 from xiang90/fix_rm
rafthttp: report error to correct chan
2016-06-03 09:48:43 -07:00
5183631f17 rafthttp: report error to correct chan 2016-06-03 09:18:02 -07:00
95fc21e38b travis: Catch compilation errors in CI for arm and ppc64le 2016-06-03 18:46:36 +03:00
5bff4d85d6 Doc: fix links using url for internal doc 2016-06-03 22:26:01 +09:00
9585daf0a9 Doc: fix wrong links and remove unused or duplicate ones 2016-06-03 22:23:57 +09:00
b3fee0abff Merge pull request #5539 from mitake/auth-v3-get-role
*: support getting role in auth v3
2016-06-02 21:48:45 -07:00
10ee69b44c *: support getting role in auth v3
This commit implements the RoleGet() RPC of etcdserver and adds a new
subcommand "role get" to etcdctl v3. It lists the permissions that
are granted to a given role.

$ ETCDCTL_API=3 bin/etcdctl role get r1
Role r1
KV Read:
        b
        d
KV Write:
        a
        c
        d
2016-06-03 13:03:54 +09:00
755567cb3d Merge pull request #5547 from xiang90/int
integration: always return active client
2016-06-02 15:52:38 -07:00
bbfe7f401f integration: always return active client
In the integration test, we sometimes stop/restart an etcd server.
Now our client has internal connection monitoring logic that might
set conn to nil when there is a connection failure and the redial
also fails.

Change randClient to always return a client with an active connection
to make the integration tests reliable.
2016-06-02 14:49:32 -07:00
85691dbbe5 Merge pull request #5546 from raoofm/patch-6
Doc: fix link for migrate command in v2-migration
2016-06-02 14:21:36 -07:00
6ac67ecd5c Doc: fix link for migrate command in v2-migration
2016-06-02 17:19:43 -04:00
6d96dd581a Merge pull request #5545 from heyitsanthony/revert-more-i64
Revert "etcdserverpb: make RangeResponse.More an int64"
2016-06-02 14:09:31 -07:00
84a487f723 Revert "etcdserverpb: make RangeResponse.More an int64"
This reverts commit 84e1ab8765.
2016-06-02 13:43:40 -07:00
3005f2717f Merge pull request #5541 from xiang90/tls
transport: require tls12
2016-06-02 10:11:57 -07:00
8b28c647ea transport: require tls12 2016-06-02 09:38:56 -07:00
51a048e6b3 Merge pull request #5540 from xiang90/fix_snap
snap: fix write snap
2016-06-02 09:12:50 -07:00
2b77e9a086 Merge pull request #5538 from rustyrobot/fix-header-formatting
doc: fix header formatting
2016-06-02 07:58:27 -07:00
ab0ccdc4df snap: fix write snap
Do not use writeFile since it does not sync the file before closing.
This can lead to silent file corruption when the disk is full.
2016-06-02 07:38:48 -07:00
9098f27745 doc: fix header formatting 2016-06-02 16:15:08 +03:00
ab3398f7fd README: minor fix in README 2016-06-01 23:33:59 -07:00
29d2caf14a Merge pull request #5532 from xiang90/rh
rafthttp: simplify initialization funcs
2016-06-01 22:31:19 -07:00
a047aa4a81 rafthttp: rename to to peerID 2016-06-01 22:12:47 -07:00
c25c00fcf9 rafthttp: simplify initialization funcs 2016-06-01 21:47:46 -07:00
2fcac66605 Merge pull request #5530 from gyuho/build_script
scripts: include v2 README in the release
2016-06-01 20:59:33 -07:00
140e2a18fb Merge pull request #5492 from mitake/auth-v3-user-get
*: support getting user in etcdctl v3
2016-06-01 20:27:18 -07:00
5609fdb9a8 *: support getting user in etcdctl v3
This commit adds a new subcommand "user get" to etcdctl v3. It
lists the roles that are granted to a given user.

Example:
$ ETCDCTL_API=3 bin/etcdctl user get u1
User: u1
Roles: r1 r2 r3

This commit also modifies the layout of InternalRaftRequest for
frequent update of auth related members.
2016-06-02 12:10:19 +09:00
b95c5b7da9 Merge pull request #5526 from heyitsanthony/more-to-int64
etcdserverpb: make RangeResponse.More an int64
2016-06-01 20:03:15 -07:00
232c1914d2 scripts: include v2 README in the release 2016-06-01 19:12:34 -07:00
84e1ab8765 etcdserverpb: make RangeResponse.More an int64 2016-06-01 17:10:23 -07:00
9fee7732f6 Merge pull request #5468 from swingbach/master
implemented leader lease when quorum check is on.
2016-06-01 16:10:41 -07:00
337ef64ed5 raft: implemented leader lease when quorum check is on 2016-06-02 06:17:27 +08:00
fb64c8ccfe Merge pull request #5521 from heyitsanthony/clientv3-hide-retrydial
clientv3: hide retry dial api
2016-06-01 13:00:02 -07:00
bea4268a0b Merge pull request #5520 from gyuho/grpc_dep
vendor: update grpc dependency
2016-06-01 11:43:23 -07:00
c451a1b350 Merge pull request #5519 from gyuho/etcdctlv3_README
etcdctl: v3 as default README
2016-06-01 11:41:17 -07:00
240757729c etcdctl: make v3 as default README 2016-06-01 11:36:21 -07:00
22744566f4 clientv3: hide retry dial api 2016-06-01 11:36:16 -07:00
542b7dff64 vendor: update grpc dependency 2016-06-01 11:24:03 -07:00
a6144bdf3e Merge pull request #5507 from xiang90/failure
doc: add failures guide
2016-06-01 11:07:22 -07:00
fc33fd1aa6 doc: add failures guide 2016-06-01 11:06:44 -07:00
47ef5f7ca5 Merge pull request #5510 from gyuho/clientv3_fix
clientv3: watch resp with error when client close
2016-06-01 11:01:30 -07:00
75dc10574a clientv3: watch resp with error when client close 2016-06-01 10:39:48 -07:00
9ed3b446ca Merge pull request #5509 from heyitsanthony/clientv3-fix-concurrent-close
clientv3: fix deadlock on Get with concurrent Close
2016-06-01 07:37:28 -07:00
36fcc9e9d4 Merge pull request #5515 from xiang90/logging
*: more logging on critical state change
2016-06-01 07:04:36 -07:00
a83051d0fc clientv3: don't panic on Get if NewKV is created with a closed client 2016-06-01 05:53:21 -07:00
1d88130522 clientv3: fix deadlock on Get with concurrent Close 2016-06-01 05:53:21 -07:00
5cb7400cee Merge pull request #5508 from heyitsanthony/bench-stm-lock
concurrency, benchmark: additional stm support
2016-06-01 05:48:50 -07:00
8528c8c599 *: more logging on critical state change
Add more logging for better debugging purpose.
2016-05-31 23:31:03 -07:00
fc06dd1452 Merge pull request #5480 from heyitsanthony/fix-migrate-nov2
etcdctl: improve error message on migration without v2 keys
2016-05-31 15:18:56 -07:00
2d4c7d6886 Merge pull request #5506 from xiang90/r_rafthttp
rafthttp: simplify streamReader initialization
2016-05-31 15:00:52 -07:00
51551abef5 concurrency, benchmark: read-committed STM isolation policy 2016-05-31 14:35:27 -07:00
f34a9350c3 benchmark: benchmark stm workload with distributed mutex 2016-05-31 14:35:27 -07:00
bb2a3ea8d8 benchmark: respect stm isolation mode flag 2016-05-31 14:35:27 -07:00
7709cd84bb Merge pull request #5505 from heyitsanthony/v3rpc-watcher-close
v3rpc: fix race on ctrl channel when watcher stream closes
2016-05-31 14:24:10 -07:00
cc837dfc6d Merge pull request #5503 from gyuho/fix_clientv3
clientv3: handle nil connection after *Client.Close (KV)
2016-05-31 12:38:20 -07:00
86269ab5bf rafthttp: simplify streamReader initialization 2016-05-31 12:13:37 -07:00
7b5657cf1a clientv3: check if KV.Client is closed
For https://github.com/coreos/etcd/issues/5495.
2016-05-31 12:00:19 -07:00
d116c116fe clientv3: getRemote comment about release 2016-05-31 12:00:19 -07:00
b0d4a0a9bd integration: skip closed client in Terminate 2016-05-31 12:00:15 -07:00
283318d547 v3rpc: add ErrConnClosed for closed client
For https://github.com/coreos/etcd/issues/5495.
2016-05-31 11:15:01 -07:00
09e8f5782e v3rpc: fix race on closing watcher stream ctrl channel
Sometimes close would race with the recvLoop, leading the
recvLoop to write to a closed channel.
2016-05-31 11:07:31 -07:00
41d3cea9b3 integration: test closing stream while creating watchers 2016-05-31 11:02:15 -07:00
310ebdd3e1 Merge pull request #5498 from heyitsanthony/wal-tmpfile-fixes
wal: improve tmp file handling
2016-05-31 11:01:29 -07:00
e39f436728 Merge pull request #5494 from xiang90/refactor_rafthttp
rafthttp: remove the newPipeline func
2016-05-31 09:35:47 -07:00
eb9b281741 Merge pull request #5502 from jonboulle/master
MAINTAINERS: remove extraneous space
2016-05-31 07:07:10 -07:00
05cc3c3dbb wal: limit number of tmp file names
This fixes a space leak if the etcd server is restarted in shorter and shorter
intervals causing the tmp files to stack up.
2016-05-31 06:25:23 -07:00
71a9d6fc8b wal: don't warn when opening wal directory with stale tmp files 2016-05-31 06:25:23 -07:00
6686833e51 e2e: check for empty string as etcdctl backup result
Was checking for an ignored wal file warning. Added support for
TMPDIR since repeated runs were failing on left over test data.
2016-05-31 06:25:23 -07:00
ad95ceea2f MAINTAINERS: remove extraneous space 2016-05-31 12:11:53 +02:00
6f8cc58214 Merge pull request #5490 from mitake/errcode
etcdserver, auth: not return grpc error code directly in the apply phase
2016-05-30 22:00:54 -07:00
cc2e0fad3e Merge pull request #5497 from purpleidea/feat/doc-clarify
docs: fix ordering of sentence so it's logical and more clear
2016-05-30 21:52:04 -07:00
2cd3a3bd59 etcdctl: improve error message on migration without v2 keys
Fixes #5478
2016-05-30 19:14:04 -07:00
9c767cbf98 Merge pull request #5464 from heyitsanthony/fix-victim-watchers
mvcc: tighten up watcher cancelation and revision handling
2016-05-30 20:09:39 -06:00
4aab13ac06 docs: fix ordering of sentence so it's logical and more clear 2016-05-30 22:07:31 -04:00
5144318af0 etcdserver, auth: not return grpc error code directly in the apply phase
Current permission checking mechanism doesn't return its error code
well. The internal error (code = 13) is returned to client and the
retry mechanism doesn't work well. This commit fixes the problem.
2016-05-31 11:04:34 +09:00
ba68d7bbe6 rafthttp: make newRemote simpler 2016-05-30 16:24:26 -07:00
efe0ee7e59 rafthttp: remove the newPipeline func
Using a struct to initialize pipeline is better when we have many
fields to fill in.
2016-05-30 16:19:50 -07:00
815bc5307f Merge pull request #5489 from linuxcer/master
etcdserver: fix typo in server.go
2016-05-30 15:20:02 -07:00
29cc568659 etcdserver: fix typo in server.go 2016-05-31 05:54:30 +08:00
4e5c24fcf9 Merge pull request #5487 from gyuho/mvcc_proto
mvcc: delete EXPIRE event type
2016-05-29 18:22:15 -07:00
c43e59338f etcdctl/ctlv3: remove mvccpb.EXPIRE in mirror cmd 2016-05-29 15:11:29 -07:00
3266c809e4 mvcc: delete EXPIRE event type
Addressing https://github.com/coreos/etcd/pull/5484#discussion_r65005236.
etcd v3 doesn't expire keys. It's either PUT or DELETE.
2016-05-29 14:54:38 -07:00
84e7fa149e Merge pull request #5439 from mitake/auth-v3-permcheck
do permission check in raft log apply phase
2016-05-28 19:05:31 -07:00
0184288479 Merge pull request #5419 from xiang90/raft_doc
raft: initial readme
2016-05-28 18:38:10 -07:00
5b2e130f09 raft: initial readme 2016-05-28 18:37:21 -07:00
a86ae1d969 Merge pull request #5483 from gyuho/client_typo
clientv3: fix panic message in OpPut
2016-05-28 12:11:56 -07:00
9a0fe2620e clientv3: fix panic message in OpPut 2016-05-28 11:55:28 -07:00
8e821cdc70 *: do permission check in raft log apply phase
This commit lets etcdserver check permission during its log applying
phase. With this change, permission checking of operations is
supported.

Currently, put and range are supported. In addition, multi key
permission check of range isn't supported yet.
2016-05-29 00:05:48 +09:00
90e9652f70 etcdserver: return error of apply result without touching response
Current etcdserver tries to return result.resp even if result.err is
not nil. A situation where result.resp == nil and result.err != nil can
happen, and it results in an error like the one below:

18:49:57 etcd1 | interface conversion: proto.Message is nil, not *etcdserverpb.PutResponse

This commit lets the functions return result.err if it is not nil.
2016-05-29 00:05:48 +09:00
cfb3f96c2b mvcc: tighten up watcher cancelation and revision handling
Makes w.cur into w.minrev, the minimum revision for the next update, and
retries cancelation if the watcher isn't found (because it's being processed
by moveVictims).

Fixes: #5459
2016-05-27 17:19:32 -07:00
c438310634 v3rpc: make watcher wait for its send goroutine to finish 2016-05-27 16:54:26 -07:00
20fc3e968f Merge pull request #5465 from gyuho/compact1
etcd-tester: log more for compact errors
2016-05-27 16:16:04 -07:00
099dd1d1fb Merge pull request #5477 from gyuho/readme
README: fix write/sec number
2016-05-27 15:52:27 -07:00
c13bf42ac6 README: fix write/sec number 2016-05-27 15:50:04 -07:00
0313484f17 Merge pull request #5476 from gyuho/latency_dodc
Documentation: add average latency numbers
2016-05-27 15:47:26 -07:00
79fac9ee6f Documentation: add average latency numbers 2016-05-27 15:46:35 -07:00
f7fbcf8209 Merge pull request #5475 from heyitsanthony/doc-pkgs
*: add missing godoc package descriptions
2016-05-27 16:37:49 -06:00
fc7da09d67 *: add missing godoc package descriptions
Fixes #4074
2016-05-27 15:15:26 -07:00
0df5bb0002 Merge pull request #5445 from gyuho/performance_doc
Documentation: add benchmark to performance.md
2016-05-27 15:09:03 -07:00
33daeb7464 Documentation: add benchmark to performance.md
Fix https://github.com/coreos/etcd/issues/5433.
2016-05-27 15:05:54 -07:00
d8f325dabf Merge pull request #5472 from xiang90/fix_cap
integration: move cap enabling to init
2016-05-27 11:42:07 -07:00
ac2859057a integration: move cap enabling to init 2016-05-27 11:12:07 -07:00
2d47211589 Merge pull request #5471 from xiang90/proxy_rand
httpproxy: init the rand that we use to randomize endpoints
2016-05-27 10:46:42 -07:00
c73e8fd946 httpproxy: init the rand that we use to randomize endpoints
This does not actually change anything. The endpoints are already
randomized before being fed into the proxy, but it makes the proxy safer.
2016-05-27 10:28:03 -07:00
45b872fe5d Merge pull request #5470 from dnaeon/gru
docs: add Gru to the list of projects using etcd
2016-05-27 10:18:55 -07:00
6e4fa5e773 docs: add Gru to the list of projects using etcd 2016-05-27 20:17:57 +03:00
04039eb006 etcd-tester: more logs for compact operations 2016-05-27 09:55:13 -07:00
3ed5d28e2e etcd-tester: fix, clean up multiple things (#5462)
* etcd-tester: more logging, fix typo

* etcd-tester: fix prevCompactRev scope

Fix https://github.com/coreos/etcd/issues/5440.

* etcd-tester: move utils to bottom, clean up logs

And remove stresser operation inside defrag

* etcd-tester: separate update revision call

* etcd-tester: fix cleanup when case is -1
2016-05-26 11:37:49 -07:00
6acb3d67fb Merge pull request #5448 from xiang90/fix_refrsh
etcd: fix refresh feature
2016-05-26 09:53:13 -07:00
44b59e24eb Merge pull request #5455 from heyitsanthony/clientv3-url-endpoints
clientv3: handle URL scheme when given in endpoint
2016-05-26 10:25:27 -06:00
d117684086 Merge pull request #5453 from gyuho/protobuf_etcdctlv3
etcdctl/ctlv3: protobuf write-out for member list
2016-05-25 22:39:54 -07:00
5cba7080bc etcdctl/ctlv3: protobuf write-out for member list
Fix https://github.com/coreos/etcd/issues/5297.
2016-05-25 22:23:57 -07:00
86591d64c5 etcdctl: doc member list, others protobuf output 2016-05-25 22:17:45 -07:00
d7fa07cffa Merge pull request #5456 from gyuho/tester_fix
etcd-tester: fix compact timeout
2016-05-25 18:53:07 -07:00
4c7af825c7 etcd-tester: timeout per number of compact entries
Fix https://github.com/coreos/etcd/issues/5440.
2016-05-25 18:37:13 -07:00
5ab27e99f2 Merge pull request #5454 from gyuho/document_issue_5401
etcdserverpb: document how to prefix, range query
2016-05-25 17:07:53 -07:00
9dc0782f45 clientv3: handle URL scheme when given in endpoint
Fixes #5427
2016-05-25 18:01:36 -06:00
8a718f3e56 etcdserverpb: document prefix, range query
Fix https://github.com/coreos/etcd/issues/5401.
2016-05-25 16:53:36 -07:00
53084ebead etcd: fix refresh feature
When using refresh, etcd store v2 watch is broken. Although with refresh
store should not trigger current watchers, it should still add events into
the watchhub to make a complete history. Current store fails to add the event
into the watchhub, which causes issues.
2016-05-25 13:33:31 -07:00
9ea1705563 Merge pull request #5441 from mqliang/Rlock-GET
store: use Rlock when GET
2016-05-25 11:29:36 -07:00
84ded59f08 Merge pull request #5443 from raoofm/patch-5
Doc: fix typo in v2-migration.md
2016-05-24 09:35:42 -07:00
5002114127 Doc: fix typo in v2-migration.md 2016-05-24 11:44:40 -04:00
ffd3cb78d4 store: use Rlock when GET 2016-05-24 17:13:29 +08:00
f86dc5c7f7 Merge pull request #5438 from gyuho/proxy_log
proxy/httpproxy: fix v2 proxy log header
2016-05-23 16:49:26 -07:00
340df26883 Merge pull request #5435 from xiang90/cap
api: add v3rpc capability
2016-05-23 15:50:08 -07:00
dd8a36820e proxy/httpproxy: fix v2 proxy log header
Replace all with capnslog
2016-05-23 15:45:49 -07:00
1c544c3ba5 api: add v3rpc capability 2016-05-23 14:45:08 -07:00
663db2bbf8 Merge pull request #5410 from gyuho/e2e_migrate
e2e: test migrate command
2016-05-23 14:42:51 -07:00
23b14a8c8d e2e: add migrate cmd test 2016-05-23 14:27:51 -07:00
96d06d4f2c e2e: add Restart, Start, grpcEndpoints methods 2016-05-23 14:27:48 -07:00
6a8c65cba9 Merge pull request #5436 from gyuho/v3_doc
Documentation: updates for v3 release
2016-05-23 12:29:39 -07:00
fd7685f3a1 Documentation: add clientv3 links to libraries 2016-05-23 12:01:38 -07:00
d57164d0c8 README: throughput number in v3, add Doorman
Our v3 benchmark shows etcd v3 can do 40k writes per second.
1k throughput number is for etcd v2. Also adds YouTube's doorman
to example project lists.
2016-05-23 12:00:03 -07:00
3351ea1ae2 Procfile: v3 as default 2016-05-23 11:59:23 -07:00
ad9d18faa9 Merge pull request #5411 from xiang90/m_doc
doc: add app migration doc
2016-05-23 11:56:34 -07:00
a62e4e1e3a doc: add app migration doc 2016-05-23 11:53:44 -07:00
a3a4f51d90 Merge pull request #5434 from gyuho/log_integration
integration: add logs for debugging
2016-05-23 11:52:08 -07:00
4df91ae755 Merge pull request #5424 from gyuho/slice_pre_alloc
rafthttp: replace append with pre-allocated slice
2016-05-23 11:30:07 -07:00
ddbe46543d integration: add logs for debugging 2016-05-23 11:23:41 -07:00
f20573b576 Merge pull request #5426 from gyuho/log_compaction_done
mvcc: log when compaction is done
2016-05-21 09:33:50 -07:00
bf8cf39daf mvcc: use capnslog 2016-05-20 22:31:22 -07:00
4882330fd7 Merge pull request #5417 from heyitsanthony/watcher-victims
mvcc: reuse watcher batch from notify on blocked watch channel
2016-05-20 19:59:38 -07:00
394ce5f3b8 mvcc: move blocked unsynced watchers to victim list 2016-05-20 15:56:02 -07:00
5984e46364 mvcc: move blocked sync watcher work to victim list
Instead of holding the store lock while doing a lot of work, as when syncing
unsynced watchers, the work from a blocked synced notify can be reused and
dispatched without holding the store lock for long.
2016-05-20 15:56:02 -07:00
c9264c5e65 rafthttp: replace append with pre-allocated slice 2016-05-20 15:20:55 -07:00
1226946e2d Merge pull request #5423 from purpleidea/feat/typos3
clientv3: fix typo
2016-05-20 14:45:20 -07:00
374b3ee40b clientv3: fix typo 2016-05-20 17:18:52 -04:00
4c36054610 Merge pull request #5420 from purpleidea/feat/typos2
Fix typos
2016-05-20 11:30:38 -07:00
edca3cbe44 clientv3: Fix typos
Found randomly when going through docs. HTH
2016-05-20 14:06:29 -04:00
0b34b236d6 mvcc: benchmark for synced watchers 2016-05-19 23:31:27 -07:00
751d5fa486 Merge pull request #5414 from swingbach/master
raft: fix tiny mistake of message type
2016-05-19 23:15:15 -07:00
ff9d16a2e0 raft: fix tiny mistake of message type 2016-05-20 14:04:08 +08:00
4ee60d6671 Merge pull request #5413 from mitake/test
test: remove a directory correctly
2016-05-19 21:58:14 -07:00
1727f278f2 test: remove a directory correctly
The current rm in the test script cannot remove gopath/src correctly, which
results in test failures.
2016-05-20 13:42:36 +09:00
e9f3e809a6 Merge pull request #5409 from xiang90/doc
etcdctl: add migrate command into readme
2016-05-19 16:54:10 -07:00
628a38d906 etcdctl: add migrate command into readme 2016-05-19 16:53:47 -07:00
82c6408f38 Merge pull request #5406 from gyuho/clientv3_slice
clientv3/concurrency: preallocate slice in stm
2016-05-19 14:57:19 -07:00
fa1e40c120 clientv3/concurrency: preallocate slice in stm 2016-05-19 14:42:19 -07:00
8c17674cda Merge pull request #5404 from gyuho/watch_optimize
mvcc: remove defer in watchable store
2016-05-19 14:08:37 -07:00
be4fb634a1 Merge pull request #5279 from gyuho/demo
Documentation: add animated quick demo
2016-05-19 14:03:27 -07:00
aa85cf037f mvcc: remove defer in watchable store 2016-05-19 13:51:51 -07:00
54536af135 Merge pull request #5405 from gyuho/watch_client
clientv3: preallocate watch streams slice
2016-05-19 13:21:44 -07:00
f9306fb817 clientv3: preallocate watch streams slice
To avoid slice growth when appending
2016-05-19 12:55:55 -07:00
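The preallocation idea behind the commit above can be illustrated with a generic sketch. This is not the clientv3 code; the function names collectAppend and collectPrealloc are hypothetical and exist only to contrast the two approaches.

```go
package main

import "fmt"

// collectAppend starts from a nil slice, so append may have to
// reallocate and copy the backing array several times as it grows.
func collectAppend(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		out = append(out, i)
	}
	return out
}

// collectPrealloc sizes the slice once up front, so the loop only
// writes into existing storage and never triggers growth.
func collectPrealloc(n int) []int {
	out := make([]int, n)
	for i := 0; i < n; i++ {
		out[i] = i
	}
	return out
}

func main() {
	a, b := collectAppend(1000), collectPrealloc(1000)
	fmt.Println(len(a), len(b), cap(b))
}
```

Both functions return the same contents; the preallocated version simply fixes the capacity at the known final length.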
edb11881f8 Merge pull request #5391 from xiang90/migrate
etcdctl: add migrate command
2016-05-19 12:33:11 -07:00
6f2e7875aa etcdctl: add migrate command
Migrate command accepts a datadir and an optional user-provided
transformer function that transforms v2 keys to v3 keys.

Migrate command then builds a v3 backend state based on the existing
v2 keys and the output of the transformer function.
2016-05-19 12:17:15 -07:00
61a7d3efb3 Merge pull request #5392 from gyuho/watch_bench
benchmark: fix watch command
2016-05-19 10:12:24 -07:00
9ca84e814f benchmark: fix watch command
Fix https://github.com/coreos/etcd/issues/5099.
2016-05-19 09:57:35 -07:00
8e4a83c830 Merge pull request #5400 from rkrambovitis/patch-2
doc: fix https omission in documentation.
2016-05-19 08:07:27 -07:00
38ebb6b475 doc: fix https omission in documentation.
doc: added missing (http)s to tls setup guide

This fixes a minor documentation omission, where the 1st initial-advertise-peer-url for tls setup appears to be http.

fixes documentation
2016-05-19 18:04:52 +03:00
9ea181e561 Merge pull request #5388 from swingbach/master
raft: add more assertions on dueling candidates test case
2016-05-19 06:59:35 -07:00
1e54117580 raft: add more comments for dueling candidates test case 2016-05-19 13:51:20 +08:00
c703ccab63 raft: add more assertions for dueling candidates test case 2016-05-19 13:50:14 +08:00
62b4d1cef7 Merge pull request #5394 from heyitsanthony/clientv3-no-close-conn
clientv3: don't reuse closed connection and ignore "transport is closing"
2016-05-18 15:52:21 -07:00
e4a2dcad9e clientv3/integration: ignore closing transport in TestKVPutStoppedServerAndClose
The grpc "transport is closing" error is raised when the host is unreachable;
there's no good way to avoid it for a Put.

Fixes #5343
2016-05-18 14:49:39 -07:00
782a8802c0 clientv3: avoid reusing closed connection in KV 2016-05-18 14:46:17 -07:00
26783f51b1 Documentation: add animated quick demo 2016-05-18 11:28:27 -07:00
dc073e1aa7 Merge pull request #5383 from gyuho/kvstore_byte_pool
mvcc: use buffer bytes to encode consistent index
2016-05-18 10:32:33 -07:00
77775e8e92 mvcc: preallocate bytes buffer for saveIndex 2016-05-18 10:01:57 -07:00
90498b3756 Merge pull request #5385 from gyuho/fix_backup_test
e2e: wait for member publishing after backup
2016-05-17 21:57:52 -07:00
f2b2e0761a e2e: wait for member publishing after backup 2016-05-17 21:39:04 -07:00
81b4e6d332 Merge pull request #5384 from mitake/genproto
scripts: pass -u to go get in genproto.sh
2016-05-17 20:49:36 -07:00
db9ccb75bf scripts: pass -u to go get in genproto.sh
The current genproto.sh doesn't pass the -u option to go get. This is
problematic because the script depends on a specific version of
gogoproto. It causes a build error if a repository already has
an old version of gogoproto that doesn't contain the specified commit
($SHA). This commit makes the script pass -u to go get to avoid the
error.
2016-05-18 11:38:51 +09:00
7678fc153a Merge pull request #5382 from gyuho/rafthttp_timeout
rafthttp: fix TestSendMessageWhenStreamIsBroken
2016-05-17 16:22:02 -07:00
d20cb40f4f rafthttp: fix TestSendMessageWhenStreamIsBroken
Fix https://github.com/coreos/etcd/issues/5381.

Handles the case where a slow CI takes more than 10ms.
2016-05-17 16:03:54 -07:00
ecf192556e Merge pull request #5380 from gyuho/backup_e2e_test
e2e: v2 backup test
2016-05-17 15:56:24 -07:00
06950e41b4 e2e: v2 backup test
Fix https://github.com/coreos/etcd/issues/5367.
2016-05-17 15:35:39 -07:00
fb8d12a9cd Merge pull request #5379 from heyitsanthony/fix-snapshot-close-wal
etcdserver: wait for snapshots before closing raft
2016-05-17 15:19:41 -07:00
73204e9637 etcdserver: wait for snapshots before closing raft
Fixes #5374
2016-05-17 15:04:25 -07:00
1a06f5dab5 Merge pull request #5359 from mischief/bolt-openbsd
mvcc: set bolt options to nil for non-linux systems
2016-05-17 13:32:37 -07:00
f65331b456 Merge pull request #5376 from gyuho/e2e_typo
e2e: add 'force-new-cluster' flag, fix typo
2016-05-17 13:29:58 -07:00
00a2dca619 Merge pull request #5378 from gyuho/boltdb_update
vendor: update boltdb to v1.2.1
2016-05-17 13:26:29 -07:00
86c85b88ad Merge pull request #5377 from purpleidea/bug/typos
clientv3: fix typos
2016-05-17 12:51:13 -07:00
dd8e81070a e2e: add force-new-cluster flag 2016-05-17 12:48:26 -07:00
63e6228a0b e2e: fix typo(isClientAuthTLS to isClientAutoTLS) 2016-05-17 12:47:21 -07:00
e4e4c9dc2c mvcc: set bolt options to nil for non-linux systems 2016-05-17 12:46:44 -07:00
bc5f626e56 vendor: update boltdb to v1.2.1 2016-05-17 12:42:38 -07:00
42f3b4964f clientv3: fix typos 2016-05-17 15:39:56 -04:00
0269afd643 Merge pull request #5375 from gyuho/admin_guide_typo
Documentation/v2: fix typo for updating a member
2016-05-17 11:47:09 -07:00
e2fe80393e Documentation/v2: fix typo for updating a member
Fix https://github.com/coreos/etcd/issues/5358.
2016-05-17 11:44:39 -07:00
3c78523643 Merge pull request #5373 from gyuho/table-write-out
Documentation: write-out=table for v3 commands
2016-05-17 10:46:50 -07:00
6a0148e214 Documentation: write-out=table for v3 commands 2016-05-17 10:45:18 -07:00
3c8301358c Merge pull request #5371 from gyuho/auth_doc
Documentation/v2: fix auth_api.md bug
2016-05-17 10:22:12 -07:00
21c9da1ed4 Documentation/v2: fix auth_api.md bug
role guest read and write is "/*", not "*", same with other roles.
2016-05-17 09:42:38 -07:00
7014f6861d Merge pull request #5361 from mitake/auth-v3-token-credential
RFC: *: attach auth token as a gRPC credential
2016-05-16 21:45:44 -07:00
6259318521 *: attach auth token as a gRPC credential
This commit adds the ability to attach an auth token to a gRPC
connection as a per-RPC credential.

To do this, this commit lets clientv3.Client.Dial() create a
dedicated gRPC connection for authentication. With the dedicated
connection, the client calls the Authenticate() RPC and obtains its
token. The token is attached to the main gRPC connection with
grpc.WithPerRPCCredentials().

This commit also adds a new option --username to etcdctl (v3). With
this option, etcdctl attaches its auth token to the main gRPC
connection (currently it is not used at all).
2016-05-17 13:26:12 +09:00
327b01169c Merge pull request #5353 from heyitsanthony/clientv3-throttle-reconn
clientv3: throttle reconnection rate
2016-05-16 13:41:28 -07:00
f6e5fe6877 Merge pull request #5368 from heyitsanthony/sshot-hash
v3rpc, etcdctl: snapshot integrity hash
2016-05-16 13:09:02 -07:00
798718c49b etcdctl: verify snapshot hash on restore
Fixes #4097
2016-05-16 12:08:08 -07:00
ac2e3e43bf v3rpc: add sha trailer to snapshot 2016-05-16 11:15:03 -07:00
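The two snapshot commits above describe appending a hash trailer to a snapshot and checking it on restore. A minimal sketch of that pattern follows. It assumes a SHA-256 digest appended directly after the payload; the function names appendTrailer and verifyTrailer are hypothetical and this is not the etcd implementation.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// appendTrailer returns data followed by its SHA-256 digest.
// (Assumed layout for illustration: payload || 32-byte hash.)
func appendTrailer(data []byte) []byte {
	sum := sha256.Sum256(data)
	out := make([]byte, 0, len(data)+sha256.Size)
	out = append(out, data...)
	return append(out, sum[:]...)
}

// verifyTrailer splits off the trailing digest, recomputes the hash of
// the payload, and returns the payload only if the two match.
func verifyTrailer(blob []byte) ([]byte, error) {
	if len(blob) < sha256.Size {
		return nil, errors.New("blob too short to hold a hash trailer")
	}
	data := blob[:len(blob)-sha256.Size]
	trailer := blob[len(blob)-sha256.Size:]
	sum := sha256.Sum256(data)
	if !bytes.Equal(sum[:], trailer) {
		return nil, errors.New("hash mismatch")
	}
	return data, nil
}

func main() {
	blob := appendTrailer([]byte("snapshot bytes"))
	_, err := verifyTrailer(blob)
	fmt.Println("intact:", err)

	blob[0] ^= 0xff // flip bits in the payload to simulate corruption
	_, err = verifyTrailer(blob)
	fmt.Println("corrupted:", err)
}
```

Putting the digest at the end lets a sender compute it while streaming the payload, which is presumably why a trailer is used instead of a header.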
e8101ddf09 clientv3: throttle reconnection rate
Client was reconnecting after establishing connections because the lease
and watch APIs were thrashing. Instead, wait a little before accepting
new reconnect requests.
2016-05-16 11:14:45 -07:00
3c3bb3f97c godep: add golang.org/x/time/rate 2016-05-16 11:14:45 -07:00
a663828a32 Merge pull request #5366 from xiang90/fix_restore
raft: do not panic when removing all the nodes from cluster
2016-05-16 10:45:48 -07:00
29c77dee74 Merge pull request #5298 from purpleidea/feat/newurlsmap
pkg/types: Build a urls map from a string map
2016-05-16 10:39:14 -07:00
8ffbaef502 Merge pull request #5364 from heyitsanthony/fix-election-wait
integration: fix TestElectionWait
2016-05-16 10:30:17 -07:00
e52fc2d07e Merge pull request #5363 from heyitsanthony/fix-test-wait
test: fix wait on integration tests
2016-05-16 10:28:45 -07:00
910781ef5b raft: do not panic when removing all the nodes from cluster 2016-05-16 10:04:17 -07:00
c21b885dd5 integration: fix TestElectionWait
elections are now per-session so waiting on the same election with the
same client will not block like before

Fixes #5362
2016-05-16 07:32:42 -07:00
e312bb675c test: fix wait on integration tests
Typo was causing failed tests to look like they passed on CI.
2016-05-16 06:32:38 -07:00
46481b17fc Merge pull request #5356 from xiang90/grpc-proxy
proxy: initial grpc kv service proxy
2016-05-14 12:31:06 -07:00
2d3a8541d0 Merge pull request #5355 from heyitsanthony/cluster-security-doc
doc: add TLS examples to clustering guide
2016-05-14 10:44:06 -07:00
d41ce0a97c pkg/types: Add tests for NewURLsMapFromStringMap 2016-05-14 10:48:56 -04:00
17e23769d9 pkg/types: gofmt existing code 2016-05-14 09:33:58 -04:00
029fe6bf47 pkg/types: Build a urls map from a string map
This adds a simple transformation function which is helpful when
manipulating the different etcd internal data representations.
2016-05-14 09:33:58 -04:00
ec2ac72585 proxy: initial grpc kv service proxy 2016-05-13 23:00:29 -07:00
25850e0070 doc: add TLS examples to clustering guide
Fixes #3595
2016-05-13 17:10:41 -07:00
deb21d3da5 Merge pull request #5352 from xiang90/p
integration: remove parallel testing
2016-05-13 13:24:36 -07:00
410c5cd828 Merge pull request #5351 from gyuho/allow_null_key
etcdctl/ctlv3: allow empty key
2016-05-13 12:26:59 -07:00
c7c0e1eb7a integration: remove parallel testing
We cannot do testing in parallel since leak testing will detect the goroutines
in other tests running in parallel.
2016-05-13 12:01:25 -07:00
002090daec e2e: test empty key for get command 2016-05-13 11:30:36 -07:00
3ec627d1a8 etcdctl/ctlv3: allow empty key
Fix https://github.com/coreos/etcd/issues/5323.
2016-05-13 11:29:58 -07:00
8c953499fa Merge pull request #5349 from heyitsanthony/clientv3-conc-fixups
clientv3/concurrency: ctx-izations and session leader ids
2016-05-13 10:33:55 -07:00
120020fa9c clientv3/concurrency: use session id for election keys to avoid deadlock 2016-05-13 10:07:35 -07:00
393725fe5f clientv3/concurrency: ctx-ize Leader(), Resign(), and Unlock() 2016-05-13 10:07:35 -07:00
2e93c65c96 bridge: fix command line flag handling
flag package expects flags in Argv[1:] and stops on non-flag arguments
but bridge was expecting the forwarding address in os.Argv[1]
2016-05-13 10:07:35 -07:00
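The flag package behavior mentioned in the commit above can be reproduced with a small sketch. The "-delay" flag and the parseArgs helper are hypothetical, chosen only for illustration; they are not the bridge tool's real options.

```go
package main

import (
	"flag"
	"fmt"
)

// parseArgs parses args with a private FlagSet and returns the value of
// the hypothetical -delay flag plus whatever positional arguments remain.
func parseArgs(args []string) (delay int, rest []string, err error) {
	fs := flag.NewFlagSet("bridge", flag.ContinueOnError)
	fs.IntVar(&delay, "delay", 0, "hypothetical flag for illustration")
	if err = fs.Parse(args); err != nil {
		return 0, nil, err
	}
	return delay, fs.Args(), nil
}

func main() {
	// Flags first, positional argument last: the flag is parsed.
	d, rest, _ := parseArgs([]string{"-delay", "5", "127.0.0.1:2379"})
	fmt.Println(d, rest)

	// Positional argument first: parsing stops at it, so -delay is
	// never seen as a flag and stays at its default.
	d, rest, _ = parseArgs([]string{"127.0.0.1:2379", "-delay", "5"})
	fmt.Println(d, rest)
}
```

A program that reads a positional value from os.Argv[1] while also defining flags therefore has to take the positional arguments from the parsed remainder instead.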
4d2424210f Merge pull request #5313 from xiang90/fix_raft_abort
raft: simplify leadership transfer
2016-05-13 09:26:01 -07:00
4612e2d59a Merge pull request #5340 from heyitsanthony/etcd-runner-election
etcd-runner: election mode
2016-05-12 22:53:35 -07:00
4fe91ed1e2 etcd-runner: election mode 2016-05-12 22:32:33 -07:00
215afb9b1d etcd-runner: refactor round code 2016-05-12 22:32:33 -07:00
66e5e4f298 Merge pull request #5344 from gyuho/license_authors
*: update LICENSE header
2016-05-12 21:18:35 -07:00
71e6c4b06a .header: update to 'etcd Authors' 2016-05-12 20:56:50 -07:00
ef44f71da9 *: update LICENSE header 2016-05-12 20:51:48 -07:00
c538e0f9a9 etcdctl: update LICENSE header 2016-05-12 20:51:39 -07:00
2a44b9636a auth: update LICENSE header 2016-05-12 20:51:14 -07:00
fd9e07a529 clientv3: update LICENSE header 2016-05-12 20:50:58 -07:00
9d9f02c1ee mvcc: update LICENSE header 2016-05-12 20:50:33 -07:00
3d523e34b1 tools: update LICENSE header 2016-05-12 20:50:17 -07:00
4a5befc2de wal: update LICENSE header 2016-05-12 20:50:04 -07:00
abb4cd5646 etcdserver: update LICENSE header 2016-05-12 20:49:40 -07:00
bd71a60875 rafthttp: update LICENSE header 2016-05-12 20:49:28 -07:00
fe884f8209 raft: update LICENSE header 2016-05-12 20:49:15 -07:00
8b77de4e99 pkg: update LICENSE header 2016-05-12 20:48:53 -07:00
a880e9c7cb Merge pull request #5332 from xiang90/sl
*: cancel required leader streams when member lost its leader
2016-05-12 20:24:34 -07:00
15c5259e2d Merge pull request #5328 from gyuho/require_leader
requireHasLeader client side
2016-05-12 19:53:43 -07:00
9c103dd0de *: cancel required leader streams when member lost its leader 2016-05-12 19:42:21 -07:00
68eaf4083a clientv3: WithRequireLeader 2016-05-12 19:25:42 -07:00
431c4e7b3b Merge pull request #5342 from gyuho/grpc_dep
cmd/vendor: update grpc (upstream)
2016-05-12 19:23:31 -07:00
711be0a567 cmd/vendor: update grpc (upstream) 2016-05-12 19:02:30 -07:00
f4d1501198 Merge pull request #5337 from gyuho/configurable_monitor_interval
etcdmain: gateway monitor-interval flag
2016-05-12 18:58:52 -07:00
a32aabc377 proxy/tcpproxy: add more logs 2016-05-12 17:48:36 -07:00
750273afd9 Merge pull request #5339 from gyuho/protodoc_fix
*: fix protodoc, consistent casing in api doc
2016-05-12 17:39:06 -07:00
78d46b71fa Merge pull request #5336 from heyitsanthony/fix-clientv3-failput-close-crash
clientv3: fix Close after failed Put
2016-05-12 17:32:49 -07:00
9a6daefb3e etcdmain: add retry-delay flag 2016-05-12 17:03:00 -07:00
62e5ffac13 Merge pull request #5338 from gyuho/proxy_log
httpproxy: fix capnslog log path
2016-05-12 16:58:32 -07:00
b1f95c314b *: fix protodoc, consistent casing in api doc
There was a bug in protodoc.
This changes the git SHA to use the latest protodoc
and makes the letter casing consistent with the original
Protocol Buffer files. Go capitalizes the member variables,
but the protocol buffer documentation should be the same as
the original proto files.
2016-05-12 16:23:29 -07:00
527aa1a499 clientv3: fix Close after failed Put
Was crashing on a nil connection. Reworked the shutdown path a little so
there's only one connection close site.
2016-05-12 16:16:27 -07:00
25d9169e9a httpproxy: fix capnslog log path
We changed the package path, so log paths need to be updated as well.
2016-05-12 15:56:40 -07:00
fb65d04291 Merge pull request #5329 from gyuho/typo_integration
integration: fix NewClientV3 error messages
2016-05-12 10:49:14 -07:00
78ae4b92a6 integration: fix NewClientV3 error messages 2016-05-12 10:26:27 -07:00
2e011053b1 Merge pull request #5326 from mortonfox/patch-1
README: Update link to configuration.md
2016-05-11 22:33:54 -07:00
9c05f92f2e README: Update link to configuration.md
The file, along with all other documentation files, has moved into the Documentation folder.
2016-05-12 00:57:30 -04:00
9acb7ab41c Merge pull request #5325 from heyitsanthony/fix-partial-wal-init
wal: atomically initialize wal directory
2016-05-11 18:01:04 -07:00
6fc3106e68 Merge pull request #5324 from xiang90/partitioned
*: etcd member rejects unary call with leader requirement when it does not have leader
2016-05-11 17:48:06 -07:00
17391336af wal: atomically initialize wal directory
Fixes #5270
2016-05-11 16:50:17 -07:00
19221b33cc *: etcd member rejects unary call with leader requirement when it does not have leader 2016-05-11 16:34:34 -07:00
be0c38ec2b Merge pull request #5322 from heyitsanthony/port-docs
scrub legacy ports and update tls information
2016-05-11 16:32:45 -07:00
dcb3b7aecf *: scrub legacy ports from code and scripts 2016-05-11 13:46:30 -07:00
db8f5771f1 doc: scrub legacy ports and TLS information for v3 2016-05-11 13:46:29 -07:00
b03a2f0323 Merge pull request #5318 from heyitsanthony/watcher-latency
batch watcher sync to reduce request latency
2016-05-11 12:53:20 -07:00
080272be17 mvcc: limit total watchers synced per sync
Fixes #4567
2016-05-11 11:16:43 -07:00
f5165a0149 benchmark: make number of watcher streams configurable in watch-get
Each stream uses a client goroutine and a grpc stream; the setup causes
considerable client-side latency on the first get requests.
2016-05-11 11:16:43 -07:00
2aa4dd52cc benchmark: use separate connection for get in watch-get
The watcher traffic interferes with the get latency when sharing connections.
2016-05-11 11:16:43 -07:00
ca105a1c89 Merge pull request #5319 from xiang90/fix_rafthttp_test
*: fix TestTransportErrorc
2016-05-11 11:01:43 -07:00
e90313c9c2 Merge pull request #5321 from gyuho/doc_fix
*: fix minor typos
2016-05-11 10:58:36 -07:00
3104507eb2 *: fix minor typos 2016-05-11 10:55:38 -07:00
b2eb90024f Merge pull request #5320 from gyuho/issue518
v2/README: add known bugs
2016-05-11 10:45:40 -07:00
aaefd52afa Merge pull request #5092 from xiang90/etcdlet
*: gateway initial commit
2016-05-11 10:36:02 -07:00
5023996d02 v2/README: add known bugs
For https://github.com/coreos/etcd/issues/518.
2016-05-11 10:35:41 -07:00
00b660cc53 Merge pull request #5309 from xiang90/d_metrics
*: add disk operation metrics for monitoring
2016-05-11 10:18:39 -07:00
4d0f474034 *: fix TestTransportErrorc
CI can be slow. We should just wait longer.
2016-05-11 10:09:40 -07:00
a300be92dc *: initial support for gateway
etcd gateway is a simple L4 gateway that forwards TCP connections to
the given endpoints.
2016-05-11 09:44:50 -07:00
0fb7cb8b00 *: add disk operation metrics for monitoring 2016-05-11 09:36:45 -07:00
34b0736f2c mvcc: Reduce number of allocs in watchableStore if no watchers.
When there are no watchers the number of allocations made while handling
a PUT operation can be reduced by exiting early.
2016-05-11 00:51:00 -07:00
5ddb532072 Merge pull request #5314 from gyuho/test-script
test: fix typo, clean-up print statements
2016-05-10 23:51:05 -07:00
fd7e2b20b0 test: fix typo, clean-up print statements 2016-05-10 23:05:58 -07:00
82a6de8b69 raft: simplify leadership transfer 2016-05-10 20:03:42 -07:00
62d4c6d357 Merge pull request #5312 from ajityagaty/backup
etcdctl: Add --wal-dir and --backup-wal-dir options to backup command.
2016-05-10 19:51:30 -07:00
23f9d72870 etcdctl: Add --wal-dir and --backup-wal-dir options to backup command.
If the WAL is stored in a separate directory then the backup command
would need a --wal-dir option to pick the path to the WAL directory.
The user might also want to store the backup of data and wal separately
for which --backup-wal-dir option is provided.
2016-05-10 18:38:56 -07:00
d8215c8892 Merge pull request #5310 from gyuho/timeout_v2
etcdctl/ctlv2: total-timeout for Sync
2016-05-10 15:02:33 -07:00
62a9209088 etcdctl/ctlv2: total-timeout for Sync
Fix https://github.com/coreos/etcd/issues/4897.
2016-05-10 14:20:05 -07:00
6b2d7f9412 Merge pull request #5308 from heyitsanthony/fix-init-notify
etcdmain: notify systemd when etcd is ready to accept requests
2016-05-10 13:55:06 -07:00
8c4958dd60 etcdmain: notify systemd when etcd is ready to accept requests
Fixes #5151
2016-05-10 13:36:46 -07:00
5cbd8cefc9 Merge pull request #5291 from xiang90/c_i
*: add proposalsCommitted metrics
2016-05-10 12:51:28 -07:00
ab11415d25 *: add proposalsCommitted metrics 2016-05-10 10:56:25 -07:00
dad1197c89 Merge pull request #5303 from heyitsanthony/bench-watch-unsync
benchmark: watch-get for testing unsynced watcher/get contention
2016-05-10 10:31:45 -07:00
467de8cb4f benchmark: watch-get for testing unsynced watcher/get contention 2016-05-10 10:24:40 -07:00
efcba23d21 Merge pull request #5301 from gyuho/simple_member
etcdctl/ctlv3: make 'table' printer configurable
2016-05-10 10:12:54 -07:00
3e088b3b40 etcdctl/ctlv3: make 'table' printer configurable
Fix https://github.com/coreos/etcd/issues/5296.
2016-05-10 10:02:02 -07:00
8daad8e06e Merge pull request #5305 from ajityagaty/conf_file
Doc: Add the new '--config-file' detail to configuration.md file
2016-05-10 07:58:55 -07:00
97a2ebe3a2 Doc: Add the new '--config-file' detail to configuration.md file
Add a description about the --config-file option into the
configuration.md file.
2016-05-10 07:50:37 -07:00
fa6670488d Merge pull request #5302 from xiang90/conf-file
*: move sample config file to root directory
2016-05-10 07:46:38 -07:00
4ae47ad934 Merge pull request #5294 from xiang90/r_metrics
*: simplify network metrics
2016-05-09 22:50:45 -07:00
98dbdd5fbb *: simplify network metrics 2016-05-09 22:37:12 -07:00
00398ec98d *: move sample config file to root directory 2016-05-09 21:36:09 -07:00
07c04c7c75 Merge pull request #5280 from ajityagaty/server_config_file
etcd: Configuration file for etcd server.
2016-05-09 19:52:09 -07:00
8bc5ab9f8d etcd: Configuration file for etcd server.
Added a new command line option to etcd server to read in a YAML
based configuration file. I've also added an example configuration
file with comments and a set of test cases.
2016-05-09 18:17:27 -07:00
0d43a2b7e7 Merge pull request #5295 from ajityagaty/auth_disable
auth: Adding support for "auth disable" command.
2016-05-07 23:09:37 -07:00
adc981c53d auth: Adding support for "auth disable" command.
Added support for the auth disable command in the server, added the
etcdctl command and a respective testcase.
2016-05-07 19:21:49 -07:00
53491aac0a Merge pull request #5250 from heyitsanthony/fix-wal-write-tear
wal: repair torn writes
2016-05-06 17:14:56 -07:00
cd9e6a1d4f wal: lock WAL file while repairing 2016-05-06 16:57:55 -07:00
774030e1b2 wal: repair torn writes
Fixes #5230
2016-05-06 16:54:08 -07:00
c9c2cdfeaf Merge pull request #5293 from heyitsanthony/fix-compact-cancel-crash
etcdserver: fix nil dereference in physical Compact on proposal timeout
2016-05-06 16:25:03 -07:00
824ffded12 etcdserver: fix nil dereference in physical Compact on proposal timeout
Fixes #5292
2016-05-06 15:38:18 -07:00
34fbec118a Merge pull request #5289 from xiang90/has_leader_metrics
*: add has leader metrics
2016-05-06 14:45:49 -07:00
4481016953 Merge pull request #5290 from coreos/vv
*: bump to v3.0.0-beta.0+git
2016-05-06 14:10:52 -07:00
ebaa54bf6e *: bump to v3.0.0-beta.0+git 2016-05-06 14:04:01 -07:00
824478be5f *: add has leader metrics 2016-05-06 13:59:19 -07:00
ffd1fa6f52 Merge pull request #5288 from gyuho/version_bump
*: bump to 3.0.0-beta.0
2016-05-06 13:29:19 -07:00
faca29fc3b Merge pull request #5287 from xiang90/l_metrics
*: add leader changes to metrics
2016-05-06 13:27:08 -07:00
76d073a2b5 *: add leader changes to metrics 2016-05-06 13:12:20 -07:00
74ea9ea5cd *: bump to 3.0.0-beta.0 2016-05-06 13:09:50 -07:00
d17aaae714 Merge pull request #5265 from gyuho/fix_5246
v2http: allow empty role for GET '/users'
2016-05-06 11:58:21 -07:00
3c2d0a229c v2http: allow empty role for GET /users
Fix https://github.com/coreos/etcd/issues/5246.
2016-05-06 11:39:38 -07:00
879cfe7666 Merge pull request #5278 from heyitsanthony/fix-clientv3-disconnects
clientv3: fix disconnect breakage
2016-05-05 19:53:08 -07:00
712090fc09 clientv3: keep watcher client active if reconnect has network error
Otherwise watchers created after a long disconnect period will always
close immediately.
2016-05-05 19:30:11 -07:00
22c3a439bc clientv3: do not stop lease client on lost receive stream
Fixes #5242
2016-05-05 19:30:11 -07:00
cdc8f99658 clientv3: rework reconnection logic
Avoids go routine flood for tight loops with a dead connection.
Now uses request ctx when reconnecting for immediate retry.
2016-05-05 19:30:11 -07:00
cc37632003 Merge pull request #5285 from heyitsanthony/fix-windows-sha
build: set git sha on windows builds
2016-05-05 18:39:36 -07:00
5d86525230 build: set git sha on windows builds 2016-05-05 18:18:07 -07:00
93d84b9076 Merge pull request #5284 from xiang90/perf_doc
doc: add performance.md
2016-05-05 16:02:39 -07:00
b033167094 doc: add performance.md 2016-05-05 14:58:34 -07:00
98031a3b6e Merge pull request #5249 from xiang90/metrics
*: add metrics for grpc api
2016-05-05 14:19:46 -07:00
063307ec0a *: add metrics for grpc api 2016-05-05 13:45:52 -07:00
61add11b05 Merge pull request #5259 from gyuho/functional-test
etcd-tester: refactor
2016-05-05 11:15:18 -07:00
cc7dd9b729 etcd-tester: refactor 2016-05-05 10:55:42 -07:00
3bcd2b5b9f Merge pull request #5271 from heyitsanthony/fix-rafthttp-active-race
rafthttp: fix race on peer status activeSince
2016-05-04 13:49:58 -07:00
c5af1d7a88 rafthttp: fix race on peer status activeSince 2016-05-04 11:48:16 -07:00
b24d0032d2 Merge pull request #5269 from heyitsanthony/fix-httpproxy-race
httpproxy: fix race on getting close notifier channel
2016-05-04 09:49:19 -07:00
a76f5f5ed2 httpproxy: fix race on getting close notifier channel
Fixes #5267
2016-05-04 09:32:26 -07:00
53ed8750ce Merge pull request #5266 from maciej/scala_etcd_client
libraries-and-tools.md: add Scala-based maciej/etcd-client
2016-05-04 09:15:18 -07:00
aeff5507e6 libraries-and-tools.md: add Scala-based maciej/etcd-client 2016-05-04 02:56:17 +02:00
b7761530e1 Merge pull request #5251 from heyitsanthony/fix-watch-panic
clientv3: gracefully handle watcher resume on compacted revision
2016-05-03 15:00:39 -07:00
b53aaf4c82 Merge pull request #5262 from gyuho/more_logging
*: more detailed timeout logging
2016-05-03 14:13:46 -07:00
9bf601a921 etcdserver: log timeout 2016-05-03 13:39:31 -07:00
0f5b8c39b4 Merge pull request #5263 from gyuho/autotls-flag
etcdmain: add auto-tls flag to help.go
2016-05-03 13:14:23 -07:00
56dd991b4e etcdmain: add auto-tls flag to help.go 2016-05-03 12:40:02 -07:00
864cbd36bf Merge pull request #5261 from gyuho/typo
*: typo, remove string type assertions
2016-05-03 11:29:13 -07:00
1a0d1ab4ab Merge pull request #5260 from glevand/for-merge-build
build: Simplify host detection
2016-05-03 11:28:46 -07:00
a288188001 *: typo, remove string type assertions 2016-05-03 10:59:57 -07:00
5d8d684a91 Merge pull request #5257 from gyuho/proto_fix
*: fix protodoc, re-run genproto script, typos in proto files
2016-05-03 10:46:46 -07:00
fd27f9cd28 Merge pull request #5256 from gyuho/fix_build
build: set GitSHA version in cmd directory
2016-05-03 10:13:07 -07:00
4ecb560604 build: Simplify host detection
Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-05-03 09:54:44 -07:00
8b52fd0d2d clientv3: gracefully handle watcher resume on compacted revision
Fixes #5239
2016-05-03 09:30:53 -07:00
b7639b00e0 Merge pull request #5252 from xiang90/client-tls
*: support auto tls on client side
2016-05-03 09:22:11 -07:00
c5bf6a9d9e e2e: add test for auto client tls 2016-05-03 08:35:02 -07:00
015acabdbb *: rerun genproto -g 2016-05-02 23:02:31 -07:00
6222d46233 scripts/genproto.sh: update protodoc git SHA
To use protodoc with the fix
58fed2ed06.

This correctly parses the order of values in 'directories' flag.
2016-05-02 23:00:40 -07:00
36acde620e build: set GitSHA version in cmd directory
Fix https://github.com/coreos/etcd/issues/5255.
2016-05-02 22:16:40 -07:00
1f5c5abe6d Merge pull request #5253 from xiang90/fix_raft_test
raft: fix flaky test
2016-05-02 21:33:51 -07:00
2fa5b913fe raft: fix flaky test
We recently changed the randomized election timeout from (et, 2*et-1] to
[et, 2*et-2], where et is the user-set election timeout.

So 2*et might trigger two elections instead of one. We need to fix the test
code accordingly.

Thanks to the TiKV folks for finding this issue. We probably need to randomize
the etcd/raft tests more.
2016-05-02 21:08:19 -07:00
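The reasoning in the commit above can be checked with a little arithmetic. The sketch below is not the raft test code: maxElections is a hypothetical helper, and it assumes the election timer restarts immediately after each timeout and that only the lower bound of the randomized range matters for the worst case.

```go
package main

import "fmt"

// maxElections returns the largest number of election timeouts that can
// fire within ticks when every randomized timeout is at least minTimeout.
func maxElections(ticks, minTimeout int) int {
	return ticks / minTimeout
}

func main() {
	et := 10 // hypothetical user-set election timeout, in ticks

	// Old range (et, ...]: the shortest timeout is et+1, so 2*et ticks
	// leave room for only one election.
	fmt.Println(maxElections(2*et, et+1))

	// New range [et, ...]: the shortest timeout is et, so 2*et ticks can
	// fit two elections.
	fmt.Println(maxElections(2*et, et))
}
```

With et = 10 this gives 1 for the old lower bound and 2 for the new one, which is why a test that advances 2*et ticks and expects exactly one election became flaky.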
973ad5aa7c *: support auto tls on client side 2016-05-02 16:17:49 -07:00
fee71b18a3 Merge pull request #5248 from gyuho/hash_with_revision
functional-tester: use revision from hash method
2016-05-02 15:30:26 -07:00
064c1ff0f3 etcdserver/api/v3rpc: use Revision from Hash API 2016-05-02 15:06:39 -07:00
7a6d9ea01a mvcc: Hash to return Revision 2016-05-02 15:04:24 -07:00
a8139e2b0e Merge pull request #5247 from joshix/faqhead
Documentation/v2: Add newline before heading in faq.md
2016-05-02 11:43:58 -07:00
92d673ea59 Documentation/v2: Add newline before heading in faq.md
Minor rewrite to the heading text for clarity.

Matches downstream coreos-inc/coreos-pages#648 and
coreos-inc/coreos-pages#649.
2016-05-02 11:15:18 -07:00
b9ea5f6d90 Merge pull request #5241 from claws/avoid_differences_in_gnu_and_bsd_cut
use sed instead of cut to accomodate GNU and BSD differences
2016-04-30 20:58:57 -07:00
c071104fc4 script: fix build script regression to work on OSX
Use sed instead of cut to accommodate GNU and BSD differences

Fixes: #5240
2016-05-01 13:06:07 +09:30
28f3cb0f14 Merge pull request #5171 from xiang90/runner
etcd-runner: initial commit
2016-04-30 19:39:53 -07:00
73ecb61ff4 etcd-runner: initial commit 2016-04-30 17:24:03 -07:00
262de75a7e Merge pull request #5238 from xiang90/bench_watch_put
mvcc: add benchmark for watch put and improve it
2016-04-30 10:46:38 -07:00
ad327e01d0 mvcc: add benchmark for watch put and improve it 2016-04-29 19:58:37 -07:00
b58f8dd64b Merge pull request #5237 from brian-brazil/master
Improve some debug metrics.
2016-04-29 17:53:54 -07:00
ea1d0f3e0d etcdserver: Improve some debug metrics.
The _total suffix is by convention for counters,
don't use it on a gauge. Clarify help string.
Tweak metric name so it'll sort with related metrics,
and be a little more understandable.

Remove open file descriptor metric, as Prometheus client_golang
provides that out of the box as process_open_fds which is also
more up to date. Both only support Linux, so there's no loss of
platform support.

Fixes #5229
2016-04-30 01:29:13 +01:00
c89e348fbc Merge pull request #5232 from glevand/master
Add arm64 travis builds
2016-04-29 16:41:09 -07:00
552a5af10f Merge pull request #5236 from gyuho/wait_purge
pkg/fileutil: wait up to 300ms for purge test
2016-04-29 16:37:09 -07:00
b79bb6f164 travis: Enable arm64 builds
Setup a travis test matrix on a new variable 'TARGET', which specifies the CI
target.  Update the script section with a conditional that runs the needed
commands for each target.

Also, set go_import_path to make cloned repos work, enable the trusty VM, and
enable verbose builds when testing.

Signed-off-by: Geoff Levand <geoff@infradead.org>
2016-04-29 15:31:30 -07:00
4ab1500a6d pkg/fileutil: wait up to 300ms for purge test
Fix https://github.com/coreos/etcd/issues/5231.

The issue shows that a slow CI can take more than 200ms
for purging. This increases the loop iterations to wait
up to 300ms in case the disk is slow.
2016-04-29 15:24:44 -07:00
00d6f104b5 Merge pull request #5235 from ronabop/get_put_typo
simple typo in README docs for getting started
2016-04-29 14:24:21 -07:00
c97b74a72f doc: fix etcdctl example in README
repeated put rather than put followed by get
2016-04-29 21:20:14 +00:00
634a9e833e Merge pull request #5233 from gyuho/client-doc
clientv3: fix README, add error handling example
2016-04-29 13:56:37 -07:00
0c5bcd5d80 clientv3: fix README, add error handling example 2016-04-29 13:34:16 -07:00
33968059e9 Merge pull request #5222 from gyuho/error_interface
rpctypes: error interface
2016-04-29 13:01:54 -07:00
ec1fdd3938 integration: test with new server errors 2016-04-29 12:00:26 -07:00
b3ebe66c97 clientv3/integration: tests with new errors 2016-04-29 12:00:26 -07:00
6049c95dc9 clientv3: auth with rpctypes.Error 2016-04-29 12:00:26 -07:00
506cf1f03f etcdserver/api/v3rpc: use new errors 2016-04-29 12:00:26 -07:00
2b361cf06b rpctypes: define a new error interface 2016-04-29 12:00:22 -07:00
d893a78c38 test: add v3rpc, rpctypes 2016-04-29 11:00:02 -07:00
8e099ab713 Merge pull request #5225 from heyitsanthony/local-tester
local-tester: procfile, faults, and network bridge
2016-04-29 10:27:56 -07:00
b8850cec93 Merge pull request #5228 from xiang90/fix_d
mvcc: fix watch deleteRange
2016-04-29 10:23:46 -07:00
29eca4eb88 Merge pull request #5223 from heyitsanthony/kv-less-reconnect
clientv3: better serialization for kv and txn connection retry
2016-04-29 10:02:17 -07:00
c0ff77e809 local-tester: procfile, faults, and network bridge
Creates a local fault injected cluster and stresser for etcd.

Usage: goreman -f tools/local-tester/Procfile start
2016-04-29 09:57:02 -07:00
3ddcc21179 mvcc: fix watch deleteRange 2016-04-29 09:40:28 -07:00
c26eb3f241 clientv3: better serialization for kv and txn connection retry
If the grpc connection is restored between an rpc network failure
and trying to reestablish the connection, the connection retry would
end up resetting good connections if many operations were
in-flight at the time of network failure.
2016-04-29 09:26:32 -07:00
60425de0ff Merge pull request #5227 from raoofm/patch-3
Doc: Update production-users.md
2016-04-29 08:25:07 -07:00
db8588ab93 Doc: Update production-users.md
Update the Backups policy
2016-04-29 11:23:51 -04:00
51ad5f00bf Merge pull request #5226 from raoofm/patch-2
Doc: Update production-users.md
2016-04-29 08:01:07 -07:00
419ae757d2 Update production-users.md 2016-04-29 10:58:24 -04:00
4480eb6d49 Merge pull request #5217 from gyuho/rpc_types
*: return rpctypes.Err in clientv3
2016-04-28 15:58:47 -07:00
f148f4b2b9 clientv3/integration: tests error types (rpctypes) 2016-04-28 15:42:27 -07:00
2e3d79a7bf clientv3: convert errors to rpctypes on returning
For https://github.com/coreos/etcd/issues/5211.
2016-04-28 15:39:37 -07:00
f613052435 rpctypes: Error function to convert clientv3 error 2016-04-28 12:16:13 -07:00
bef5be42b5 integration: add quota backend bytes option 2016-04-28 12:15:31 -07:00
11ec94b7e8 Merge pull request #5218 from heyitsanthony/fix-issue-3699
integration: wait for ReadyNotify in Issue3699 test
2016-04-28 10:48:08 -07:00
7c666b533a Merge pull request #5221 from heyitsanthony/parallel-e2e-integration
test: run e2e and integration tests in parallel
2016-04-28 10:30:40 -07:00
85edd66c65 test: run e2e and integration tests in parallel 2016-04-28 10:17:40 -07:00
8291110049 rafthttp: do not create new connections after stopping transport 2016-04-28 10:10:52 -07:00
d1e11842df Merge pull request #5219 from xiang90/req_timeout
etcdserver: add timeout for processing v3 request
2016-04-28 09:25:08 -07:00
6ee5f9c677 etcdserver: add timeout for processing v3 request 2016-04-28 08:52:17 -07:00
d814e9dc35 integration: wait for ReadyNotify in Issue3699 test
Fixes #5147
2016-04-27 22:04:07 -07:00
8df52dc6fa Merge pull request #5216 from heyitsanthony/lease-header-err
v3rpc: only fill lease grant header if no error
2016-04-27 16:51:16 -07:00
06ea8aee11 v3rpc: only fill lease grant header if no error
Was panicking under cluster fault injection.
2016-04-27 16:28:40 -07:00
ca83793876 Merge pull request #5169 from xiang90/ready
etcdserver: do not serve requests before finishing the first internal proposal
2016-04-27 16:05:12 -07:00
434f2c356d etcdserver: do not serve requests before finishing the first internal proposal 2016-04-27 15:46:31 -07:00
e50df7c19b Merge pull request #5215 from gyuho/finish_doc
Finish v2 documentation cleaning
2016-04-27 14:07:59 -07:00
c697aa7c60 Documentation: remove the rest
Remove:
1. auth_api.md
2. docker_guide.md
3. faq.md
4. implementation-faq.md
5. internal-protocol-versioning.md
2016-04-27 13:48:11 -07:00
8b3d1562f9 Documentation: remove admin_guide out of v2 2016-04-27 13:48:07 -07:00
c25c8573ac Merge pull request #5212 from gyuho/doc_fix
v2 documentation link fix
2016-04-27 13:18:39 -07:00
954535c2b4 Documentation: move members_api.md 2016-04-27 11:49:41 -07:00
42c09a95a0 Documentation: remove other_apis from v3 2016-04-27 11:40:48 -07:00
a2ab18fce5 Documentation: move api.md to v2 2016-04-27 11:40:48 -07:00
5464665107 Documentation: del backward_compatibility from v3 2016-04-27 11:40:48 -07:00
04fda9d25f Documentation: fix proxy link and delete from v3 2016-04-27 11:40:44 -07:00
95bac2dc3c Documentation: remove v2 snapshot migration doc 2016-04-27 11:31:44 -07:00
01927cc26a *: remove v2 specific authentication doc 2016-04-27 11:30:51 -07:00
f4b8e878ed Documentation: delete upgrade_2_* from v3 doc dir 2016-04-27 11:29:36 -07:00
63c5725fef Documentation: fix errorcode link to v2 2016-04-27 11:28:48 -07:00
afd2cc7373 Merge pull request #5206 from xiang90/lease_header
v3rpc: fill lease header
2016-04-27 11:18:00 -07:00
08f6c0775a Merge pull request #5199 from heyitsanthony/safe-lock-retry
clientv3/concurrency: use session lease id for mutex keys
2016-04-27 11:10:46 -07:00
07daa9fdc0 Merge pull request #5201 from gyuho/auth_test
auth: add basic tests
2016-04-27 10:57:20 -07:00
c3de53c23c v3rpc: fill lease header 2016-04-27 10:30:23 -07:00
14415c2187 auth: add tests 2016-04-27 10:13:36 -07:00
81ac766bb4 Merge pull request #5174 from gyuho/restart
etcd-tester: match more grpc errors
2016-04-27 09:47:55 -07:00
de7c18909f etcd-tester: match more grpc errors
To prevent stressers from returning from failure injections
2016-04-27 09:34:05 -07:00
8a4c9c9da1 Merge pull request #5205 from clearbit/rh-error-newline
etcdctl: Add a newline so that errors don't bleed into each other.
2016-04-27 07:31:08 -07:00
a00be40db2 etcdctl: Add a newline so that errors don't bleed into each other. 2016-04-27 14:25:57 +01:00
ecb0e2bd38 Merge pull request #5203 from heyitsanthony/fix-lease-leak
clientv3: check stream context in lease keep alive send loop
2016-04-26 20:42:31 -07:00
30a9229f38 clientv3: check stream context in lease keep alive send loop
If no leases are being kept alive, a connection reset would leak
the send routine since it would only test the stream when sending
keep alives.

Fixes #5200
2016-04-26 20:10:09 -07:00
22797c7185 clientv3/concurrency: use session lease id for mutex keys
With randomized keys, if the connection goes down, but the session remains,
the client would need complicated recovery logic to avoid deadlock.
Instead, bind the session's lease id to the lock entry; if a session tries
to reacquire the lock it will reassume its old place in the wait list.
2016-04-26 17:37:26 -07:00
c8ab6c348a Merge pull request #5196 from gyuho/password_check
etcdserver/auth: check empty password
2016-04-26 15:56:17 -07:00
bba08f6f79 e2e: add tests for issue 5182
For https://github.com/coreos/etcd/issues/5182.
2016-04-26 15:37:19 -07:00
07685bcf97 etcdserver/auth: check empty password in merge
Fix https://github.com/coreos/etcd/issues/5182.
2016-04-26 15:37:15 -07:00
78c96e893e Merge pull request #5198 from heyitsanthony/readme-3.0
doc: focus on v3 in README
2016-04-26 14:47:22 -07:00
dc55c312b0 doc: focus on v3 in README and clone old v2 docs
Fixes #5192
2016-04-26 14:41:59 -07:00
ce76c28805 Merge pull request #5197 from heyitsanthony/fix-lease-revoke-keepalive
etcdserver: respond with ttl=0 for revoked lease keep alive
2016-04-26 14:13:54 -07:00
af1a0b60e2 etcdserver: respond with ttl=0 for revoked lease keep alive
Fixes #5172
2016-04-26 13:53:20 -07:00
26e52d2bce Merge pull request #5190 from xiang90/deb_metrics
*: add debugging metrics
2016-04-26 10:27:05 -07:00
67645095e9 *: add debugging metrics 2016-04-26 09:52:56 -07:00
7161eeed8b Merge pull request #5191 from xiang90/github-folder
.github: add pull request and issue template
2016-04-25 16:22:48 -07:00
cc27c3a1e6 .github: add pull request and issue template 2016-04-25 16:22:13 -07:00
d923b59190 Merge pull request #5189 from heyitsanthony/storage-to-mvcc
*: rename storage package to mvcc
2016-04-25 15:52:08 -07:00
b7ac758969 *: rename storage package to mvcc 2016-04-25 15:25:51 -07:00
1440007608 Merge pull request #5187 from xiang90/doc_security
doc: add link to security
2016-04-25 14:32:12 -07:00
1d5bfd95dc Merge pull request #5188 from gyuho/gogoproto-dependency
Update gogo/proto, grpc dependency
2016-04-25 14:29:42 -07:00
12d01bb1eb vendor: update grpc, gogo/protobuf 2016-04-25 14:10:58 -07:00
4b31acf0e0 *: update generated Proto 2016-04-25 14:08:33 -07:00
82ef33a8d3 scripts: update genproto with new gogoproto hash 2016-04-25 14:07:40 -07:00
4b296bf51c doc: add link to security 2016-04-25 13:54:38 -07:00
9ec176a9b0 Merge pull request #5176 from xiang90/lease_client
clientv3: keepaliveonce should have a per call ctx
2016-04-25 11:45:58 -07:00
6de5b45b2f Merge pull request #5185 from joshix/dochds
Documentation/doc.md: Make headings boring :)
2016-04-25 11:45:20 -07:00
2a38cb5ad8 Documentation/doc.md: Make headings boring :)
Make the heading sentences that introduce each list of documents
a little more standard language, remove implied 2nd person, reduce
exclamations.
2016-04-25 10:58:25 -07:00
8a82ddadb9 Merge pull request #5181 from xiang90/cluster_doc
docs: move clustering doc
2016-04-25 10:50:29 -07:00
cbd79c666e clientv3: keepaliveonce should have a per call ctx
KeepAliveOnce should have a per call ctx. Now we have a per
API ctx, but we might do rpc calls multiple times in a for loop.

To avoid unnecessary routine leak, use per call ctx.
2016-04-25 10:46:47 -07:00
1b98074897 docs: move clustering doc 2016-04-25 10:35:29 -07:00
1378e72bc2 Merge pull request #5184 from gyuho/typo
*: fix flag location, minor typo
2016-04-25 09:59:24 -07:00
3ae956eb89 Merge pull request #5179 from xiang90/doc_di
doc: link to recovery.md
2016-04-25 09:47:30 -07:00
3ad8e91e00 *: fix flag location, minor typo 2016-04-25 09:41:11 -07:00
663aca701d Merge pull request #5177 from xiang90/lease_client_2
clientv3: retry on switchRemoteAndStream
2016-04-25 09:36:06 -07:00
736e1d6c33 doc: link to recovery.md 2016-04-23 22:40:35 -07:00
844208d7dd clientv3: retry on switchRemoteAndStream
If switchRemoteAndStream fails, the whole lease API fails since
the internal routine exits. We should only fail the whole API when
there is a fatal error. For example, we should fail if we fail to
connect to all the endpoints the user provided.

If we connect to an endpoint, but fail to create a stream, we should
retry instead of returning error to fail the entire API.
2016-04-23 21:55:34 -07:00
f8673b5f60 Merge pull request #5170 from gyuho/tester
etcd-tester: flag consistency-check
2016-04-22 22:26:17 -07:00
151d0d3831 etcd-tester: flag consistency-check 2016-04-22 22:22:12 -07:00
90f91ac8ac Merge pull request #5162 from heyitsanthony/disaster-doc
doc: v3 disaster recovery doc
2016-04-22 19:48:45 -07:00
579c1342e6 doc: v3 disaster recovery doc 2016-04-22 19:49:39 -07:00
50471d0c5c Merge pull request #5168 from heyitsanthony/fix-pipeline-leak
etcdserver: stop raft after stopping apply scheduler
2016-04-22 19:11:34 -07:00
08d879341d etcdserver: stop raft after stopping apply scheduler
Was causing a pipeline leak.
2016-04-22 17:15:13 -07:00
e51e146a19 Merge pull request #5167 from xiang90/doc_reorg
docs: update docs.md and create subdirs
2016-04-22 17:03:16 -07:00
bfd6465ea3 docs: update docs.md and create subdirs 2016-04-22 16:58:03 -07:00
45bf7fb960 Merge pull request #5165 from xiang90/race
raft: fix detected race in node.go
2016-04-22 16:12:33 -07:00
59c5110b73 raft: fix detected race in node.go 2016-04-22 15:45:33 -07:00
0dd9c2520b Merge pull request #5164 from gyuho/sleep_for_slow_network
etcd-tester: wait more for slow network recovery
2016-04-22 15:36:50 -07:00
6a0664d701 etcd-tester: wait more for slow network recovery
For https://github.com/coreos/etcd/issues/5121.
2016-04-22 15:24:47 -07:00
da1138f8de Merge pull request #5160 from gyuho/close_db
etcdctl/ctlv3: close bolt.DB in snapshot status
2016-04-22 12:10:28 -07:00
d49c044666 Merge pull request #5135 from heyitsanthony/maintenance-doc
doc: v3 maintenance
2016-04-22 12:02:39 -07:00
53abaf86c6 etcdctl/ctlv3: close bolt.DB in snapshot status 2016-04-22 11:43:52 -07:00
3ffbc4c8dd Merge pull request #5158 from gyuho/functional-test-check
etcd-tester: reset success var for every case
2016-04-22 09:37:27 -07:00
0feb88cee1 etcd-tester: change var success->failed
Previous success overwrites the later failure.
Make it simpler by changing the variable to 'failed'.
2016-04-22 09:27:37 -07:00
af30795752 Merge pull request #5157 from mitake/5155
etcdserver: remove a data race of ServerStat
2016-04-22 09:19:47 -07:00
24077fb3f6 etcdserver: remove a data race of ServerStat
It seems that ServerStats.BecomeLeader() is missing a lock.

Fix https://github.com/coreos/etcd/issues/5155
2016-04-22 23:41:38 +09:00
69bc0f76bc Merge pull request #5152 from heyitsanthony/fix-quota-test
integration: wait for alarm in TestV3StorageQuotaApply
2016-04-21 21:26:46 -07:00
2927c90fae integration: wait for alarm in TestV3StorageQuotaApply
Fixes #4974
2016-04-21 20:53:43 -07:00
f73cdf4035 Merge pull request #5153 from gyuho/api_doc
*: change Protocol Buffer documentation title
2016-04-21 20:03:37 -07:00
2751a10db6 *: change Protocol Buffer documentation title 2016-04-21 19:58:41 -07:00
fdf6335416 Merge pull request #5117 from gyuho/proto_gen
*: Protocol Buffer docs auto-generate script
2016-04-21 19:33:41 -07:00
753630dc37 *: Protocol Buffer docs auto-generate script 2016-04-21 19:14:21 -07:00
b8c35e3af8 doc: v3 maintenance 2016-04-21 17:02:46 -07:00
d32113a0e5 Merge pull request #5150 from xiang90/doc_f
doc: front page of etcd3 doc
2016-04-21 16:49:18 -07:00
e38710b5f9 doc: front page of etcd3 doc 2016-04-21 16:42:16 -07:00
0c191b71ec Merge pull request #5146 from gyuho/help
etcdmain: quota-backend-bytes in help.go
2016-04-21 13:24:20 -07:00
fa61bf86d7 etcdmain: add quota-backend-bytes to help.go 2016-04-21 13:05:54 -07:00
79a91b3450 Merge pull request #5145 from gyuho/skip_compact
etcd-tester: skip compaction after different hash
2016-04-21 11:09:19 -07:00
4e175a98c3 Merge pull request #5144 from xiang90/l
*: fix invalid access to backend struct
2016-04-21 10:14:39 -07:00
c0cf44f134 backend: protect backend access with lock 2016-04-21 09:34:31 -07:00
4991cda202 etcdserver: fix the leaky snapshot routine issue 2016-04-21 08:48:11 -07:00
8684d96914 Merge pull request #5124 from mitake/auth-v3-authenticate
*: support authenticate in v3 auth
2016-04-20 21:07:09 -07:00
131e3806bb *: support authenticate in v3 auth
This commit implements Authenticate() API of the auth package. It does
authentication based on its authUsers bucket and generates a token for
subsequent RPCs.
2016-04-21 12:32:19 +09:00
e835d24bea etcd-tester: skip compaction after different hash
When hashes don't match, there could be some nodes
falling behind and the compact request can then error
with 'future revision compact'.
2016-04-20 17:13:51 -07:00
05d5459b1d Merge pull request #5143 from gyuho/mirror-make-e2e
e2e: make-mirror
2016-04-20 15:33:41 -07:00
6eb25751ec e2e: make-mirror 2016-04-20 15:13:45 -07:00
29dfca883f Merge pull request #5141 from gyuho/alarm_test
e2e: test alarm
2016-04-20 12:09:43 -07:00
d976121e35 e2e: test alarm 2016-04-20 11:38:53 -07:00
20db51bfb2 Merge pull request #5138 from heyitsanthony/v2api-refactor
etcdserver: v2api refactor
2016-04-20 11:07:37 -07:00
b37a0ad9e7 Merge pull request #5137 from gyuho/member_add_test
e2e: add member add/update test
2016-04-20 10:38:43 -07:00
f1440f1d63 Merge pull request #5140 from xiang90/fix_d
backend: update db.size after defrag
2016-04-20 10:31:37 -07:00
0fe24e7ffc etcdserver: rename v3demo_server to v3_server
Not much of a demo any more.
2016-04-20 10:29:22 -07:00
ebace2eb1b etcdserver: split out v2 Do() API from core server code 2016-04-20 10:29:22 -07:00
41382bc3f0 etcdserver: split out v2 raft apply interface 2016-04-20 10:29:22 -07:00
1fe4c34398 Merge pull request #5131 from heyitsanthony/etcdctl-get-json
etcdctl: print full json response for Get
2016-04-20 10:21:48 -07:00
0893dbf7c1 e2e: add member add/update test 2016-04-20 10:05:55 -07:00
bfc6309222 Merge pull request #5129 from xiang90/pipe_test
make TestPipelineKeepSendingWhenPostError reliable
2016-04-20 10:02:17 -07:00
74d50884bb backend: update db.size after defrag 2016-04-20 10:01:38 -07:00
d2a58cbb0a etcdctl: print full json response for Get
Otherwise parsing get/txn output with json is somewhat complicated
because in some cases there's a json message and sometimes not.
Likewise, a get on an absent key has to return the current revision for
some algorithms to work.
2016-04-20 09:56:32 -07:00
fb137f11c5 rafthttp: make TestPipelineKeepSendingWhenPostError reliable 2016-04-20 09:38:47 -07:00
0c40f4a7e3 Merge pull request #5136 from heyitsanthony/test-display-gosimple
test: display failure output for gosimple
2016-04-19 23:25:12 -07:00
46dfa682e7 test: display failure output for gosimple 2016-04-19 22:58:37 -07:00
32a486b462 Merge pull request #5127 from xiang90/down_build
doc: build
2016-04-19 13:38:04 -07:00
d1067d39c7 doc: build 2016-04-19 13:37:50 -07:00
8af9c88377 Merge pull request #5122 from xiang90/lease_doc
doc: add lease section to interacting doc
2016-04-19 13:31:16 -07:00
16630529f7 Merge pull request #5125 from xiang90/dev_cluster
doc: add local_cluster doc
2016-04-19 10:51:16 -07:00
531ee93878 doc: add local_cluster doc 2016-04-19 10:50:54 -07:00
6d06c060b4 doc: add lease section to interacting doc 2016-04-19 08:18:59 -07:00
668ea89980 Merge pull request #5126 from judwhite/patch-2
raft/doc.go: add missing }
2016-04-19 07:25:31 -07:00
a9cfbd5414 raft/doc.go: add missing } 2016-04-19 04:21:33 -05:00
bf9cccfc34 Merge pull request #5118 from ajityagaty/fsync_osx
fileutil: Sync on HFS/OSX needs to be handled differently.
2016-04-18 22:22:53 -07:00
8b6de5f85d fileutil: Sync on HFS/OSX needs to be handled differently.
A call to file.Sync on OSX doesn't guarantee actual persistence on
physical drive media as the data can be cached in the physical drive's
buffers. Hence calls to file.Sync need to be replaced with
fcntl(F_FULLFSYNC).
2016-04-18 21:49:04 -07:00
d16628bf50 Merge pull request #5120 from magicwang-cn/master
etcdserver: close response body when getting cluster information
2016-04-18 19:44:19 -07:00
97c71f44fd etcdserver: close response body when getting cluster information 2016-04-19 10:03:40 +08:00
c4892c7f51 Merge pull request #5105 from xiang90/get_started
doc: add write/read example for interact doc
2016-04-18 14:27:53 -07:00
a2ac639176 doc: add write/read example for interact doc 2016-04-18 13:42:12 -07:00
8a0fa5622e Merge pull request #5114 from gyuho/snapshot_test
*: add Snapshot e2e test
2016-04-18 09:27:07 -07:00
b494ad3a0d Merge pull request #5112 from heyitsanthony/protobuf-comments
storagepb, etcdserverpb: improve documentation for RPC message fields
2016-04-17 23:53:53 -07:00
42245a5518 storagepb, etcdserverpb: improve documentation for RPC message fields 2016-04-17 23:33:00 -07:00
ea6a747fc1 Merge pull request #5116 from ajityagaty/typo_fix
etcdctlv3: Fix for typo in alarm command handling.
2016-04-17 20:30:22 -07:00
68dd22d93d etcdctlv3: Fix for typo in alarm command handling. 2016-04-17 19:31:39 -07:00
9504df2917 Merge pull request #5115 from gyuho/gc
v3rpc: bytes-key map look-up gc optimization
2016-04-17 13:21:47 -07:00
86f580fa8f v3rpc: bytes-key map look-up gc optimization
This change
f5f5a8b620
just got merged to go1.6.1 where Go does special optimization for x =
m[string(k)] where k is []byte.
2016-04-17 10:52:19 -07:00
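A minimal sketch of the lookup pattern the commit above refers to. The helper name countByKey and the sample map are hypothetical; the only point illustrated is that the []byte-to-string conversion is written inline in the map index expression, which is the form the Go compiler can optimize to avoid allocating a temporary string.

```go
package main

import "fmt"

// countByKey is a hypothetical helper that looks up a []byte key in a
// string-keyed map. The conversion appears directly inside the index
// expression rather than being stored in a variable first.
func countByKey(m map[string]int, k []byte) int {
	return m[string(k)]
}

func main() {
	m := map[string]int{"foo": 1, "bar": 2}
	fmt.Println(countByKey(m, []byte("bar")))
}
```

Assigning string(k) to a variable before the lookup would still be correct, but it may not benefit from the same optimization.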
a2afb513dd *: add snapshot e2e test 2016-04-16 13:27:10 -07:00
d4ff9364d4 Merge pull request #4861 from heyitsanthony/nfs-lock
pkg/fileutil: fix linux file locks over NFS
2016-04-16 08:59:10 -07:00
11e8d01035 Merge pull request #5113 from ajityagaty/remove_lease_id_casts
clientv3: Remove superfluous LeaseID casts in integration tests.
2016-04-16 07:22:06 -07:00
f15b5aa4e6 Merge pull request #5034 from ZhuPeng/proxy-http2
Enable http2 support between proxy and member
2016-04-16 07:04:41 -07:00
da5bd04a1a clientv3: Remove superfluous LeaseID casts in integration tests.
The integration tests under clientv3 have superfluous LeaseID casts
that are not needed as the ID field of the lease responses is of
type LeaseID now.
2016-04-15 17:48:20 -07:00
73b48dd8eb Merge pull request #5111 from Amit-PivotalLabs/fix-etcdctl-unset-env
etcdctl: unset ETCDCTL_API env var properly
2016-04-15 16:32:42 -07:00
c629a30f1f etcdctl: unset ETCDCTL_API env var properly 2016-04-15 15:43:00 -07:00
4ed5f66a7a Merge pull request #5109 from gyuho/member_remove_test
e2e: add member remove test
2016-04-15 15:04:00 -07:00
caf0e9b9b1 Merge pull request #5110 from gyuho/error_when_db_not_exist
etcdctl: snapshot status error for non-existent file
2016-04-15 14:44:25 -07:00
59a88d1cf6 e2e: add member remove test 2016-04-15 14:43:32 -07:00
a78ece4ac2 etcdctl: snapshot status error for non-existent file 2016-04-15 14:15:16 -07:00
3ee99a496f Merge pull request #5096 from heyitsanthony/clientv3-run-examples
test, clientv3: run examples as integration tests
2016-04-15 12:42:44 -07:00
9bfa0172f5 test, clientv3: run examples as integration tests 2016-04-15 11:51:30 -07:00
d4dae7e9e9 Merge pull request #5101 from gyuho/watch_bench_fix
benchmark: ensure all watcher receivers to finish
2016-04-15 11:49:24 -07:00
ad226f2020 benchmark: ensure all watcher receivers to finish
Fix https://github.com/coreos/etcd/issues/5099.
2016-04-15 11:11:14 -07:00
c1455a4f10 Merge pull request #5090 from ajityagaty/lease_id
clientv3: Use LeaseID in all the client APIs.
2016-04-15 10:48:29 -07:00
da153d3f3c Merge pull request #5091 from xiang90/r_h
doc: add response header doc into api
2016-04-15 09:57:48 -07:00
3b72c3da53 doc: add response header doc into api 2016-04-15 09:54:30 -07:00
81a5fc16ef Merge pull request #5095 from gyuho/govet_fix
*: fix govet -shadow in go tip
2016-04-15 09:41:24 -07:00
376234f196 Merge pull request #5094 from gyuho/watch_range_example
*: add more examples to clientv3, pkg/adt
2016-04-15 09:10:25 -07:00
641a1a66e1 *: fix govet -shadow in go tip 2016-04-15 07:39:52 -07:00
ae27b991b1 *: add more examples to clientv3, pkg/adt 2016-04-14 23:46:50 -07:00
06a4086bf9 clientv3: Use LeaseID in all the client APIs.
In order to use LeaseID type instead of int64 we have to convert
the protobuf lease responses into client lease responses.
2016-04-14 23:09:46 -07:00
4ee7cad116 Merge pull request #5093 from gyuho/fix_test
functional-tester/etcd-tester: fix error check
2016-04-14 21:45:44 -07:00
8515ae30fb functional-tester/etcd-tester: fix error check 2016-04-14 21:31:12 -07:00
67db28f979 proxy: enable http2 for connecting to members
Enable http2 when the transport specifies a custom TLS config, in which
case it was not automatically enabled.

Issue 5033
2016-04-15 10:16:26 +08:00
6c1cc1d4ea Merge pull request #5089 from heyitsanthony/fix-func-tester-timeout
etcd-tester: return error if first compaction times out
2016-04-14 17:24:22 -07:00
21233416e8 etcd-tester: return error if first compaction times out
Fixes #5081
2016-04-14 17:11:53 -07:00
74153ffa45 Merge pull request #5082 from xiang90/kv_d
doc: add doc for kv message
2016-04-14 15:17:04 -07:00
df37c75bb9 doc: add doc for kv message 2016-04-14 15:16:23 -07:00
f2e915f56e Merge pull request #5086 from heyitsanthony/test-race-rafthttp
test: check races on rafthttp
2016-04-14 14:21:20 -07:00
57448622d9 Merge pull request #5085 from heyitsanthony/hide-yaml
clientv3: make YamlConfig struct private
2016-04-14 14:10:20 -07:00
01be6933c6 test: check races on rafthttp
The data race in net/http has been fixed for a while.
2016-04-14 13:45:31 -07:00
cfbb8a71db Merge pull request #5084 from gyuho/typo
clientv3: fix example code format, more examples
2016-04-14 12:30:44 -07:00
04ef861c3d clientv3: make YamlConfig struct private 2016-04-14 12:26:01 -07:00
81e344bef9 clientv3: fix example code format, more examples 2016-04-14 12:13:07 -07:00
6bbdebb281 Merge pull request #5076 from gyuho/more_e2e
*: add, clean up e2e tests
2016-04-14 11:59:13 -07:00
6a3b5fe70c Merge pull request #5083 from ajityagaty/role_grant_test
e2e: Test case for the etcdctlv3 'role grant' command.
2016-04-14 11:53:21 -07:00
fefb58dc90 e2e: clean up, add more tests 2016-04-14 11:42:57 -07:00
4495559ad6 e2e: Test case for the etcdctlv3 'role grant' command.
Adding a test case to test the 'role grant' sub-command.
2016-04-14 11:31:07 -07:00
ba1c0a2b12 Merge pull request #5080 from xiang90/up
proxy: initial userspace tcp proxy
2016-04-14 10:41:46 -07:00
4a913ae60a proxy: initial userspace tcp proxy 2016-04-14 10:14:30 -07:00
da1132662a Merge pull request #5078 from ajityagaty/role_cmd_tests
e2e: Test case for the etcdctlv3 role command.
2016-04-14 09:44:07 -07:00
27844a6aef Merge pull request #5079 from mitake/auth-fix
auth: remove index out of range in role grant
2016-04-14 08:07:18 -07:00
a016220648 auth: remove index out of range in role grant
Fixes https://github.com/coreos/etcd/issues/5077
2016-04-14 22:02:10 +09:00
3b7c8d752c e2e: Test case for the etcdctlv3 role command.
New test cases have been added to test the 'role' and 'user'
sub-commands of etcdctlv3 utility.
2016-04-14 01:54:22 -07:00
ac95cc32ef Merge pull request #5075 from xiang90/p
proxy: move http related thing to httpproxy
2016-04-13 22:44:29 -07:00
e913792d0f Merge pull request #5073 from heyitsanthony/etcdctl-docs
doc: document many etcdctl commands
2016-04-13 22:08:22 -07:00
cd05ac4217 doc: document many etcdctl commands
documents defrag, compaction, lease, snapshot status, member, endpoint
2016-04-13 21:50:59 -07:00
b20d171ee1 Merge pull request #5074 from heyitsanthony/fix-compact-current-rev
storage: have Range on rev=0 work even if compacted to current revision
2016-04-13 21:15:55 -07:00
66d2ae7a39 proxy: move http related thing to httpproxy 2016-04-13 21:09:26 -07:00
d72bcdc156 storage: have Range on rev=0 work even if compacted to current revision 2016-04-13 21:00:35 -07:00
e6ff5a38e1 Merge pull request #5072 from heyitsanthony/fix-ep-json
etcdctl: respect --write-out=json for endpoint status command
2016-04-13 19:12:26 -07:00
793fb2cf64 Merge pull request #4673 from gyuho/slow
functional-tester: add latency test (simulate slow network)
2016-04-13 17:07:30 -07:00
f07350735d etcdctl: respect --write-out=json for endpoint status command 2016-04-13 17:04:31 -07:00
6af40ea1e1 functional-tester: add latency test (simulate slow network)
Fix https://github.com/coreos/etcd/issues/4666.
2016-04-13 17:00:09 -07:00
e9aa8ff235 Merge pull request #5071 from gyuho/member_api_change
*: Member api change
2016-04-13 16:45:10 -07:00
3dcfe79cc0 Merge pull request #5070 from heyitsanthony/member-doc
etcdctl: display required arguments for member commands in usage
2016-04-13 16:40:16 -07:00
7a2ef3eb00 *: regenerate proto buffers 2016-04-13 16:24:07 -07:00
2c6176b5f2 *: remove MemberLeader API in client side (fix examples) 2016-04-13 16:23:57 -07:00
b78886239e *: remove IsLeader field in Member API server side 2016-04-13 16:23:33 -07:00
90df7fd738 etcdctl: display required arguments for member commands in usage 2016-04-13 16:18:00 -07:00
22812badc2 Merge pull request #5069 from heyitsanthony/fix-snapshot-status-json
etcdctl: respect -write-out=json for snapshot status
2016-04-13 15:57:39 -07:00
b90e30b28e etcdctl: respect -write-out=json for snapshot status 2016-04-13 13:37:32 -07:00
a553ea8ba7 Merge pull request #5068 from heyitsanthony/lease-fixups
etcdctl: improve lease command documentation and exit codes
2016-04-13 13:20:06 -07:00
993f25f055 Merge pull request #5065 from heyitsanthony/errexit-defrag
etcdctl: return non-zero exit code if defrag fails on any endpoint
2016-04-13 13:19:43 -07:00
721ed6ba2b etcdctl: return non-zero exit code if defrag fails on any endpoint 2016-04-13 12:39:43 -07:00
855a5116a2 etcdctl: improve lease command documentation and exit codes 2016-04-13 12:38:21 -07:00
c0971a6ebc Merge pull request #5066 from gyuho/compaction_test
e2e: compaction test
2016-04-13 12:35:20 -07:00
3f0863a1e9 e2e: compact test 2016-04-13 12:07:48 -07:00
c8e860c4fa Merge pull request #5055 from gyuho/get_rev
*: add rev flag to get command
2016-04-13 12:05:48 -07:00
3fef0eb0d8 Merge pull request #5061 from xiang90/grpc_d
*: update dependencies
2016-04-13 11:40:14 -07:00
5157b713ed Merge pull request #5064 from raoofm/patch-1
Documentation: v3 mem benchmark total watch value
2016-04-13 11:35:23 -07:00
60548b85c4 *: add rev flag to get command 2016-04-13 11:32:29 -07:00
15e865e024 Merge pull request #5062 from gyuho/govet-mutex
etcd-tester: fix govet
2016-04-13 11:19:20 -07:00
cb280bae91 etcd-tester: fix govet 2016-04-13 11:12:31 -07:00
61cfe68247 Documentation: v3 mem benchmark total watch value
Updating Documentation/benchmarks/etcd-3-watch-memory-benchmark.md with the correct 'total watching' value
2016-04-13 14:12:10 -04:00
52c4595899 Merge pull request #5060 from gyuho/ineffassign
*: fixes based on ineffassign
2016-04-13 10:59:58 -07:00
7c5ec417c3 *: update dependencies 2016-04-13 10:47:24 -07:00
89f8e66682 *: fixes based on ineffassign 2016-04-13 10:41:58 -07:00
35d2d7b23e Merge pull request #5059 from gyuho/elect_e2e_test
e2e: add elect command test
2016-04-13 10:25:28 -07:00
1224044553 e2e: add elect command test 2016-04-13 10:00:56 -07:00
228e772b3a Merge pull request #5056 from heyitsanthony/expect-signal
pkg/expect, e2e: support sending Signals to expect process, test etcdctl lock
2016-04-13 09:42:41 -07:00
8763bd1e97 e2e: etcdctlv3 lock test 2016-04-13 09:26:16 -07:00
604a73c833 e2e: remove sh in spawnCmd
certain shells claim the ppid for expect processes which interferes with
signals
2016-04-13 09:12:40 -07:00
fcb5ba98d0 pkg/expect: support sending Signals to expect process 2016-04-13 09:11:57 -07:00
18992bac4f Merge pull request #5057 from heyitsanthony/e2e-v3-cleanup
e2e: cleanup error and prefix arg handling for ctlv3 tests
2016-04-13 09:09:13 -07:00
209f573083 e2e: cleanup error and prefix arg handling for ctlv3 tests 2016-04-12 23:48:13 -07:00
2985396768 Merge pull request #5053 from xiang90/ctl_i
etcdctl: move endpoint-health and status into endpoint command
2016-04-12 16:50:03 -07:00
ae9b251d99 etcdctl: move endpoint-health and status into endpoint command 2016-04-12 16:30:26 -07:00
0ca949ce90 Merge pull request #5051 from heyitsanthony/fix-user-list
etcdctl: don't panic on ListUser with roles
2016-04-12 14:24:08 -07:00
c9ce92f635 client: accept roles in response for ListUser
Fixes #5046
2016-04-12 12:48:43 -07:00
a8b7d0b63c Merge pull request #5050 from xiang90/b_v
etcdserver: save cluster version into backend
2016-04-12 12:05:02 -07:00
e9735b7bd0 etcdserver: save cluster version into backend 2016-04-12 11:37:22 -07:00
f13e558ab4 e2e: test etcdctl user list on root user 2016-04-12 11:15:06 -07:00
095a755e4d Merge pull request #5049 from heyitsanthony/fix-grant-roles-existing
etcdctl: don't crash on duplicate role in user grant
2016-04-12 11:05:08 -07:00
a12fd9cc92 etcdctl: print grant/revoke error instead of scanning roles for changes
Fixes #5045
2016-04-12 10:49:05 -07:00
a0d653b630 e2e: test etcdctl v2 double user grant
Crashes in 2.3.1
2016-04-12 10:49:05 -07:00
040f7b90c7 Merge pull request #5048 from xiang90/fix_c
rafthttp: fix comment in msgappv2
2016-04-12 10:20:04 -07:00
f2d558644d rafthttp: fix comment in msgappv2 2016-04-12 10:14:06 -07:00
ef0d5c3d7d Merge pull request #4957 from mqliang/memberStatus
etcdctlv3: expose store size and raft status in 'etcdctl status' command
2016-04-12 08:46:45 -07:00
ff311ba0a7 etcdctlv3: print db size and raft status in 'etcdctl status' command 2016-04-12 22:58:22 +08:00
a9a06438f9 etcdctlv3: expose db size and raft status in server side 2016-04-12 22:49:15 +08:00
1044fbce2c etcdctlv3: update auto-generated files 2016-04-12 22:48:47 +08:00
c3da2631bf etcdctlv3: add db size and raft status in protobuffer 2016-04-12 22:47:27 +08:00
17e32b6aa9 Merge pull request #5041 from xiang90/snap_info
snapshot status
2016-04-11 23:13:08 -07:00
50219d4def Merge pull request #5042 from gyuho/ts
benchmark: return time series with missing periods filled in
2016-04-11 23:09:36 -07:00
8e3d99cd3e Merge pull request #5043 from mitake/auth-trivial
little cleaning of v3 auth
2016-04-11 23:09:22 -07:00
2aab6ff2eb benchmark: return time series with missing periods filled in 2016-04-11 23:07:45 -07:00
94d436c5d1 vendor: add go-humanize 2016-04-11 22:55:47 -07:00
b5292f6fce etcdctl: add snapshot status support 2016-04-11 22:55:47 -07:00
0b4749ea65 auth: remove needless logging during creating a new user 2016-04-12 14:52:31 +09:00
bfd49023a1 auth: sort key permissions of role struct for effective searching 2016-04-12 14:52:31 +09:00
b1c3e7edbf Merge pull request #4982 from heyitsanthony/godep-update-script
scripts: updatedep.sh to update vendored dependencies
2016-04-11 19:47:18 -07:00
4481e54017 Merge pull request #5040 from heyitsanthony/fix-txn-rev
etcdserver: set txn header revision to store revision following txn
2016-04-11 19:41:54 -07:00
c5b8e8dc88 etcdserver: set txn header revision to store revision following txn 2016-04-11 17:03:05 -07:00
8c2225f251 Merge pull request #5038 from heyitsanthony/sshot-docs
doc: document etcdctl snapshot command
2016-04-11 16:21:09 -07:00
195100a769 Merge pull request #5039 from heyitsanthony/fix-write-out
etcdctl: respect --write-out
2016-04-11 16:19:43 -07:00
3a695a82a3 Merge pull request #5036 from xiang90/r_t
raft: add a test case for Test Slice
2016-04-11 16:02:13 -07:00
e5a2bd58ec etcdctl: respect --write-out
Support got clobbered about a month ago.
2016-04-11 16:01:38 -07:00
6e8d01f956 doc: document etcdctl snapshot command 2016-04-11 15:58:20 -07:00
0a684c10ad Merge pull request #5025 from xiang90/no_dup_resp
etcdserver: do not send out out-of-date appResp
2016-04-11 14:41:52 -07:00
3bad47d691 Merge pull request #5018 from xiang90/b
etcdserver: set backend to cluster
2016-04-11 13:02:57 -07:00
be822b05d2 Merge pull request #5012 from heyitsanthony/snap-api
*: snapshot RPC
2016-04-11 13:00:18 -07:00
e838c26f8a etcdctl: use snapshot RPC in snapshot command 2016-04-11 12:32:53 -07:00
b97b5843a3 Merge pull request #5035 from heyitsanthony/fix-unused-output
test: display unused output if unused source found
2016-04-11 11:36:23 -07:00
174a996c37 Merge pull request #5032 from mitake/auth-user-grant
*: support granting a role to a user in v3 auth
2016-04-11 11:10:10 -07:00
9423125ce1 raft: add a test case for Test Slice 2016-04-11 10:04:03 -07:00
2113b77635 test: display unused output if unused source found
unused exits non-zero if it finds unused source, which causes the test
script's set -e to abort the script before the output is shown
2016-04-11 09:55:22 -07:00
d5766eab3e clientv3: add Snapshot to Maintenance 2016-04-11 09:51:17 -07:00
a6b6fcf1c4 etcdserverpb, v3rpc: add Snapshot to Maintenance RPC service 2016-04-11 09:51:16 -07:00
7ba2646d37 *: support granting a role to a user in v3 auth 2016-04-11 15:53:30 +09:00
af1b3f061a Merge pull request #5024 from ajityagaty/user_cmd_tests
e2e: Test cases for the etcdctlv3 user commands.
2016-04-10 23:50:12 -07:00
6c8428c393 Merge pull request #5031 from gyuho/cleanup
*: clean up from go vet, misspell
2016-04-10 23:32:46 -07:00
9108af9046 *: clean up from go vet, misspell 2016-04-10 23:16:56 -07:00
1f4f3667a4 Merge pull request #5021 from gyuho/vendor_doc
*: client vendoring README
2016-04-10 22:15:36 -07:00
935999a80e Merge pull request #5030 from mitake/auth-trivial
trivial updates for v3 auth
2016-04-10 21:47:08 -07:00
53bb79f240 auth: remove needless field from protobuf define
The field tombstone won't be used in the future because of the design
change.
2016-04-11 13:02:34 +09:00
097cec8194 etcdctl: let some v3 auth related functions be private
They don't need to be public.
2016-04-11 13:01:19 +09:00
27480f9ea4 Merge pull request #4966 from mitake/auth-role-grant
*: support granting key permission to role in v3 auth
2016-04-10 20:31:05 -07:00
02033b4c47 *: support granting key permission to role in v3 auth 2016-04-11 12:23:19 +09:00
f5f0280a63 Merge pull request #5028 from heyitsanthony/etcdmain-unsupported-envvar
etcdmain: start on unsupported arch when ETCD_UNSUPPORTED_ARCH is set
2016-04-10 19:55:34 -07:00
c4caa65c51 etcdmain: start on unsupported arch when ETCD_UNSUPPORTED_ARCH is set 2016-04-10 19:36:04 -07:00
130567832f Merge pull request #4734 from luxas/32bit_alignments
etcdserver: align 64-bit atomics on 8-byte boundary
2016-04-10 19:18:15 -07:00
603c14db9d e2e: Test cases for the etcdctlv3 user commands.
New test cases have been added to test the "user" sub-commands of
the etcdctlv3 utility.
2016-04-10 17:46:04 -07:00
0d9039f192 Merge pull request #5026 from lodevil/master
KeepAliveOnce error fix (when the lease not found)
2016-04-10 17:35:52 -07:00
e3fd246414 clientv3: fix KeepAliveOnce return error message 2016-04-11 08:13:36 +08:00
de7692b2b2 etcdserver: do not send out out-of-date appResp 2016-04-09 23:30:00 -07:00
3c0ac9d600 etcdserver: set backend to cluster 2016-04-08 21:46:45 -07:00
78554c6de6 *: client vendoring README 2016-04-08 19:48:17 -07:00
345bdc3db6 Merge pull request #5017 from xiang90/member
membership: save/update the whole member information into backend
2016-04-08 13:30:35 -07:00
a406c9fa3d membership: save/update the whole member information into backend 2016-04-08 13:14:37 -07:00
fe810e7b43 Merge pull request #5015 from gyuho/semaphore-ci-badge
README: change semaphore CI status badge
2016-04-08 12:10:05 -07:00
9bb99b5f72 Merge pull request #5016 from gyuho/lease_simple
*: clean up from gosimple
2016-04-08 12:09:56 -07:00
953a08d841 *: clean up from gosimple 2016-04-08 11:55:03 -07:00
4997ed36b4 Merge pull request #5011 from xiang90/r_c
raft: fix issues reported by golint
2016-04-08 11:46:12 -07:00
97730778e5 README: change semaphore CI status badge
We just reset Semaphore CI, and badge URL is changed.
2016-04-08 11:24:30 -07:00
b70e6a6bf1 Merge pull request #4916 from es-chow/transfer-leader
raft: transfer leader feature
2016-04-08 08:01:24 -07:00
ac059eb8cb raft: transfer leader feature 2016-04-08 16:56:32 +08:00
4041bbe571 Merge pull request #5008 from gyuho/gosimple_unused
clean up with gosimple and unused
2016-04-07 23:31:21 -07:00
fb85da92e8 *: fix based on gosimple and unused 2016-04-07 23:16:37 -07:00
9aec045fce test, travis: integrate gosimple and unused 2016-04-07 23:16:33 -07:00
1b41ee9c99 raft: fix issues reported by golint 2016-04-07 22:14:56 -07:00
49f9b5470e Merge pull request #5009 from xiang90/sp
*: fix misspell
2016-04-07 22:09:41 -07:00
9c7fb9c360 *: fix misspell 2016-04-07 21:57:06 -07:00
71a492e59e Merge pull request #5005 from xiang90/clu_storage
membership: update attr in membership pkg
2016-04-07 21:40:32 -07:00
b13b77f362 membership: update attr in membership pkg 2016-04-07 21:25:32 -07:00
2fe3e1e850 Merge pull request #5007 from heyitsanthony/hush-caps
v2http: only report capabilities on update
2016-04-07 20:31:37 -07:00
2b7ad35fa0 v2http: only report capabilities on update 2016-04-07 20:14:30 -07:00
1f5794c117 Merge pull request #4997 from heyitsanthony/fix-race-consistent
etcdserver: fix race on consistent index
2016-04-07 20:12:22 -07:00
4d2d2cabb9 etcdserver: fix race on consistent index 2016-04-07 19:53:08 -07:00
004ff3d4f0 Merge pull request #5006 from gyuho/watch_type
clientv3/integration: use clientv3.Event type
2016-04-07 19:48:18 -07:00
a9f1d5dfa6 clientv3/integration: use clientv3 event types
Fix https://github.com/coreos/etcd/issues/5001.
2016-04-07 19:29:32 -07:00
8b320e7c55 Merge pull request #4999 from gyuho/test
*: log, expect by capability check
2016-04-07 19:06:15 -07:00
1c12b66e35 Merge pull request #5000 from xiang90/clu_storage
membership: save/update/delete member when backend is provided
2016-04-07 18:00:11 -07:00
868a3e279d Merge pull request #5002 from gyuho/agent_test
etcd-agent: fix etcd agent tests, remove unused listener
2016-04-07 17:19:44 -07:00
d78345244b *: log, expect by capability check 2016-04-07 17:18:51 -07:00
139f23fd13 etcd-agent: fix etcd agent tests, remove unused listener 2016-04-07 17:04:24 -07:00
29623cccb2 membership: save/update/delete member when backend is provided 2016-04-07 16:34:43 -07:00
c91c7ca3bf Merge pull request #4961 from heyitsanthony/rename-lease-create
*: rename lease Create to Grant
2016-04-07 14:51:22 -07:00
f31105bc08 Merge pull request #4994 from xiang90/clu
etcdserver: move membership related code to membership pkg
2016-04-07 14:39:18 -07:00
bf2289ae00 etcdserver: move membership related code to membership pkg 2016-04-07 14:21:37 -07:00
5d4ee7ac5f Merge pull request #4995 from gyuho/proxy-clean
proxy: simplify channel receive, add missing function call
2016-04-07 12:32:24 -07:00
dc17eaace7 *: rename Lease Create to Grant
Creating a lease through the client API interface union looked like
"c.Create(...)"-- the method name wasn't very descriptive.
2016-04-07 12:28:14 -07:00
6abbdcdc06 proxy: simplify channel receive, add missing function call 2016-04-07 12:24:17 -07:00
ee4ff1e448 Merge pull request #4976 from gyuho/lease_testing
e2e: lease tests, fix minor format string
2016-04-07 11:25:51 -07:00
84bf6e7462 e2e: lease tests, fix minor format string 2016-04-07 11:18:49 -07:00
a38617d93a Merge pull request #4992 from gyuho/e2e_clean
e2e: clean up, return all lines in error
2016-04-07 10:59:08 -07:00
2779341250 e2e: clean up, return all lines in error
1. change file names
2. now if sub-command errors, the test will receive all
lines from stdout and stderr.

Expected output:

```
read /dev/ptmx: input/output error (expected key2, got ["key1\r\n" "val1\r\n" ""])
```

3. change how we check GRPC timeout (only bypass timeout error when we give 0
timeout)
2016-04-07 10:41:56 -07:00
ac232ac9a7 scripts: updatedep.sh to update vendored dependencies
Running godep in the vendored cmd directory will try to pull etcd in
as a dependency. As a fix, this script safely vendors into cmd.
2016-04-07 10:28:33 -07:00
2e5ee26300 Merge pull request #4987 from hongchaodeng/ev
expose APIs to recognize event type
2016-04-07 10:18:25 -07:00
21eda79451 Merge pull request #4991 from endocode/kayrus/faq_doc
docs: fixed markdown formatting in faq.md
2016-04-07 09:59:35 -07:00
5c782a2086 docs: fixed markdown formatting in faq.md 2016-04-07 18:51:33 +02:00
aa11dafaf8 clientv3: expose event type in user API
- add another layer of abstraction in clientv3 for users, so the internal storagepb types are not exposed
- provide commonly used routines IsCreate(), IsModify() on event
2016-04-07 09:47:04 -07:00
a5f341e886 Merge pull request #4989 from xiang90/clu
*: move Cluster interface to api
2016-04-07 08:33:52 -07:00
030865abe3 *: move Cluster interface to api 2016-04-07 08:05:47 -07:00
b137df77f1 Merge pull request #4985 from gyuho/unused
*: clean up unused vars, functions
2016-04-06 21:49:58 -07:00
6e6d64fb9b *: clean up unused vars, functions
With help from https://github.com/dominikh/go-unused.
IsNetTimeoutError seems useful, so moved to pkg/netutil.
2016-04-06 21:33:55 -07:00
79a09e6857 Merge pull request #4984 from gyuho/watch-range
clientv3/integration: fix watch range test typo
2016-04-06 21:32:30 -07:00
e72591b4a2 clientv3/integration: fix watch range test typo 2016-04-06 21:12:07 -07:00
7408bc2504 Merge pull request #4948 from heyitsanthony/update-grpc
vendor: update grpc
2016-04-06 17:55:53 -07:00
82e58e602d Merge pull request #4983 from gyuho/expect_line
pkg/expect: ExpectFunc, LineCount
2016-04-06 16:10:02 -07:00
679e5e379b pkg/expect: ExpectFunc, LineCount
ExpectFunc to make expect more extensible. LineCount to be
able to check 'no output' command.
2016-04-06 15:56:00 -07:00
62990fb5fa Merge pull request #4970 from tamird/fix-raft-past-election
raft: correct regression in `pastElectionTimeout`
2016-04-06 08:03:38 -07:00
68db18667a raft: correct doc comment 2016-04-06 08:43:42 -04:00
5250784b09 raft: use rand.Intn instead of rand.Int and mod
This provides a better random distribution and is easier to read.
2016-04-06 08:43:42 -04:00
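The distribution point in the commit above can be sketched as follows. This is only an illustration, not the raft package's code: `randomizedTimeout`, `sampleTimeouts`, and the `electionTick` parameter are hypothetical names chosen for the example. The one fact relied on is that `rand.Intn(n)` returns a uniform value in `[0, n)`, whereas `rand.Int() % n` is slightly biased toward low values whenever `n` does not evenly divide the range of `rand.Int()`.

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomizedTimeout returns a value in [electionTick, 2*electionTick).
// Hypothetical helper for illustration; not the raft package's function.
func randomizedTimeout(r *rand.Rand, electionTick int) int {
	// r.Intn(n) draws uniformly from [0, n); r.Int() % n would fold the
	// leftover tail of the generator's range back onto the low values.
	return electionTick + r.Intn(electionTick)
}

// sampleTimeouts draws n timeouts from a generator seeded with seed.
func sampleTimeouts(seed int64, electionTick, n int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = randomizedTimeout(r, electionTick)
	}
	return out
}

func main() {
	// Print a few samples; each lies in [10, 20).
	fmt.Println(sampleTimeouts(1, 10, 5))
}
```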
6b0eb9c3c0 godeps: update grpc dependency 2016-04-06 01:30:06 -07:00
34375ef851 Merge pull request #4950 from heyitsanthony/revendor
vendor: only vendor on emitted binaries
2016-04-05 21:36:51 -07:00
b1d41016b2 vendor: only vendor on emitted binaries
Moves the vendor/ directory to cmd/vendor. Vendored binaries are built
from cmd/, which is backed by symlinks pointing back to repo root.
2016-04-05 21:01:16 -07:00
b9e933b850 Merge pull request #4971 from gyuho/e2e_more
e2e: clean up to test tables, endpoint-health test
2016-04-05 14:34:35 -07:00
e3599e4145 e2e: clean up to test tables, endpoint-health test 2016-04-05 13:33:37 -07:00
01c303113d Merge pull request #4964 from gyuho/get_sort_e2e
e2e: get with sort order, target
2016-04-04 23:22:04 -07:00
3e39f36b34 e2e: get with sort order, target 2016-04-04 23:10:03 -07:00
c3bca3739f Merge pull request #4926 from mitake/auth-role-add
*: support adding role in auth v3
2016-04-04 18:44:16 -07:00
21096bf27f Merge pull request #4963 from xiang90/ht
*: mv etcdhttp into api pkg
2016-04-04 18:40:29 -07:00
8662aaada4 Merge pull request #4958 from mitake/progrep-race
etcdserver, clientv3: let progressReportIntervalMilliseconds be private
2016-04-04 18:04:57 -07:00
2b17a3919c *: support adding role in auth v3 2016-04-05 09:28:17 +09:00
88306c9fa7 etcdserver, clientv3: let progressReportIntervalMilliseconds be private
progressReportIntervalMilliseconds (formerly
ProgressReportIntervalMilliseconds) is accessed by multiple goroutines,
and the race detector reports it as a data race.

To avoid this report, this commit wraps the variable in accessor
functions. They access the variable with atomic operations, so the race
is no longer reported.
2016-04-05 09:12:17 +09:00
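A minimal sketch of the pattern the commit above describes, assuming an `int64` variable and accessor names (`getProgressReportInterval`, `setProgressReportInterval`) invented for this example. The initial value is a placeholder, not necessarily etcd's default.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// progressReportIntervalMilliseconds is private and is only touched through
// the accessors below. The initial value is an assumed placeholder.
var progressReportIntervalMilliseconds int64 = 10 * 60 * 1000

// getProgressReportInterval reads the interval with an atomic load.
func getProgressReportInterval() time.Duration {
	ms := atomic.LoadInt64(&progressReportIntervalMilliseconds)
	return time.Duration(ms) * time.Millisecond
}

// setProgressReportInterval writes the interval with an atomic store.
func setProgressReportInterval(d time.Duration) {
	atomic.StoreInt64(&progressReportIntervalMilliseconds, int64(d/time.Millisecond))
}

func main() {
	// Concurrent readers and writers no longer trip the race detector,
	// because every access goes through sync/atomic.
	var wg sync.WaitGroup
	for i := 1; i <= 4; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			setProgressReportInterval(time.Duration(n) * time.Second)
			_ = getProgressReportInterval()
		}(i)
	}
	wg.Wait()
	fmt.Println(getProgressReportInterval())
}
```

Running a program like this with `go run -race` is one way to check that the accessors remove the report.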
2c50eb240e *: mv etcdhttp into api pkg 2016-04-04 16:31:35 -07:00
bfbe0fac8c Merge pull request #4951 from gyuho/watch_prefix
e2e: watch by prefix
2016-04-04 15:11:32 -07:00
9de5b8db80 e2e: watch by prefix 2016-04-04 14:52:54 -07:00
b3247356c1 Merge pull request #4956 from heyitsanthony/txn-serialize
etcdserver: serializable transactions
2016-04-04 09:51:09 -07:00
98504fe863 Merge pull request #4959 from gyuho/ctl_doc
etcdctl: READMEv3 doc about prefix
2016-04-04 08:28:41 -07:00
1543e7bd95 etcdctl: READMEv3 doc about prefix 2016-04-04 07:00:49 -07:00
fab3c8e705 etcdserver: serializable transactions
Support case where txn doesn't have to go through quorum.
2016-04-04 04:21:42 -07:00
46e877b8bb Merge pull request #4955 from mitake/e2e-test
e2e: import fmt in etcdctlv3_test.go
2016-04-04 01:37:21 -07:00
4ff81678ac e2e: import fmt in etcdctlv3_test.go 2016-04-04 17:00:33 +09:00
b6ac21374e Merge pull request #4952 from ajityagaty/snap_db_file_fix
snap: Do not complain about db file.
2016-04-03 17:54:03 -07:00
c12f263577 snap: Do not complain about db file.
Currently the snapshotter throws a warning if a file without the
.snap suffix is found. Fix it to allow known files to exist in
the snap folder.
2016-04-03 17:28:04 -07:00
e8a4ed01e2 Merge pull request #4949 from gyuho/delete
*: add del by prefix with e2e tests
2016-04-03 12:09:16 -07:00
3abd137dc5 Merge pull request #4945 from heyitsanthony/fix-exit-status
e2e, pkg/expect: distinguish between Stop and Close
2016-04-03 12:02:59 -07:00
dc420d660e e2e, pkg/expect: distinguish between Stop and Close
Fixes #4928
2016-04-03 11:45:02 -07:00
9afae9e2c1 *: add del by prefix with e2e tests 2016-04-03 11:41:49 -07:00
bb69dd324e Merge pull request #4939 from gyuho/e2e_txn_version
e2e: etcdctlv3 version, txn basic tests
2016-04-03 11:09:57 -07:00
73b0d398e4 Merge pull request #4946 from xiang90/b
vendor: update boltdb to 1.2.0
2016-04-03 10:59:51 -07:00
f4eaa3f8fb pkg/expect: replace SendLine with Send method 2016-04-03 10:57:35 -07:00
c280871714 e2e: etcdctlv3 version, txn basic tests 2016-04-03 10:57:31 -07:00
37c1edc952 vendor: update boltdb to 1.2.0 2016-04-03 10:47:07 -07:00
19136afc2b Merge pull request #4798 from mqliang/memberStatus
etcdctlv3: initial implementation of 'etcdctl member status' command
2016-04-03 08:48:23 -07:00
d80af00785 etcdctlv3: implement the 'etcdctl status' command 2016-04-03 13:55:58 +08:00
f3ca17ea03 etcdctlv3: implement the client side functionality 2016-04-03 13:46:34 +08:00
1d5d2494ed etcdctlv3: implement status rpc in server side 2016-04-03 13:46:01 +08:00
bbca61252f etcdctlv3: update auto-generated files 2016-04-03 13:45:17 +08:00
3c62bfb7a3 etcdctlv3: add status rpc in protobuf file 2016-04-03 13:44:45 +08:00
6770b9c67a Merge pull request #4944 from gyuho/delete_num
etcdctl: print number of deleted keys
2016-04-02 21:13:46 -07:00
e8877ab180 etcdctl: print number of deleted keys 2016-04-02 20:54:37 -07:00
584d90cd5d Merge pull request #4912 from gyuho/defrag
functional-tester: defrag every 500 round
2016-04-02 18:58:41 -07:00
b866337f25 functional-tester: defrag every 500 round
Fix https://github.com/coreos/etcd/issues/4665.
2016-04-02 18:51:26 -07:00
d2ce6836af Merge pull request #4942 from xiang90/def
backend: reset count in defrag
2016-04-02 18:43:03 -07:00
c09f23c46d *: clean up bool comparison 2016-04-02 18:27:54 -07:00
2b54b73b90 backend: reset count in defrag 2016-04-02 17:25:05 -07:00
b0cc0e443c *: clean up if, bool comparison 2016-04-02 12:55:11 -07:00
dc0061e4db e2e: add Get tests 2016-04-01 22:45:27 -07:00
ff01a4de65 Merge pull request #4936 from heyitsanthony/compact-barrier-restore
etcdserver, storage: don't ack physical compaction on error or snap restore
2016-04-01 20:18:12 -07:00
6f707b857a etcdserver, storage: don't ack physical compaction on error or snap restore
Snapshot recovery will reset the FIFO; reschedule the physical acknowledgment
instead of acknowledging on scheduler teardown.
2016-04-01 16:32:05 -07:00
eea56d037e etcdserver: fix govet error 2016-04-01 16:01:47 -07:00
3083b6d11e Merge pull request #4933 from xiang90/m
MAINTAINERS: update maintainers list
2016-04-01 15:34:57 -07:00
623c7b4df4 Merge pull request #4930 from heyitsanthony/fix-wal-corrupt
wal: fix tail corruption
2016-04-01 15:23:52 -07:00
c0e614b0bd MAINTAINERS: update maintainers list 2016-04-01 15:12:08 -07:00
bfe3a3d08e wal: fix tail corruption
On ReadAll, WAL seeks to the end of the last record in the tail. If the tail did not
end with preallocated space, the decoder would report 0 as the last offset and begin
writing at offset 0 of the tail.

Fixes #4903
2016-04-01 15:05:52 -07:00
e1b561cb7c Merge pull request #4929 from xiang90/rand
raft: lower split vote rate
2016-04-01 12:35:59 -07:00
5d431b4782 raft: lower split vote rate 2016-04-01 12:11:03 -07:00
bf6d905a5a Merge pull request #4923 from xiang90/conf
clientv3: support read conf from file
2016-04-01 10:09:51 -07:00
f05f7b475e vendor: add yaml dependencies 2016-04-01 09:36:11 -07:00
802de5f9f8 clientv3: support read conf from file 2016-04-01 09:36:11 -07:00
307cb5167c Merge pull request #4925 from heyitsanthony/wal-dump-lock
etcd-dump-logs: don't try to acquire wal file locks
2016-03-31 22:24:54 -07:00
7fffd6ffd2 etcd-dump-logs: don't try to acquire wal file locks
can now dump logs from a running etcd instance
2016-03-31 21:51:20 -07:00
c43910f835 Merge pull request #4910 from gyuho/compact_test
etcd-tester: no error for compact double-send
2016-03-31 21:43:26 -07:00
bdaba136a9 Merge pull request #4915 from heyitsanthony/hash-barrier
etcdserver: force backend commit before acking physical compaction
2016-03-31 21:36:57 -07:00
f9b90e13ac etcd-tester: no error for compact double-send
When a compactKV request is halted before the final acknowledgement,
it used to just continue on to the next endpoint. But there could be
a case where compactKV is requested twice, and the first request is already
replicated and applied by the time the second request is to be
applied (returning a compact revision error). This skips that case
by parsing the error message.
2016-03-31 21:29:02 -07:00
81de5648d9 etcdserver: force backend commit before acking physical compaction 2016-03-31 21:25:40 -07:00
2f785015a5 Merge pull request #4922 from gyuho/ctl_test
e2e: basic v3 watch test
2016-03-31 18:18:29 -07:00
b98f67095e e2e: add basic v3 watch test 2016-03-31 18:04:14 -07:00
d898c68f2c pkg/expect: add SendLine for interactive mode 2016-03-31 15:34:30 -07:00
1d698f093f Merge pull request #4921 from xiang90/tls
*: move basic tls util funcs to tlsutil pkg
2016-03-31 09:59:25 -07:00
eb3919e8cf *: move basic tls util funcs to tlsutil pkg 2016-03-31 09:45:45 -07:00
de801b500b Merge pull request #4920 from mitake/auth-user-password
*: support changing password in v3 auth
2016-03-30 23:45:50 -07:00
73166b41e9 *: support changing password in v3 auth
This commit adds a functionality for updating password of existing
users.
2016-03-31 15:28:15 +09:00
f328c75ba7 Merge pull request #4919 from gyuho/expects
*: ctl v3 tests with multi expects
2016-03-30 22:23:21 -07:00
a6c6bbd81c e2e: ctl tests with multi expects 2016-03-30 22:09:23 -07:00
324afd7fde Merge pull request #4918 from mitake/auth-user-messages
etcdctl: print messages for successful auth operations
2016-03-30 22:03:14 -07:00
2ad9b5692f etcdctl: print messages for successful auth operations
This commit lets etcdctl v3 follow the manner of etcdctl v2.
2016-03-31 13:56:01 +09:00
59bb65182a Merge pull request #4917 from mitake/auth-user-delete
*: support deleting user in v3 auth
2016-03-30 21:36:17 -07:00
d8888ded12 *: support deleting user in v3 auth
This commit adds a functionality of user deletion. It can be invoked
with the new user delete command.

Example usage:
$ ETCDCTL_API=3 etcdctl user delete usr1
2016-03-31 13:18:51 +09:00
93c3f920ca Merge pull request #4909 from heyitsanthony/pkg-expect
e2e: replace gexpect with simpler expect
2016-03-30 15:36:41 -07:00
eb3351533a godep: remove gexpect 2016-03-30 15:14:24 -07:00
5022dce31a e2e: use pkg/expect 2016-03-30 15:14:24 -07:00
5707f6b997 pkg/expect: add expect package 2016-03-30 15:14:24 -07:00
b539d3a411 test: check formatting for all relevant packages in pkg/ 2016-03-30 15:14:24 -07:00
6cf198d1b1 Merge pull request #4911 from heyitsanthony/physical-already
etcdserver, storage: wait for physical compaction if already compacted
2016-03-30 14:27:21 -07:00
7b37bd332c etcdserver, storage: wait for physical compaction if already compacted 2016-03-30 13:59:52 -07:00
7ce5c2b9ff Merge pull request #4902 from heyitsanthony/alarm-ctl
etcdctl: alarm command
2016-03-30 13:55:29 -07:00
14f146b9f7 Merge pull request #4908 from xiang90/c
*: simplify consistent index handling
2016-03-30 13:53:21 -07:00
eddc741b5e *: simplify consistent index handling 2016-03-30 13:38:28 -07:00
2aca3252e8 etcdctl: alarm command 2016-03-30 13:33:52 -07:00
c91b2d098d clientv3: AlarmList and AlarmDisarm 2016-03-30 13:33:52 -07:00
dd5b73cfee alarms: support Get of all alarms 2016-03-30 13:33:52 -07:00
cd02cef5e9 etcdserver: only warn on new and disarmed alarms
listing alarms was generating warning output
2016-03-30 13:33:52 -07:00
0f64e01f6b Merge pull request #4864 from cdancy/patch-1
Update libraries-and-tools.md
2016-03-30 13:02:09 -07:00
4e2a4b17b5 Documentation: add etcd-rest to libraries-and-tools.md
Add link to the etcd-rest client under the 'Java libraries' sub-section.

Fixes #4906
2016-03-30 15:56:20 -04:00
a5172974da Merge pull request #4863 from heyitsanthony/ft-check-compact
etcd-tester: check compaction revision
2016-03-30 10:08:05 -07:00
1eb375d296 Merge pull request #4880 from gyuho/drain
*: drain http.Response.Body before closing
2016-03-30 10:02:52 -07:00
1bee31a3bb Merge pull request #4905 from gyuho/vendor_doc
*: document client package vendoring guide
2016-03-30 10:02:32 -07:00
4c65f3fe7a etcd-tester: check compaction revision
Faster than waiting 30 seconds between rounds.
2016-03-30 09:45:30 -07:00
4b35cb9462 etcdserver, storage: optionally wait for Compaction completion in RPC 2016-03-30 09:45:30 -07:00
a42d1dc1fe *: drain http.Response.Body before closing 2016-03-30 09:35:47 -07:00
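The idiom named in the commit above can be sketched as follows. `drainAndClose` and `demo` are hypothetical helpers written for this example, not code from the etcd tree, and the sketch assumes Go 1.16 or later for `io.Discard` and `io.NopCloser` (older code used the `ioutil` equivalents). Reading the body to completion before closing it lets the default HTTP transport reuse a keep-alive connection.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// drainAndClose discards any unread bytes from body and then closes it.
// It returns the number of bytes discarded and the first error seen.
func drainAndClose(body io.ReadCloser) (int64, error) {
	n, err := io.Copy(io.Discard, body)
	if cerr := body.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// demo builds an in-memory response so the example needs no network access.
func demo() (int64, error) {
	resp := &http.Response{
		Body: io.NopCloser(strings.NewReader("hello etcd")),
	}
	return drainAndClose(resp.Body)
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```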
b8d3b15206 *: document client package vendoring guide 2016-03-30 09:34:41 -07:00
12d8d33a1c Merge pull request #4879 from mitake/auth-user-error
etcdserver: return internal error in a case of not auth specific errors
2016-03-30 08:04:41 -07:00
8ee8d755bb etcdserver: return internal error in a case of not auth specific errors 2016-03-30 23:44:22 +09:00
443c677357 etcdserver: extract togRPCError() to a separated file
It is used from multiple files in v3rpc package.
2016-03-30 22:53:20 +09:00
96ee00a322 etcdserverpb: make alarm memberId uint64
To be consistent with Cluster API
2016-03-29 20:15:39 -07:00
2deed74494 Merge pull request #4901 from heyitsanthony/config-dbsize
etcdserver: configurable backend size quota
2016-03-29 18:55:12 -07:00
9b2c963179 etcdserver: configurable backend size quota
Configurable with the flag --experimental-quota-backend-bytes and
through ServerConfig.QuotaBackendBytes.

Fixes #4894
2016-03-29 18:39:25 -07:00
b0956d5dbf Merge pull request #4891 from mitake/auth-prefix
*: add Auth prefix to auth related requests and responses
2016-03-29 17:24:12 -07:00
d00811428d Merge pull request #4898 from gyuho/context_err
client: return context error
2016-03-29 17:22:40 -07:00
8d0d10cce5 client: return original ctx error
Fix https://github.com/coreos/etcd/issues/3209.
2016-03-29 16:57:48 -07:00
00f222ecad Merge pull request #4892 from gyuho/help
etcdmain: add missing flag doc
2016-03-29 10:30:33 -07:00
870b5c5ea7 Merge pull request #4219 from endocode/kayrus/username_environment
Handle ETCDCTL_USERNAME environment
2016-03-29 10:24:43 -07:00
720502b25f etcdctl: Handle ETCDCTL_USERNAME environment 2016-03-29 19:06:31 +02:00
92f4aced25 etcdmain: add peer-auto-tls doc 2016-03-29 09:40:57 -07:00
bb8619f4f7 Merge pull request #4895 from xiang90/client_doc
client: doc that client is thread-safe
2016-03-29 09:36:01 -07:00
9d49d35090 client: doc that client is thread-safe 2016-03-29 09:28:53 -07:00
d533c14881 Merge pull request #4876 from heyitsanthony/integration-races
*: fix races from clientv3/integration tests
2016-03-29 09:10:53 -07:00
75babb82b6 Merge pull request #4888 from xiang90/fix_raft
rafthttp: do not block on proposal
2016-03-29 07:37:18 -07:00
161bc5e19c clientv3: fix race when setting grpc Logger
grpc only permits SetLogger on init()
2016-03-28 23:30:03 -07:00
987568c65c *: add Auth prefix to auth related requests and responses 2016-03-29 14:32:19 +09:00
1637b37132 Merge pull request #4890 from heyitsanthony/fix-4889
clientv3/integration: get quorum before watching in TestKVCompact
2016-03-28 22:30:58 -07:00
096abb3f37 clientv3/integration: get quorum before watching in TestKVCompact
Fixes #4889
2016-03-28 22:18:10 -07:00
660eef8a95 Merge pull request #4872 from ajityagaty/cli_opts_aliases
etcdctl: Add aliases for command flags.
2016-03-28 22:04:00 -07:00
0c137b344b rafthttp: do not block on proposal 2016-03-28 21:40:12 -07:00
2e3856740d etcdctl: Add aliases for command flags.
Add aliases to the flags that are supplied to the sub commands.
2016-03-28 20:57:34 -07:00
c53380cd2a Merge pull request #4886 from heyitsanthony/move-hash
v3rpc: move Hash RPC to Maintenance service
2016-03-28 19:35:03 -07:00
3fbacf4be2 v3rpc: move Hash RPC to Maintenance service 2016-03-28 17:15:58 -07:00
495bef8b4c Merge pull request #4885 from xiang90/log_doc
doc/dev: add logging doc
2016-03-28 17:00:41 -07:00
4bdfc0a46d clientv3: fix race on writing watch channel over return channel
Found in TestElectionFailover
2016-03-28 16:08:18 -07:00
5ee85bea7c v3rpc: fix race on watch progress map
Found in TestElectionWait
2016-03-28 16:08:18 -07:00
813afc3d11 rafthttp: fix race between AddRemote and Send 2016-03-28 16:08:18 -07:00
91dc6b29a6 clientv3/integration: fix race when setting progress report interval 2016-03-28 16:08:18 -07:00
2c83362e63 clientv3: fix race in KV reconnection logic 2016-03-28 16:08:18 -07:00
e129223dbe clientv3: fix race in watcher resume 2016-03-28 16:08:18 -07:00
47db0a2f2e test: add race detection to clientv3 integration tests 2016-03-28 16:08:18 -07:00
ffc7488af2 doc/dev: add logging doc 2016-03-28 15:34:51 -07:00
6e3a0948e4 Merge pull request #4868 from heyitsanthony/api-quota
etcdserver: storage quotas
2016-03-28 15:15:57 -07:00
a403a94d7b etcdserver: cap new keys on space alarm 2016-03-28 14:56:26 -07:00
9e7f47c490 etcdserver: Alarm RPC
Alarms are events that nodes can use to relay health information to
the rest of the cluster. A node may Activate an alarm and that alarm
will stay set until Deactivated.
2016-03-28 14:56:26 -07:00
ae077a2183 backend: add UnsafeForEach to BatchTx
Useful for efficiently iterating over an entire bucket.
2016-03-28 14:56:26 -07:00
9c8253c543 etcdserver, v3rpc: space quotas 2016-03-28 14:56:26 -07:00
fc346041e5 Merge pull request #4883 from heyitsanthony/fix-4874
integration: don't call rand.Intn in TestSTMConflict on 0
2016-03-28 13:36:19 -07:00
94e77cfa5d etcdserver: move v3 raft apply functions to interface 2016-03-28 13:16:21 -07:00
384c3ec907 integration: don't call rand.Intn in TestSTMConflict on 0
Fixes #4874
2016-03-28 13:06:07 -07:00
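For context on the fix above: `rand.Intn(n)` panics when `n <= 0`, so a caller has to guard the zero case itself. The sketch below shows one such guard. `pickIndex` and `pickWithSeed` are hypothetical names, and returning -1 is an assumption made for the example rather than what the test does.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickIndex returns a pseudo-random index in [0, n), or -1 when n is not
// positive, instead of letting r.Intn panic on a non-positive argument.
func pickIndex(r *rand.Rand, n int) int {
	if n <= 0 {
		return -1
	}
	return r.Intn(n)
}

// pickWithSeed is a convenience wrapper that builds a seeded generator.
func pickWithSeed(seed int64, n int) int {
	return pickIndex(rand.New(rand.NewSource(seed)), n)
}

func main() {
	fmt.Println(pickWithSeed(1, 0)) // guarded case: no panic
	fmt.Println(pickWithSeed(1, 5)) // some index in [0, 5)
}
```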
2b83d9c2e5 Merge pull request #4882 from xiang90/ctl_combine
*: combine etcdctl and etcdctlv3
2016-03-28 11:42:25 -07:00
87d9f06a45 *: combine etcdctl and etcdctlv3 2016-03-28 11:28:05 -07:00
83ada7232a Merge pull request #4871 from gyuho/windows_file_lock_20160326
pkg/fileutil: lock file on Windows
2016-03-27 12:38:38 -07:00
fa98d8d337 Merge pull request #4845 from mitake/auth-user
*: support adding user in v3 auth
2016-03-27 07:51:10 -07:00
8874545a1e *: support adding user in v3 auth
This commit adds a new subcommand "user add" to etcdctlv3. With the
command users can create a user for the authentication.

Example of usage:
$ etcdctlv3 user add user1
Password of user1:
Type password of user1 again for confirmation:
2016-03-27 18:11:42 +09:00
3f1a1c3192 pkg/fileutil: lock file on Windows 2016-03-27 00:35:44 -07:00
68b38e7ade Merge pull request #4875 from gyuho/clientv3_disable_grpclog
clientv3: disable client side grpc log
2016-03-26 22:57:37 -07:00
29fccb3221 clientv3: configurable grpc logger 2016-03-26 22:38:53 -07:00
b8fc61bcec Merge pull request #4869 from ajityagaty/insecure_skip_tls_verify
etcdctlv3: Add insecure-skip-tls-verify flag.
2016-03-26 12:12:55 -07:00
9c3242c6df Merge pull request #4862 from mitake/procfiles
Procfile, V3DemoProcfile: add default endpoint of v3 to Procfile remo…
2016-03-26 08:21:01 -07:00
7418c1af24 V3DemoProcfile: remove the obsolete flag
The flag --experimental-v3demo is already removed so V3DemoProcfile
cannot be used. This commit removes it.
2016-03-26 08:15:58 -07:00
4e39db4158 etcdctlv3: Add insecure-skip-tls-verify flag.
The user can specify insecure-skip-tls-verify flag to skip the
server certificate verification step.
2016-03-25 19:28:41 -07:00
877030ea9d pkg/fileutil: fix linux file locks over NFS
Fixes #4853
2016-03-25 16:28:29 -07:00
36db6cd982 Merge pull request #4867 from xiang90/ctl_env
etcdctlv3: accept env for global configuration flags
2016-03-25 15:32:06 -07:00
a120ca16c0 etcdctlv3: accept env for global configuration flags 2016-03-25 14:23:32 -07:00
92a73e727b Merge pull request #4857 from xiang90/warn_tls
etcdmain: warn on contradictory TLS settings
2016-03-25 09:38:11 -07:00
5449edc025 Merge pull request #4817 from mqliang/time-out
etcdctlv3: add timeout support
2016-03-25 07:30:48 -07:00
f165f8b44e etcdctlv3: add timeout support
add timeout setting support for etcdctlv3
2016-03-25 16:24:49 +08:00
20a267dc6a Merge pull request #4860 from heyitsanthony/tester-sched
tools/functional-tester: --schedule-cases flag
2016-03-24 22:05:05 -07:00
4a17097d00 tools/functional-tester: --schedule-cases flag
Command line argument for specifying a schedule of test cases per round.
The default is to run each test case once per round.
2016-03-24 19:43:23 -07:00
05dc2dac70 Merge pull request #4859 from xiang90/ctl_secure
etcdctlv3: support secure connection without key/cert
2016-03-24 16:47:23 -07:00
0865688c27 etcdctlv3: support secure connection without key/cert 2016-03-24 16:29:33 -07:00
6285455f85 etcdmain: warn on contradictory TLS settings 2016-03-24 10:21:47 -07:00
7d2545c72e Merge pull request #4856 from xiang90/fail_key_cert
etcdmain: etcd should fail to start when https is enabled but tls con…
2016-03-24 10:10:43 -07:00
5ee3729738 etcdmain: etcd should fail to start when https is enabled but tls config is not given 2016-03-24 09:57:25 -07:00
d16bfa5e54 Merge pull request #4854 from mitake/genproto
scripts: update genproto.sh for vendor
2016-03-24 08:53:55 -07:00
d0d3b32210 Merge pull request #4850 from xiang90/rm_demo
*: enable v3 by default
2016-03-23 23:48:29 -07:00
0436223793 scripts: update genproto.sh for vendor
Current genproto.sh assumes Godep so its output files have obsolete
parts.
2016-03-24 14:30:46 +09:00
70a9391378 *: enable v3 by default 2016-03-23 17:01:36 -07:00
2fec88ebfc Merge pull request #4851 from gyuho/fix_functional_tester
functional-tester: add GRPCURLs for cluster config
2016-03-23 16:47:33 -07:00
9fb60deb7c functional-tester: add GRPCURLs for cluster config
GRPC and v2 client address share the same host(port)
but GRPC does not work with schema specified. This fixes
it by adding another member for GRPC without schema, as
we had before.
2016-03-23 16:28:05 -07:00
333ac5789a Merge pull request #4831 from xiang90/tlx
*: http and https on the same port
2016-03-23 15:59:58 -07:00
e4ac8edb2f Merge pull request #4849 from gyuho/functional_test
functional-tester: set gRPC endpoint for stresser
2016-03-23 15:32:14 -07:00
4d2227e5ab e2e: combine cfg.isClientTLS and cfg.isClientBoth 2016-03-23 15:30:58 -07:00
012143e703 functional-tester: set gRPC endpoint for stresser 2016-03-23 15:23:19 -07:00
9d55420a00 e2e: add an e2e test for TLS/non-TLS on the same port 2016-03-23 13:43:47 -07:00
ca5dff6682 Merge pull request #4848 from heyitsanthony/rename-compare-created
clientv3: rename comparison from CreatedRevision to CreateRevision
2016-03-23 10:32:49 -07:00
900a61b023 *: http and https on the same port 2016-03-23 10:28:38 -07:00
489779d905 clientv3: rename comparison from CreatedRevision to CreateRevision
To match protobuf naming
2016-03-23 09:50:46 -07:00
88e738fcb6 Merge pull request #4844 from ajityagaty/polish_naming_conventions
clientv3: Renaming SortByCreatedRev to maintain consistency.
2016-03-23 09:27:34 -07:00
0181725e55 Merge pull request #4846 from jonboulle/master
docs: "master election" -> "leader election"
2016-03-23 09:27:08 -07:00
6081a29c13 Merge pull request #4843 from heyitsanthony/go-vendor
*: migrate Godeps to vendor/
2016-03-23 06:01:26 -07:00
5f72a28157 docs: "master election" -> "leader election" 2016-03-23 12:23:01 +01:00
86a477c2f6 doc: update client README to use vendor/ 2016-03-22 18:02:10 -07:00
2f22ac662c travis: use GO15VENDOREXPERIMENT 2016-03-22 18:02:10 -07:00
2bb417bfff clientv3: Renaming SortByCreatedRev to maintain consistency.
Renamed SortByCreatedRev to SortByCreateRevision to be consistent
with the naming used for SortByModRevision.
2016-03-22 17:56:24 -07:00
45cf31650c test: ignore vendor/ directory on license check 2016-03-22 17:33:46 -07:00
fb3510b276 build: enable vendor experiment for go1.5 2016-03-22 17:33:46 -07:00
bd832e5b0a *: migrate Godeps to vendor/ 2016-03-22 17:10:28 -07:00
e9b9b228e7 Merge pull request #4842 from gyuho/serial
etcdctlv3: get command with consistency flag
2016-03-22 17:07:29 -07:00
a10662210a e2e: etcdctlv3 with serializable read 2016-03-22 16:52:33 -07:00
5686340d26 etcdctlv3: get command with consistency flag
As we do in benchmark tool.
2016-03-22 16:52:28 -07:00
096a89117a Merge pull request #4840 from ajityagaty/polish_naming_conventions
clientv3: Fix inconsistent naming convention in v3 client.
2016-03-22 15:27:12 -07:00
70e709c5f4 Merge pull request #4812 from xiang90/ping
etcdctlv3: implement endpoint-health command
2016-03-22 15:22:53 -07:00
43221f0b7a etcdctlv3: implement endpoint-health command
endpoint-health checks endpoint.

It can generate 3 outputs:

1. cannot connect to the member through endpoint

2. connected to the member, but member failed to commit any proposals

3. connected to the member, and member committed a proposal
2016-03-22 15:09:50 -07:00
606889a002 clientv3: Fix inconsistent naming convention in v3 client.
In order to have a consistent naming for variable/function names
pertaining to ModifiedRevision, all occurrences have been renamed
to ModRevision.
2016-03-22 14:58:11 -07:00
499b893704 Merge pull request #4838 from gyuho/dial
etcdctlv3: add dial timeout flag
2016-03-22 13:26:19 -07:00
8396da3e83 etcdctlv3: add dial timeout flag
Fix https://github.com/coreos/etcd/issues/4836.
2016-03-22 13:15:26 -07:00
2b44df5440 Merge pull request #4833 from gyuho/govet
etcdmain: fix shadowed variables
2016-03-22 10:01:31 -07:00
565eb61cd3 Merge pull request #4834 from gyuho/godep_pb
Godeps: semantic versioning cheggaaa/pb
2016-03-22 09:09:58 -07:00
a054ae320b Merge pull request #4830 from mischief/proxy-env
pkg/transport: use ProxyFromEnvironment when constructing a transport
2016-03-22 01:14:03 -07:00
afb1bc242b Merge pull request #4822 from mitake/auth-backend
auth, etcdserver: add a method for updating backend during apply snap…
2016-03-21 23:34:46 -07:00
4e39f690f2 auth, etcdserver: add a method for recoverying from backend during apply snapshot
This commit adds a new method Recovery() to auth.AuthStore for
recoverying auth state from backend during apply snapshot. It follows
a manner of the lessor.
2016-03-22 15:17:40 +09:00
bb9a7f5a7c Godeps: semantic versioning cheggaaa/pb
Fix https://github.com/coreos/etcd/issues/4832.
2016-03-21 22:06:16 -07:00
2364d71ea2 etcdmain: fix shadowed variables 2016-03-21 21:55:06 -07:00
d80a546ed4 pkg/transport: use ProxyFromEnvironment when constructing a transport
this allows use of HTTP_PROXY/HTTPS_PROXY for etcdctl.
2016-03-21 21:02:42 -07:00
e73ac5bdd7 Merge pull request #4826 from philips/buildv3-by-default
build: build etcdctlv3 by default
2016-03-21 17:04:11 -07:00
0ac4eba60e Merge pull request #4829 from gyuho/server_closure
etcdmain: fix blocking m.Server closure
2016-03-21 16:50:13 -07:00
cdb7cfd74b etcdmain: fix blocking m.Server closure 2016-03-21 16:39:20 -07:00
7e3fc182d5 Merge pull request #4828 from xiang90/cmux
*: gRPC + HTTP on the same port
2016-03-21 15:37:44 -07:00
7c3432a79f Godep: add cmux dependency 2016-03-21 14:33:37 -07:00
d3809abe42 *: gRPC + HTTP on the same port
We use cmux to do this since we want to do http+https on the same
port in the near future too.
2016-03-21 14:29:25 -07:00
adebd91114 Merge pull request #4785 from heyitsanthony/gce-fallocate
wal: extend WAL file to segment size on fallocate
2016-03-21 13:08:53 -07:00
3fed78ae7b Merge pull request #4484 from heyitsanthony/auto-tls
automatic peer TLS
2016-03-21 12:59:29 -07:00
0df732c052 wal: pre-create segment files
Pipeline file creation and allocation so it overlaps writes to the log.

Fixes #4773
2016-03-21 11:56:53 -07:00
24b806d2ee wal: preallocate WAL files with initial size equal to segment size
Avoids having to update file size metadata during fdatasync on common path.

Fixes #4755
2016-03-21 11:56:53 -07:00
8f653572ac Merge pull request #4827 from xiang90/fix_ctl
etcdctlv3: use godep dir for tablewriter dependency
2016-03-21 11:50:40 -07:00
5a0bb40a41 etcdctlv3: use godep dir for tablewriter dependency 2016-03-21 11:47:55 -07:00
d1ee12566b e2e: test auto tls 2016-03-21 11:44:14 -07:00
7d2aee8eca build: build etcdctlv3 by default
Any reason not to? It makes demoing etcd easier with the V3 procfile.
2016-03-21 11:42:01 -07:00
c8c0c728a0 Merge pull request #4825 from gyuho/key_link
Documentation: add public key link to release doc
2016-03-21 11:39:45 -07:00
e9b2bd751d etcdmain: add --peer-auto-tls option
Lets the peer generate its own (unsigned) certs.
2016-03-21 11:38:23 -07:00
a69c709839 pkg/transport: generate certs 2016-03-21 11:38:23 -07:00
86164374ef Merge pull request #4824 from gyuho/dash
*: replace '-' with '--' in doc
2016-03-21 11:33:22 -07:00
f58d1348a7 Documentation: add public key link to release doc 2016-03-21 11:28:49 -07:00
67c2384bdf *: replace '-' with '--' in doc
Fix https://github.com/coreos/etcd/issues/4595.
2016-03-21 11:12:43 -07:00
aafe717f2f fileutil: support file extending preallocate 2016-03-21 09:42:30 -07:00
7879429a94 Merge pull request #4802 from heyitsanthony/bench-stm
benchmark: STM benchmark
2016-03-20 16:10:54 -07:00
1383da1030 benchmark: STM benchmark 2016-03-20 12:21:29 -07:00
053bc83fe4 Merge pull request #4810 from gyuho/client_serialized_read
clientv3: set Serializable from Op
2016-03-19 14:35:55 -07:00
c740b60db8 Merge pull request #4814 from gyuho/godoc
*: minor updates for godoc
2016-03-19 14:29:48 -07:00
dae7e009b0 *: godoc clean up 2016-03-19 14:19:23 -07:00
0555a6112d Merge pull request #4813 from gyuho/clientv3
clientv3/concurrency: fix godoc
2016-03-18 19:07:54 -07:00
dc9af8e3f3 Merge pull request #4807 from gyuho/go1.4-build
Drop go1.4 support for development
2016-03-18 19:07:44 -07:00
21b33de810 rafthttp: drop go1.4 tests 2016-03-18 18:46:11 -07:00
5bba773199 pkg/testutil: drop go1.4 goroutine leak exception 2016-03-18 18:45:47 -07:00
0a82c06a2c pkg/types: drop go1.4 tests 2016-03-18 18:45:29 -07:00
33e22fa8d7 pkg/httputil: drop go1.4 tests 2016-03-18 18:45:12 -07:00
25e47db416 client: drop go1.4 tests 2016-03-18 18:44:56 -07:00
896cba5cb9 README: go1.5 for Go development 2016-03-18 18:44:40 -07:00
6aa17f0c76 Merge pull request #4775 from heyitsanthony/wal-locks
fileutil, wal: refactor file locking
2016-03-18 18:26:31 -07:00
16270dba4f Merge pull request #4805 from gyuho/drop-go1.4
travis: drop go1.4
2016-03-18 18:24:55 -07:00
4e4f0ab619 clientv3/concurrency: fix godoc 2016-03-18 16:34:58 -07:00
ac9376ea16 *: bump to v2.3.0+git 2016-03-18 16:32:04 -07:00
f38a611b55 clientv3: set Serializable from Op
Fix https://github.com/coreos/etcd/issues/4809.
2016-03-18 15:56:48 -07:00
5badbab8b7 travis: drop go1.4
Fix https://github.com/coreos/etcd/issues/4790.
2016-03-18 13:33:12 -07:00
7397e14c0a fileutil, wal: refactor file locking
File lock interface was more verbose than it needed to be while
simultaneously making it difficult to support systems (e.g., Windows)
that only permit locked writes on a single fd holding the lock.
2016-03-16 15:02:15 -07:00
a44645e13d etcdserver: align 64-bit atomics on 8-byte boundary 2016-03-10 07:24:33 +02:00
7ba352d9ca etcdmain: print usage in stderr when flag.Parse fail
This fits the requirement of stderr.

I still let `etcd --version` and `etcd --help` print out to stdout
because when users ask explicitly for version/help docs, they expect to see
the doc in stdout.

Ref:
http://www.jstorimer.com/blogs/workingwithcode/7766119-when-to-use-stderr-instead-of-stdout
2015-09-30 14:19:39 -07:00
2083 changed files with 185186 additions and 105449 deletions

7
.github/ISSUE_TEMPLATE.md vendored Normal file

@ -0,0 +1,7 @@
# Bug reporting
A good bug report has some very specific qualities, so please read over our short document on [reporting bugs][report_bugs] before submitting a bug report.
To ask a question, go ahead and ignore this.
[report_bugs]: https://github.com/coreos/etcd/blob/master/Documentation/reporting_bugs.md

5
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file

@ -0,0 +1,5 @@
# Contributing guidelines
Please read our [contribution workflow][contributing] before submitting a pull request.
[contributing]: https://github.com/coreos/etcd/blob/master/CONTRIBUTING.md#contribution-flow

3
.gitignore vendored

@ -1,5 +1,6 @@
/coverage
/gopath
/gopath.proto
/go-bindata
/machine*
/bin
@ -10,3 +11,5 @@
/hack/insta-discovery/.env
*.test
tools/functional-tester/docker/bin
hack/tls-setup/certs
.idea

View File

@ -1,4 +1,4 @@
// Copyright 2016 CoreOS, Inc.
// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.

View File

@ -1,25 +1,83 @@
dist: trusty
language: go
go_import_path: github.com/coreos/etcd
sudo: false
go:
- 1.4
- 1.5
- 1.6
- 1.8.3
- tip
notifications:
on_success: never
on_failure: never
env:
matrix:
- TARGET=amd64
- TARGET=darwin-amd64
- TARGET=windows-amd64
- TARGET=arm64
- TARGET=arm
- TARGET=386
- TARGET=ppc64le
matrix:
fast_finish: true
allow_failures:
- go: tip
exclude:
- go: tip
env: TARGET=darwin-amd64
- go: tip
env: TARGET=windows-amd64
- go: tip
env: TARGET=arm
- go: tip
env: TARGET=arm64
- go: tip
env: TARGET=386
- go: tip
env: TARGET=ppc64le
addons:
apt:
sources:
- debian-sid
packages:
- libpcap-dev
- libaspell-dev
- libhunspell-dev
- shellcheck
before_install:
- go get -v github.com/chzchzchz/goword
- go get -v -u github.com/chzchzchz/goword
- go get -v -u github.com/coreos/license-bill-of-materials
- go get -v -u honnef.co/go/tools/cmd/gosimple
- go get -v -u honnef.co/go/tools/cmd/unused
- go get -v -u honnef.co/go/tools/cmd/staticcheck
- ./scripts/install-marker.sh amd64
# disable godep restore override
install:
- pushd cmd/etcd && go get -t -v ./... && popd
script:
- ./test
- >
case "${TARGET}" in
amd64)
GOARCH=amd64 ./test
;;
darwin-amd64)
GO_BUILD_FLAGS="-a -v" GOPATH="" GOOS=darwin GOARCH=amd64 ./build
;;
windows-amd64)
GO_BUILD_FLAGS="-a -v" GOPATH="" GOOS=windows GOARCH=amd64 ./build
;;
386)
GOARCH=386 PASSES="build unit" ./test
;;
*)
# test building out of gopath
GO_BUILD_FLAGS="-a -v" GOPATH="" GOARCH="${TARGET}" ./build
;;
esac

View File

@ -1,6 +1,6 @@
# How to contribute
etcd is Apache 2.0 licensed and accepts contributions via GitHub pull requests. This document outlines some of the conventions on commit message formatting, contact points for developers and other resources to make getting your contribution into etcd easier.
etcd is Apache 2.0 licensed and accepts contributions via GitHub pull requests. This document outlines some of the conventions on commit message formatting, contact points for developers, and other resources to help get contributions into etcd.
# Email and chat
@ -12,26 +12,22 @@ etcd is Apache 2.0 licensed and accepts contributions via GitHub pull requests.
- Fork the repository on GitHub
- Read the README.md for build instructions
## Reporting Bugs and Creating Issues
## Reporting bugs and creating issues
Reporting bugs is one of the best ways to contribute. However, a good bug report
has some very specific qualities, so please read over our short document on
[reporting bugs](https://github.com/coreos/etcd/blob/master/Documentation/reporting_bugs.md)
before you submit your bug report. This document might contain links known
issues, another good reason to take a look there, before reporting your bug.
Reporting bugs is one of the best ways to contribute. However, a good bug report has some very specific qualities, so please read over our short document on [reporting bugs](https://github.com/coreos/etcd/blob/master/Documentation/reporting_bugs.md) before submitting a bug report. This document might contain links to known issues, another good reason to take a look there before reporting a bug.
## Contribution flow
This is a rough outline of what a contributor's workflow looks like:
- Create a topic branch from where you want to base your work. This is usually master.
- Create a topic branch from where to base the contribution. This is usually master.
- Make commits of logical units.
- Make sure your commit messages are in the proper format (see below).
- Push your changes to a topic branch in your fork of the repository.
- Make sure commit messages are in the proper format (see below).
- Push changes in a topic branch to a personal fork of the repository.
- Submit a pull request to coreos/etcd.
- Your PR must receive a LGTM from two maintainers found in the MAINTAINERS file.
- The PR must receive a LGTM from two maintainers found in the MAINTAINERS file.
Thanks for your contributions!
Thanks for contributing!
### Code style
@ -39,7 +35,7 @@ The coding style suggested by the Golang community is used in etcd. See the [sty
Please follow this style to make etcd easy to review, maintain and develop.
### Format of the Commit Message
### Format of the commit message
We follow a rough convention for commit messages that is designed to answer two
questions: what changed and why. The subject line should feature the what and
@ -48,8 +44,7 @@ the body of the commit should describe the why.
```
scripts: add the test-cluster command
this uses tmux to setup a test cluster that you can easily kill and
start for debugging.
this uses tmux to setup a test cluster that can easily be killed and started for debugging.
Fixes #38
```
@ -64,7 +59,4 @@ The format can be described more formally as follows:
<footer>
```
The first line is the subject and should be no longer than 70 characters, the
second line is always blank, and other lines should be wrapped at 80 characters.
This allows the message to be easier to read on GitHub as well as in various
git tools.
The first line is the subject and should be no longer than 70 characters, the second line is always blank, and other lines should be wrapped at 80 characters. This allows the message to be easier to read on GitHub as well as in various git tools.
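The three rules above (subject of at most 70 characters, blank second line, remaining lines wrapped at 80 characters) can be checked mechanically. The following is a minimal sketch, not part of the etcd repository; the function name `check_commit_msg` is a hypothetical choice for illustration, and it reads the commit message on standard input.

```shell
# check_commit_msg is a hypothetical helper, not an etcd script.
# It reads a commit message on stdin, prints one line per violated rule,
# and returns non-zero if any rule is violated.
check_commit_msg() {
    awk '
        NR == 1 && length($0) > 70 { print "subject is longer than 70 characters"; bad = 1 }
        NR == 2 && length($0) > 0  { print "second line is not blank"; bad = 1 }
        NR > 2  && length($0) > 80 { print "line " NR " is longer than 80 characters"; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}
```

One possible use is `git log -1 --format=%B | check_commit_msg` before pushing a branch.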

View File

@ -1,2 +1,6 @@
FROM golang:onbuild
EXPOSE 4001 7001 2379 2380
FROM golang
ADD . /go/src/github.com/coreos/etcd
ADD cmd/vendor /go/src/github.com/coreos/etcd/vendor
RUN go install github.com/coreos/etcd
EXPOSE 2379 2380
ENTRYPOINT ["etcd"]

17
Dockerfile-release Normal file

@ -0,0 +1,17 @@
FROM alpine:latest
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/
RUN mkdir -p /var/etcd/
RUN mkdir -p /var/lib/etcd/
# Alpine Linux doesn't use pam, which means that there is no /etc/nsswitch.conf,
# but Golang relies on /etc/nsswitch.conf to check the order of DNS resolving
# (see https://github.com/golang/go/commit/9dee7771f561cf6aee081c0af6658cc81fac3918)
# To fix this we just create /etc/nsswitch.conf and add the following line:
RUN echo 'hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4' >> /etc/nsswitch.conf
EXPOSE 2379 2380
# Define default command.
CMD ["/usr/local/bin/etcd"]

11
Dockerfile-release.arm64 Normal file

@ -0,0 +1,11 @@
FROM aarch64/ubuntu:16.04
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/
ADD var/etcd /var/etcd
ADD var/lib/etcd /var/lib/etcd
EXPOSE 2379 2380
# Define default command.
CMD ["/usr/local/bin/etcd"]

View File

@ -0,0 +1,11 @@
FROM ppc64le/ubuntu:16.04
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/
ADD var/etcd /var/etcd
ADD var/lib/etcd /var/lib/etcd
EXPOSE 2379 2380
# Define default command.
CMD ["/usr/local/bin/etcd"]

1
Documentation/README.md Symbolic link

@ -0,0 +1 @@
docs.md

View File

@ -14,7 +14,7 @@ GCE n1-highcpu-2 machine type
## Testing
Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
Bootstrap another machine and use the [hey HTTP benchmark tool][hey] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
## Performance
@ -48,5 +48,5 @@ Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send r
| 256 | 64 | all servers | 1033 | 121.5 |
| 256 | 256 | all servers | 3061 | 119.3 |
[boom]: https://github.com/rakyll/boom
[hack-benchmark]: /hack/benchmark/
[hey]: https://github.com/rakyll/hey
[hack-benchmark]: https://github.com/coreos/etcd/tree/master/hack/benchmark

View File

@ -24,7 +24,7 @@ Go OS/Arch: linux/amd64
## Testing
Bootstrap another machine, outside of the etcd cluster, and run the [`boom` HTTP benchmark tool](https://github.com/rakyll/boom) with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions](../../hack/benchmark/) for the patch and the steps to reproduce our procedures.
Bootstrap another machine, outside of the etcd cluster, and run the [`hey` HTTP benchmark tool](https://github.com/rakyll/hey) with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions](../../hack/benchmark/) for the patch and the steps to reproduce our procedures.
The performance is calculated through results of 100 benchmark rounds.
@ -66,4 +66,4 @@ The performance is calculated through results of 100 benchmark rounds.
- Write QPS to cluster leaders seems to be increased by a small margin. This is because the main loop and entry apply loops were decoupled in the etcd raft logic, eliminating several blocks between them.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.

View File

@ -24,7 +24,7 @@ Also, we use 3 etcd 2.1.0 alpha-stage members to form cluster to get base perfor
## Testing
Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
Bootstrap another machine and use the [hey HTTP benchmark tool][hey] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
## Performance
@ -66,7 +66,7 @@ Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send r
- write QPS to all servers is increased by 30~80% because follower could receive latest commit index earlier and commit proposals faster.
[boom]: https://github.com/rakyll/boom
[hey]: https://github.com/rakyll/hey
[c7146bd5]: https://github.com/coreos/etcd/commits/c7146bd5f2c73716091262edc638401bb8229144
[etcd-2.1-benchmark]: etcd-2-1-0-alpha-benchmarks.md
[hack-benchmark]: /hack/benchmark/
[hack-benchmark]: ../../hack/benchmark/

View File

@ -39,7 +39,7 @@ The length of key name is always 64 bytes, which is a reasonable length of avera
## Data Size Threshold
- When etcd reaches data size threshold, it may trigger leader election easily and drop part of proposals.
- At most cases, etcd cluster should work smoothly if it doesn't hit the threshold. If it doesn't work well due to insufficient resources, you need to decrease its data size.
- For most cases, the etcd cluster should work smoothly if it doesn't hit the threshold. If it doesn't work well due to insufficient resources, decrease its data size.
| value bytes | key number limitation | suggested data size threshold(MB) | consumed RSS(MB) |
|-------------|-----------------------|-----------------------------------|------------------|

View File

@ -39,4 +39,4 @@ The performance is nearly the same as the one with empty server handler.
The performance with empty server handler is not affected by one put. So the
performance downgrade should be caused by storage package.
[etcd-v3-benchmark]: /tools/benchmark/
[etcd-v3-benchmark]: ../../tools/benchmark/

View File

@ -72,6 +72,6 @@ With the benchmark result, we can calculate roughly that `c1 = 17kb`, `c2 = 18kb
| 5k | 50 | 10 | 2.5M | 5710MB |
| 1k | 50 | 100 | 5M | 2380MB |
| 2k | 50 | 100 | 10M | 4672MB |
| 5k | 50 | 100 | 50M | *OOM* |
| 5k | 50 | 100 | 25M | *OOM* |
[rss]: https://en.wikipedia.org/wiki/Resident_set_size

View File

@ -1,4 +1,4 @@
# Branch Management
# Branch management
## Guide
@ -13,7 +13,7 @@ The etcd team has adopted a *rolling release model* and supports one stable vers
The `master` branch is our development branch. All new features land here first.
If you want to try new features, pull `master` and play with it. Note that `master` may not be stable because new features may introduce bugs.
To try new and experimental features, pull `master` and play with it. Note that `master` may not be stable because new features may introduce bugs.
Before the release of the next stable version, feature PRs will be frozen. We will focus on the testing, bug-fix and documentation for one to two weeks.

454
Documentation/demo.md Normal file

@ -0,0 +1,454 @@
# Demo
This series of examples shows the basic procedures for working with an etcd cluster.
## Set up a cluster
<img src="https://storage.googleapis.com/etcd/demo/01_etcd_clustering_2016051001.gif" alt="01_etcd_clustering_2016050601"/>
On each etcd node, specify the cluster members:
```
TOKEN=token-01
CLUSTER_STATE=new
NAME_1=machine-1
NAME_2=machine-2
NAME_3=machine-3
HOST_1=10.240.0.17
HOST_2=10.240.0.18
HOST_3=10.240.0.19
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
```
Run this on each machine:
```
# For machine 1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
# For machine 2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
# For machine 3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
```
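The three invocations above differ only in `THIS_NAME` and `THIS_IP`. As a rough sketch, the flags can be generated from the member number instead of being repeated. The helper name `etcd_args` is hypothetical and not part of etcd; the variable values are copied from the example above, and the sketch only prints the flags rather than starting etcd.

```shell
NAME_1=machine-1; NAME_2=machine-2; NAME_3=machine-3
HOST_1=10.240.0.17; HOST_2=10.240.0.18; HOST_3=10.240.0.19
TOKEN=token-01
CLUSTER_STATE=new
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380

# etcd_args is a hypothetical helper, not part of etcd. It prints the etcd
# flags for member N (1, 2 or 3), one argument per line.
etcd_args() {
    # Look up NAME_N and HOST_N for the requested member number.
    eval "this_name=\${NAME_$1} this_ip=\${HOST_$1}"
    printf '%s\n' \
        --data-dir=data.etcd \
        --name "${this_name}" \
        --initial-advertise-peer-urls "http://${this_ip}:2380" \
        --listen-peer-urls "http://${this_ip}:2380" \
        --advertise-client-urls "http://${this_ip}:2379" \
        --listen-client-urls "http://${this_ip}:2379" \
        --initial-cluster "${CLUSTER}" \
        --initial-cluster-state "${CLUSTER_STATE}" \
        --initial-cluster-token "${TOKEN}"
}

# Print the flags for machine 2 without starting etcd.
etcd_args 2
```

Because none of the values contain spaces, the output could be passed to etcd with `etcd $(etcd_args 2)`; that relies on shell word splitting and would need rework for values with whitespace.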
Or use our public discovery service:
```
curl https://discovery.etcd.io/new?size=3
https://discovery.etcd.io/a81b5818e67a6ea83e9d4daea5ecbc92
# grab this token
TOKEN=token-01
CLUSTER_STATE=new
NAME_1=machine-1
NAME_2=machine-2
NAME_3=machine-3
HOST_1=10.240.0.17
HOST_2=10.240.0.18
HOST_3=10.240.0.19
DISCOVERY=https://discovery.etcd.io/a81b5818e67a6ea83e9d4daea5ecbc92
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--discovery ${DISCOVERY} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--discovery ${DISCOVERY} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--discovery ${DISCOVERY} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
```
Now etcd is ready! To connect to etcd with etcdctl:
```
export ETCDCTL_API=3
HOST_1=10.240.0.17
HOST_2=10.240.0.18
HOST_3=10.240.0.19
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
etcdctl --endpoints=$ENDPOINTS member list
```
## Access etcd
<img src="https://storage.googleapis.com/etcd/demo/02_etcdctl_access_etcd_2016051001.gif" alt="02_etcdctl_access_etcd_2016051001"/>
`put` command to write:
```
etcdctl --endpoints=$ENDPOINTS put foo "Hello World!"
```
`get` to read from etcd:
```
etcdctl --endpoints=$ENDPOINTS get foo
etcdctl --endpoints=$ENDPOINTS --write-out="json" get foo
```
## Get by prefix
<img src="https://storage.googleapis.com/etcd/demo/03_etcdctl_get_by_prefix_2016050501.gif" alt="03_etcdctl_get_by_prefix_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS put web1 value1
etcdctl --endpoints=$ENDPOINTS put web2 value2
etcdctl --endpoints=$ENDPOINTS put web3 value3
etcdctl --endpoints=$ENDPOINTS get web --prefix
```
## Delete
<img src="https://storage.googleapis.com/etcd/demo/04_etcdctl_delete_2016050601.gif" alt="04_etcdctl_delete_2016050601"/>
```
etcdctl --endpoints=$ENDPOINTS put key myvalue
etcdctl --endpoints=$ENDPOINTS del key
etcdctl --endpoints=$ENDPOINTS put k1 value1
etcdctl --endpoints=$ENDPOINTS put k2 value2
etcdctl --endpoints=$ENDPOINTS del k --prefix
```
## Transactional write
`txn` to wrap multiple requests into one transaction:
<img src="https://storage.googleapis.com/etcd/demo/05_etcdctl_transaction_2016050501.gif" alt="05_etcdctl_transaction_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS put user1 bad
etcdctl --endpoints=$ENDPOINTS txn --interactive
compares:
value("user1") = "bad"
success requests (get, put, delete):
del user1
failure requests (get, put, delete):
put user1 good
```
## Watch
`watch` to get notified of future changes:
<img src="https://storage.googleapis.com/etcd/demo/06_etcdctl_watch_2016050501.gif" alt="06_etcdctl_watch_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS watch stock1
etcdctl --endpoints=$ENDPOINTS put stock1 1000
etcdctl --endpoints=$ENDPOINTS watch stock --prefix
etcdctl --endpoints=$ENDPOINTS put stock1 10
etcdctl --endpoints=$ENDPOINTS put stock2 20
```
## Lease
`lease` to write with TTL:
<img src="https://storage.googleapis.com/etcd/demo/07_etcdctl_lease_2016050501.gif" alt="07_etcdctl_lease_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS lease grant 300
# lease 2be7547fbc6a5afa granted with TTL(300s)
etcdctl --endpoints=$ENDPOINTS put sample value --lease=2be7547fbc6a5afa
etcdctl --endpoints=$ENDPOINTS get sample
etcdctl --endpoints=$ENDPOINTS lease keep-alive 2be7547fbc6a5afa
etcdctl --endpoints=$ENDPOINTS lease revoke 2be7547fbc6a5afa
# or after 300 seconds
etcdctl --endpoints=$ENDPOINTS get sample
```
## Distributed locks
`lock` for distributed lock:
<img src="https://storage.googleapis.com/etcd/demo/08_etcdctl_lock_2016050501.gif" alt="08_etcdctl_lock_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS lock mutex1
# another client with the same name blocks
etcdctl --endpoints=$ENDPOINTS lock mutex1
```
## Elections
`elect` for leader election:
<img src="https://storage.googleapis.com/etcd/demo/09_etcdctl_elect_2016050501.gif" alt="09_etcdctl_elect_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS elect one p1
# another client with the same name blocks
etcdctl --endpoints=$ENDPOINTS elect one p2
```
## Cluster status
`endpoint` to check the status and health of each cluster member:
<img src="https://storage.googleapis.com/etcd/demo/10_etcdctl_endpoint_2016050501.gif" alt="10_etcdctl_endpoint_2016050501"/>
```
etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status
+------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+------------------+------------------+---------+---------+-----------+-----------+------------+
| 10.240.0.17:2379 | 4917a7ab173fabe7 | 3.0.0 | 45 kB | true | 4 | 16726 |
| 10.240.0.18:2379 | 59796ba9cd1bcd72 | 3.0.0 | 45 kB | false | 4 | 16726 |
| 10.240.0.19:2379 | 94df724b66343e6c | 3.0.0 | 45 kB | false | 4 | 16726 |
+------------------+------------------+---------+---------+-----------+-----------+------------+
```
```
etcdctl --endpoints=$ENDPOINTS endpoint health
10.240.0.17:2379 is healthy: successfully committed proposal: took = 3.345431ms
10.240.0.19:2379 is healthy: successfully committed proposal: took = 3.767967ms
10.240.0.18:2379 is healthy: successfully committed proposal: took = 4.025451ms
```
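The `took` value in each health line is the time that endpoint needed to commit a proposal, so it can be used to spot a slow member. The following is a minimal sketch that parses the sample output above; the helper name `slowest_endpoint` is hypothetical, and the sketch assumes every duration is printed in milliseconds, as in the sample.

```shell
# slowest_endpoint is a hypothetical helper, not part of etcdctl.
# It reads `endpoint health` lines on stdin and prints the endpoint with the
# largest "took" value. Assumes all durations use the "ms" unit.
slowest_endpoint() {
    awk '
        / is healthy: / {
            took = $NF            # last field, e.g. "3.345431ms"
            sub(/ms$/, "", took)  # strip the unit
            if (took + 0 > max) { max = took + 0; best = took; slow = $1 }
        }
        END { if (slow != "") print slow, best "ms" }
    '
}
```

It could be fed with the saved output of the health command, for example `slowest_endpoint < health.txt`, where `health.txt` is a placeholder file name.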
## Snapshot
`snapshot` to save a point-in-time snapshot of the etcd database:
<img src="https://storage.googleapis.com/etcd/demo/11_etcdctl_snapshot_2016051001.gif" alt="11_etcdctl_snapshot_2016051001"/>
```
etcdctl --endpoints=$ENDPOINTS snapshot save my.db
Snapshot saved at my.db
```
```
etcdctl --write-out=table --endpoints=$ENDPOINTS snapshot status my.db
+---------+----------+------------+------------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
+---------+----------+------------+------------+
| c55e8b8 | 9 | 13 | 25 kB |
+---------+----------+------------+------------+
```
## Migrate
`migrate` to transform etcd v2 to v3 data:
<img src="https://storage.googleapis.com/etcd/demo/12_etcdctl_migrate_2016061602.gif" alt="12_etcdctl_migrate_2016061602"/>
```
# write key in etcd version 2 store
export ETCDCTL_API=2
etcdctl --endpoints=http://$ENDPOINT set foo bar
# read key in etcd v2
etcdctl --endpoints=$ENDPOINTS --output="json" get foo
# stop etcd node to migrate, one by one
# migrate v2 data
export ETCDCTL_API=3
etcdctl --endpoints=$ENDPOINT migrate --data-dir="default.etcd" --wal-dir="default.etcd/member/wal"
# restart etcd node after migrate, one by one
# confirm that the key got migrated
etcdctl --endpoints=$ENDPOINTS get /foo
```
## Member
`member` to add, remove, and update membership:
<img src="https://storage.googleapis.com/etcd/demo/13_etcdctl_member_2016062301.gif" alt="13_etcdctl_member_2016062301"/>
```
# For each machine
TOKEN=my-etcd-token-1
CLUSTER_STATE=new
NAME_1=etcd-node-1
NAME_2=etcd-node-2
NAME_3=etcd-node-3
HOST_1=10.240.0.13
HOST_2=10.240.0.14
HOST_3=10.240.0.15
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
# For node 1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 \
--listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 \
--listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}
# For node 2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 \
--listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 \
--listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}
# For node 3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 \
--listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 \
--listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}
```
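The `CLUSTER` value above is a comma-separated list of `name=peer-URL` pairs. As a minimal sketch, it could be built from name and host arguments with a small shell function. `initial_cluster` is a hypothetical helper, not part of etcd, and it assumes plain HTTP peers on port 2380 as in the commands above.

```bash
# Hypothetical helper: join "name host" argument pairs into an
# --initial-cluster value. Assumes http:// peers on port 2380.
initial_cluster() {
  out=""
  while [ "$#" -ge 2 ]; do
    # Prepend a comma only when out already holds an entry.
    out="${out:+${out},}$1=http://$2:2380"
    shift 2
  done
  printf '%s\n' "$out"
}

CLUSTER=$(initial_cluster etcd-node-1 10.240.0.13 etcd-node-2 10.240.0.14 etcd-node-3 10.240.0.15)
echo "$CLUSTER"
```

Every member must receive the same `--initial-cluster` string, so generating it in one place avoids typos between machines.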
Then replace a member with the `member remove` and `member add` commands:
```
# get member ID
export ETCDCTL_API=3
HOST_1=10.240.0.13
HOST_2=10.240.0.14
HOST_3=10.240.0.15
etcdctl --endpoints=${HOST_1}:2379,${HOST_2}:2379,${HOST_3}:2379 member list
# remove the member
MEMBER_ID=278c654c9a6dfd3b
etcdctl --endpoints=${HOST_1}:2379,${HOST_2}:2379,${HOST_3}:2379 \
member remove ${MEMBER_ID}
# add a new member (node 4)
export ETCDCTL_API=3
NAME_1=etcd-node-1
NAME_2=etcd-node-2
NAME_4=etcd-node-4
HOST_1=10.240.0.13
HOST_2=10.240.0.14
HOST_4=10.240.0.16 # new member
etcdctl --endpoints=${HOST_1}:2379,${HOST_2}:2379 \
member add ${NAME_4} \
--peer-urls=http://${HOST_4}:2380
```
Next, start the new member with the `--initial-cluster-state existing` flag:
```
# [WARNING] If the new member starts from the same disk space,
# make sure to remove the data directory of the old member
#
# restart with 'existing' flag
TOKEN=my-etcd-token-1
CLUSTER_STATE=existing
NAME_1=etcd-node-1
NAME_2=etcd-node-2
NAME_4=etcd-node-4
HOST_1=10.240.0.13
HOST_2=10.240.0.14
HOST_4=10.240.0.16 # new member
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_4}=http://${HOST_4}:2380
THIS_NAME=${NAME_4}
THIS_IP=${HOST_4}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 \
--listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 \
--listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}
```
## Auth
`auth`, `user`, and `role` for authentication:
<img src="https://storage.googleapis.com/etcd/demo/14_etcdctl_auth_2016062301.gif" alt="14_etcdctl_auth_2016062301"/>
```
export ETCDCTL_API=3
ENDPOINTS=localhost:2379
etcdctl --endpoints=${ENDPOINTS} role add root
etcdctl --endpoints=${ENDPOINTS} role grant-permission root readwrite foo
etcdctl --endpoints=${ENDPOINTS} role get root
etcdctl --endpoints=${ENDPOINTS} user add root
etcdctl --endpoints=${ENDPOINTS} user grant-role root root
etcdctl --endpoints=${ENDPOINTS} user get root
etcdctl --endpoints=${ENDPOINTS} auth enable
# now all client requests go through auth
etcdctl --endpoints=${ENDPOINTS} --user=root:123 put foo bar
etcdctl --endpoints=${ENDPOINTS} get foo
etcdctl --endpoints=${ENDPOINTS} --user=root:123 get foo
etcdctl --endpoints=${ENDPOINTS} --user=root:123 get foo1
```

@ -0,0 +1,168 @@
### etcd concurrency API Reference
This documentation is generated. Please read the proto files for more details.
##### service `Lock` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
The lock service exposes client-side locking facilities as a gRPC interface.
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| Lock | LockRequest | LockResponse | Lock acquires a distributed shared lock on a given named lock. On success, it will return a unique key that exists so long as the lock is held by the caller. This key can be used in conjunction with transactions to safely ensure updates to etcd only occur while holding lock ownership. The lock is held until Unlock is called on the key or the lease associated with the owner expires. |
| Unlock | UnlockRequest | UnlockResponse | Unlock takes a key returned by Lock and releases the hold on the lock. The next Lock caller waiting for the lock will then be woken up and given ownership of the lock. |
##### message `LockRequest` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the identifier for the distributed shared lock to be acquired. | bytes |
| lease | lease is the ID of the lease that will be attached to ownership of the lock. If the lease expires or is revoked and currently holds the lock, the lock is automatically released. Calls to Lock with the same lease will be treated as a single acquisition; locking twice with the same lease is a no-op. | int64 |
##### message `LockResponse` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
| key | key is a key that will exist on etcd for the duration that the Lock caller owns the lock. Users should not modify this key or the lock may exhibit undefined behavior. | bytes |
##### message `UnlockRequest` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the lock ownership key granted by Lock. | bytes |
##### message `UnlockResponse` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
##### service `Election` (etcdserver/api/v3election/v3electionpb/v3election.proto)
The election service exposes client-side election facilities as a gRPC interface.
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| Campaign | CampaignRequest | CampaignResponse | Campaign waits to acquire leadership in an election, returning a LeaderKey representing the leadership if successful. The LeaderKey can then be used to issue new values on the election, transactionally guard API requests on leadership still being held, and resign from the election. |
| Proclaim | ProclaimRequest | ProclaimResponse | Proclaim updates the leader's posted value with a new value. |
| Leader | LeaderRequest | LeaderResponse | Leader returns the current election proclamation, if any. |
| Observe | LeaderRequest | LeaderResponse | Observe streams election proclamations in-order as made by the election's elected leaders. |
| Resign | ResignRequest | ResignResponse | Resign releases election leadership so other campaigners may acquire leadership on the election. |
##### message `CampaignRequest` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the election's identifier for the campaign. | bytes |
| lease | lease is the ID of the lease attached to leadership of the election. If the lease expires or is revoked before resigning leadership, then the leadership is transferred to the next campaigner, if any. | int64 |
| value | value is the initial proclaimed value set when the campaigner wins the election. | bytes |
##### message `CampaignResponse` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
| leader | leader describes the resources used for holding leadership of the election. | LeaderKey |
##### message `LeaderKey` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the election identifier that corresponds to the leadership key. | bytes |
| key | key is an opaque key representing the ownership of the election. If the key is deleted, then leadership is lost. | bytes |
| rev | rev is the creation revision of the key. It can be used to test for ownership of an election during transactions by testing the key's creation revision matches rev. | int64 |
| lease | lease is the lease ID of the election leader. | int64 |
##### message `LeaderRequest` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the election identifier for the leadership information. | bytes |
##### message `LeaderResponse` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
| kv | kv is the key-value pair representing the latest leader update. | mvccpb.KeyValue |
##### message `ProclaimRequest` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| leader | leader is the leadership hold on the election. | LeaderKey |
| value | value is an update meant to overwrite the leader's current value. | bytes |
##### message `ProclaimResponse` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
##### message `ResignRequest` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| leader | leader is the leadership to relinquish by resignation. | LeaderKey |
##### message `ResignResponse` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
##### message `Event` (mvcc/mvccpb/kv.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| type | type is the kind of event. If type is a PUT, it indicates new data has been stored to the key. If type is a DELETE, it indicates the key was deleted. | EventType |
| kv | kv holds the KeyValue for the event. A PUT event contains the current kv pair. A PUT event with kv.Version=1 indicates the creation of a key. A DELETE/EXPIRE event contains the deleted key with its modification revision set to the revision of deletion. | KeyValue |
| prev_kv | prev_kv holds the key-value pair before the event happens. | KeyValue |
##### message `KeyValue` (mvcc/mvccpb/kv.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the key in bytes. An empty key is not allowed. | bytes |
| create_revision | create_revision is the revision of last creation on this key. | int64 |
| mod_revision | mod_revision is the revision of last modification on this key. | int64 |
| version | version is the version of the key. A deletion resets the version to zero and any modification of the key increases its version. | int64 |
| value | value is the value held by the key, in bytes. | bytes |
| lease | lease is the ID of the lease attached to the key. When the attached lease expires, the key will be deleted. If lease is 0, then no lease is attached to the key. | int64 |

@ -0,0 +1,53 @@
## Why grpc-gateway
etcd v3 uses [gRPC][grpc] for its messaging protocol. The etcd project includes a gRPC-based [Go client][go-client] and a command line utility, [etcdctl][etcdctl], for communicating with an etcd cluster through gRPC. For languages with no gRPC support, etcd provides a JSON [grpc-gateway][grpc-gateway]. This gateway serves a RESTful proxy that translates HTTP/JSON requests into gRPC messages.
## Using grpc-gateway
The gateway accepts a [JSON mapping][json-mapping] for etcd's [protocol buffer][api-ref] message definitions. Note that `key` and `value` fields are defined as byte arrays and therefore must be base64 encoded in JSON.
Use `curl` to put and get a key:
```bash
<<COMMENT
https://www.base64encode.org/
foo is 'Zm9v' in Base64
bar is 'YmFy'
COMMENT
curl -L http://localhost:2379/v3alpha/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"}}
curl -L http://localhost:2379/v3alpha/kv/range \
-X POST -d '{"key": "Zm9v"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}],"count":"1"}
```
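The base64 strings in the request bodies above can also be produced locally instead of with a web tool. The following is a minimal sketch that assumes a `base64` utility is on the `PATH`. `b64enc` and `put_body` are hypothetical helper names used only for illustration.

```bash
# Encode a string as base64 with no trailing newline and no line wrapping.
b64enc() { printf '%s' "$1" | base64 | tr -d '\n'; }

# Hypothetical helper: build the JSON body for a kv/put request
# from a plain-text key and value.
put_body() {
  printf '{"key": "%s", "value": "%s"}' "$(b64enc "$1")" "$(b64enc "$2")"
}

b64enc foo; echo         # Zm9v
put_body foo bar; echo   # {"key": "Zm9v", "value": "YmFy"}
```

The output of `put_body` can then be passed to `curl` with `-d "$(put_body foo bar)"`.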
Use `curl` to watch a key:
```bash
curl http://localhost:2379/v3alpha/watch \
-X POST -d '{"create_request": {"key":"Zm9v"} }' &
# {"result":{"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"1","raft_term":"2"},"created":true}}
curl -L http://localhost:2379/v3alpha/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}' >/dev/null 2>&1
# {"result":{"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"2"},"events":[{"kv":{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}}]}}
```
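Keys and values in gateway responses are base64 encoded as well, so they need decoding before they are readable. As a rough sketch, the fields can be pulled out of the watch event shown above with standard text tools. `decode_field` is a hypothetical helper. It assumes compact JSON with no escaped quotes and a `base64` that accepts `--decode`; a real JSON parser is a safer choice for anything beyond a quick check.

```bash
# Sample event copied from the watch output above.
event='{"result":{"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"2"},"events":[{"kv":{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}}]}}'

# Hypothetical helper: print every occurrence of a base64 string field,
# decoded, one per line. Usage: decode_field <field-name> <json>
decode_field() {
  printf '%s' "$2" | grep -o "\"$1\":\"[^\"]*\"" | cut -d'"' -f4 |
    while IFS= read -r enc; do
      printf '%s' "$enc" | base64 --decode
      printf '\n'
    done
}

decode_field key "$event"     # foo
decode_field value "$event"   # bar
```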
## Swagger
Generated [Swagger][swagger] API definitions can be found at [rpc.swagger.json][swagger-doc].
[api-ref]: ./api_reference_v3.md
[go-client]: https://github.com/coreos/etcd/tree/master/clientv3
[etcdctl]: https://github.com/coreos/etcd/tree/master/etcdctl
[grpc]: http://www.grpc.io/
[grpc-gateway]: https://github.com/grpc-ecosystem/grpc-gateway
[json-mapping]: https://developers.google.com/protocol-buffers/docs/proto3#json
[swagger]: http://swagger.io/
[swagger-doc]: apispec/swagger/rpc.swagger.json

@ -0,0 +1,880 @@
### etcd API Reference
This documentation is generated. Please read the proto files for more details.
##### service `Auth` (etcdserver/etcdserverpb/rpc.proto)
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| AuthEnable | AuthEnableRequest | AuthEnableResponse | AuthEnable enables authentication. |
| AuthDisable | AuthDisableRequest | AuthDisableResponse | AuthDisable disables authentication. |
| Authenticate | AuthenticateRequest | AuthenticateResponse | Authenticate processes an authenticate request. |
| UserAdd | AuthUserAddRequest | AuthUserAddResponse | UserAdd adds a new user. |
| UserGet | AuthUserGetRequest | AuthUserGetResponse | UserGet gets detailed user information. |
| UserList | AuthUserListRequest | AuthUserListResponse | UserList gets a list of all users. |
| UserDelete | AuthUserDeleteRequest | AuthUserDeleteResponse | UserDelete deletes a specified user. |
| UserChangePassword | AuthUserChangePasswordRequest | AuthUserChangePasswordResponse | UserChangePassword changes the password of a specified user. |
| UserGrantRole | AuthUserGrantRoleRequest | AuthUserGrantRoleResponse | UserGrantRole grants a role to a specified user. |
| UserRevokeRole | AuthUserRevokeRoleRequest | AuthUserRevokeRoleResponse | UserRevokeRole revokes a role of a specified user. |
| RoleAdd | AuthRoleAddRequest | AuthRoleAddResponse | RoleAdd adds a new role. |
| RoleGet | AuthRoleGetRequest | AuthRoleGetResponse | RoleGet gets detailed role information. |
| RoleList | AuthRoleListRequest | AuthRoleListResponse | RoleList gets a list of all roles. |
| RoleDelete | AuthRoleDeleteRequest | AuthRoleDeleteResponse | RoleDelete deletes a specified role. |
| RoleGrantPermission | AuthRoleGrantPermissionRequest | AuthRoleGrantPermissionResponse | RoleGrantPermission grants a permission of a specified key or range to a specified role. |
| RoleRevokePermission | AuthRoleRevokePermissionRequest | AuthRoleRevokePermissionResponse | RoleRevokePermission revokes a key or range permission of a specified role. |
##### service `Cluster` (etcdserver/etcdserverpb/rpc.proto)
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| MemberAdd | MemberAddRequest | MemberAddResponse | MemberAdd adds a member into the cluster. |
| MemberRemove | MemberRemoveRequest | MemberRemoveResponse | MemberRemove removes an existing member from the cluster. |
| MemberUpdate | MemberUpdateRequest | MemberUpdateResponse | MemberUpdate updates the member configuration. |
| MemberList | MemberListRequest | MemberListResponse | MemberList lists all the members in the cluster. |
##### service `KV` (etcdserver/etcdserverpb/rpc.proto)
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| Range | RangeRequest | RangeResponse | Range gets the keys in the range from the key-value store. |
| Put | PutRequest | PutResponse | Put puts the given key into the key-value store. A put request increments the revision of the key-value store and generates one event in the event history. |
| DeleteRange | DeleteRangeRequest | DeleteRangeResponse | DeleteRange deletes the given range from the key-value store. A delete request increments the revision of the key-value store and generates a delete event in the event history for every deleted key. |
| Txn | TxnRequest | TxnResponse | Txn processes multiple requests in a single transaction. A txn request increments the revision of the key-value store and generates events with the same revision for every completed request. It is not allowed to modify the same key several times within one txn. |
| Compact | CompactionRequest | CompactionResponse | Compact compacts the event history in the etcd key-value store. The key-value store should be periodically compacted or the event history will continue to grow indefinitely. |
##### service `Lease` (etcdserver/etcdserverpb/rpc.proto)
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| LeaseGrant | LeaseGrantRequest | LeaseGrantResponse | LeaseGrant creates a lease which expires if the server does not receive a keepAlive within a given time to live period. All keys attached to the lease will be expired and deleted if the lease expires. Each expired key generates a delete event in the event history. |
| LeaseRevoke | LeaseRevokeRequest | LeaseRevokeResponse | LeaseRevoke revokes a lease. All keys attached to the lease will expire and be deleted. |
| LeaseKeepAlive | LeaseKeepAliveRequest | LeaseKeepAliveResponse | LeaseKeepAlive keeps the lease alive by streaming keep alive requests from the client to the server and streaming keep alive responses from the server to the client. |
| LeaseTimeToLive | LeaseTimeToLiveRequest | LeaseTimeToLiveResponse | LeaseTimeToLive retrieves lease information. |
##### service `Maintenance` (etcdserver/etcdserverpb/rpc.proto)
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| Alarm | AlarmRequest | AlarmResponse | Alarm activates, deactivates, and queries alarms regarding cluster health. |
| Status | StatusRequest | StatusResponse | Status gets the status of the member. |
| Defragment | DefragmentRequest | DefragmentResponse | Defragment defragments a member's backend database to recover storage space. |
| Hash | HashRequest | HashResponse | Hash returns the hash of the local KV state for consistency checking purpose. This is designed for testing; do not use this in production when there are ongoing transactions. |
| Snapshot | SnapshotRequest | SnapshotResponse | Snapshot sends a snapshot of the entire backend from a member over a stream to a client. |
##### service `Watch` (etcdserver/etcdserverpb/rpc.proto)
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| Watch | WatchRequest | WatchResponse | Watch watches for events happening or that have happened. Both input and output are streams; the input stream is for creating and canceling watchers and the output stream sends events. One watch RPC can watch on multiple key ranges, streaming events for several watches at once. The entire event history can be watched starting from the last compaction revision. |
##### message `AlarmMember` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| memberID | memberID is the ID of the member associated with the raised alarm. | uint64 |
| alarm | alarm is the type of alarm which has been raised. | AlarmType |
##### message `AlarmRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| action | action is the kind of alarm request to issue. The action may GET alarm statuses, ACTIVATE an alarm, or DEACTIVATE a raised alarm. | AlarmAction |
| memberID | memberID is the ID of the member associated with the alarm. If memberID is 0, the alarm request covers all members. | uint64 |
| alarm | alarm is the type of alarm to consider for this request. | AlarmType |
##### message `AlarmResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| alarms | alarms is a list of alarms associated with the alarm request. | (slice of) AlarmMember |
##### message `AuthDisableRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `AuthDisableResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthEnableRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `AuthEnableResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthRoleAddRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the name of the role to add to the authentication system. | string |
##### message `AuthRoleAddResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthRoleDeleteRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| role | | string |
##### message `AuthRoleDeleteResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthRoleGetRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| role | | string |
##### message `AuthRoleGetResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| perm | | (slice of) authpb.Permission |
##### message `AuthRoleGrantPermissionRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the name of the role which will be granted the permission. | string |
| perm | perm is the permission to grant to the role. | authpb.Permission |
##### message `AuthRoleGrantPermissionResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthRoleListRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `AuthRoleListResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| roles | | (slice of) string |
##### message `AuthRoleRevokePermissionRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| role | | string |
| key | | string |
| range_end | | string |
##### message `AuthRoleRevokePermissionResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthUserAddRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | | string |
| password | | string |
##### message `AuthUserAddResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthUserChangePasswordRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the name of the user whose password is being changed. | string |
| password | password is the new password for the user. | string |
##### message `AuthUserChangePasswordResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthUserDeleteRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the name of the user to delete. | string |
##### message `AuthUserDeleteResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthUserGetRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | | string |
##### message `AuthUserGetResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| roles | | (slice of) string |
##### message `AuthUserGrantRoleRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| user | user is the name of the user which should be granted a given role. | string |
| role | role is the name of the role to grant to the user. | string |
##### message `AuthUserGrantRoleResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthUserListRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `AuthUserListResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| users | | (slice of) string |
##### message `AuthUserRevokeRoleRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | | string |
| role | | string |
##### message `AuthUserRevokeRoleResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `AuthenticateRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | | string |
| password | | string |
##### message `AuthenticateResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| token | token is an authorized token that can be used in succeeding RPCs. | string |
##### message `CompactionRequest` (etcdserver/etcdserverpb/rpc.proto)
CompactionRequest compacts the key-value store up to a given revision. All superseded keys with a revision less than the compaction revision will be removed.
| Field | Description | Type |
| ----- | ----------- | ---- |
| revision | revision is the key-value store revision for the compaction operation. | int64 |
| physical | physical is set so the RPC will wait until the compaction is physically applied to the local database such that compacted entries are totally removed from the backend database. | bool |
##### message `CompactionResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `Compare` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| result | result is logical comparison operation for this comparison. | CompareResult |
| target | target is the key-value field to inspect for the comparison. | CompareTarget |
| key | key is the subject key for the comparison operation. | bytes |
| target_union | | oneof |
| version | version is the version of the given key. | int64 |
| create_revision | create_revision is the creation revision of the given key. | int64 |
| mod_revision | mod_revision is the last modified revision of the given key. | int64 |
| value | value is the value of the given key, in bytes. | bytes |
##### message `DefragmentRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `DefragmentResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `DeleteRangeRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the first key to delete in the range. | bytes |
| range_end | range_end is the key following the last key to delete for the range [key, range_end). If range_end is not given, the range is defined to contain only the key argument. If range_end is one bit larger than the given key, then the range is all the keys with the prefix (the given key). If range_end is '\0', the range is all keys greater than or equal to the key argument. | bytes |
| prev_kv | If prev_kv is set, etcd gets the previous key-value pairs before deleting them. The previous key-value pairs will be returned in the delete response. | bool |
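
To illustrate the prefix case of `range_end` described above: the end of a prefix range is commonly derived by incrementing the last byte of the key, so the half-open range [key, range_end) covers every key that starts with the prefix. The sketch below is only an illustration. `prefix_range_end` is a hypothetical helper, and it assumes a non-empty ASCII prefix whose last character is below 0x7f.

```bash
# Hypothetical helper: compute a range_end that covers all keys sharing
# the given prefix by adding one to the prefix's last byte.
# Assumes a non-empty ASCII prefix whose last character is below 0x7f.
prefix_range_end() {
  head=${1%?}                     # prefix without its last character
  last=${1#"$head"}               # the last character
  code=$(printf '%d' "'$last")    # numeric value of that character
  printf '%s' "$head"
  # Emit the incremented byte through an octal escape.
  printf "\\$(printf '%03o' "$((code + 1))")"
  printf '\n'
}

prefix_range_end foo     # fop
prefix_range_end /app/   # /app0
```

Under these assumptions, deleting the range [`foo`, `fop`) removes `foo`, `foo1`, and `foobar` but leaves `fop` itself untouched, since the end of the range is exclusive.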
##### message `DeleteRangeResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| deleted | deleted is the number of keys deleted by the delete range request. | int64 |
| prev_kvs | If prev_kv is set in the request, the previous key-value pairs will be returned. | (slice of) mvccpb.KeyValue |
##### message `HashRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `HashResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| hash | hash is the hash value computed from the responding member's key-value store. | uint32 |
##### message `LeaseGrantRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| TTL | TTL is the advisory time-to-live in seconds. | int64 |
| ID | ID is the requested ID for the lease. If ID is set to 0, the lessor chooses an ID. | int64 |
##### message `LeaseGrantResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| ID | ID is the lease ID for the granted lease. | int64 |
| TTL | TTL is the server chosen lease time-to-live in seconds. | int64 |
| error | | string |
##### message `LeaseKeepAliveRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| ID | ID is the lease ID for the lease to keep alive. | int64 |
##### message `LeaseKeepAliveResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| ID | ID is the lease ID from the keep alive request. | int64 |
| TTL | TTL is the new time-to-live for the lease. | int64 |
##### message `LeaseRevokeRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| ID | ID is the lease ID to revoke. When the ID is revoked, all associated keys will be deleted. | int64 |
##### message `LeaseRevokeResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
##### message `LeaseTimeToLiveRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| ID | ID is the lease ID for the lease. | int64 |
| keys | keys is true to query all the keys attached to this lease. | bool |
##### message `LeaseTimeToLiveResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| ID | ID is the lease ID from the time-to-live request. | int64 |
| TTL | TTL is the remaining TTL in seconds for the lease; the lease will expire in under TTL+1 seconds. | int64 |
| grantedTTL | GrantedTTL is the initial granted time in seconds upon lease creation/renewal. | int64 |
| keys | Keys is the list of keys attached to this lease. | (slice of) bytes |
##### message `Member` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| ID | ID is the member ID for this member. | uint64 |
| name | name is the human-readable name of the member. If the member is not started, the name will be an empty string. | string |
| peerURLs | peerURLs is the list of URLs the member exposes to the cluster for communication. | (slice of) string |
| clientURLs | clientURLs is the list of URLs the member exposes to clients for communication. If the member is not started, clientURLs will be empty. | (slice of) string |
##### message `MemberAddRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| peerURLs | peerURLs is the list of URLs the added member will use to communicate with the cluster. | (slice of) string |
##### message `MemberAddResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| member | member is the member information for the added member. | Member |
| members | members is a list of all members after adding the new member. | (slice of) Member |
##### message `MemberListRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `MemberListResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| members | members is a list of all members associated with the cluster. | (slice of) Member |
##### message `MemberRemoveRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| ID | ID is the member ID of the member to remove. | uint64 |
##### message `MemberRemoveResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| members | members is a list of all members after removing the member. | (slice of) Member |
##### message `MemberUpdateRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| ID | ID is the member ID of the member to update. | uint64 |
| peerURLs | peerURLs is the new list of URLs the member will use to communicate with the cluster. | (slice of) string |
##### message `MemberUpdateResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| members | members is a list of all members after updating the member. | (slice of) Member |
##### message `PutRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the key, in bytes, to put into the key-value store. | bytes |
| value | value is the value, in bytes, to associate with the key in the key-value store. | bytes |
| lease | lease is the lease ID to associate with the key in the key-value store. A lease value of 0 indicates no lease. | int64 |
| prev_kv | If prev_kv is set, etcd gets the previous key-value pair before changing it. The previous key-value pair will be returned in the put response. | bool |
| ignore_value | If ignore_value is set, etcd updates the key using its current value. Returns an error if the key does not exist. | bool |
| ignore_lease | If ignore_lease is set, etcd updates the key using its current lease. Returns an error if the key does not exist. | bool |
##### message `PutResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| prev_kv | if prev_kv is set in the request, the previous key-value pair will be returned. | mvccpb.KeyValue |
##### message `RangeRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the first key for the range. If range_end is not given, the request only looks up key. | bytes |
| range_end | range_end is the upper bound on the requested range [key, range_end). If range_end is '\0', the range is all keys >= key. If range_end is key plus one (e.g., "aa"+1 == "ab", "a\xff"+1 == "b"), then the range request gets all keys prefixed with key. If both key and range_end are '\0', then the range request returns all keys. | bytes |
| limit | limit is a limit on the number of keys returned for the request. When limit is set to 0, it is treated as no limit. | int64 |
| revision | revision is the point-in-time of the key-value store to use for the range. If revision is less than or equal to zero, the range is over the newest key-value store. If the revision has been compacted, ErrCompacted is returned as a response. | int64 |
| sort_order | sort_order is the order for returned sorted results. | SortOrder |
| sort_target | sort_target is the key-value field to use for sorting. | SortTarget |
| serializable | serializable sets the range request to use serializable member-local reads. Range requests are linearizable by default; linearizable requests have higher latency and lower throughput than serializable requests but reflect the current consensus of the cluster. For better performance, in exchange for possible stale reads, a serializable range request is served locally without needing to reach consensus with other nodes in the cluster. | bool |
| keys_only | keys_only when set returns only the keys and not the values. | bool |
| count_only | count_only when set returns only the count of the keys in the range. | bool |
| min_mod_revision | min_mod_revision is the lower bound for returned key mod revisions; all keys with lesser mod revisions will be filtered away. | int64 |
| max_mod_revision | max_mod_revision is the upper bound for returned key mod revisions; all keys with greater mod revisions will be filtered away. | int64 |
| min_create_revision | min_create_revision is the lower bound for returned key create revisions; all keys with lesser create revisions will be filtered away. | int64 |
| max_create_revision | max_create_revision is the upper bound for returned key create revisions; all keys with greater create revisions will be filtered away. | int64 |
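The "key plus one" rule for `range_end` can be sketched in Go as follows. This is a minimal illustration, not the server's code: `prefixRangeEnd` is a hypothetical helper name, and the handling of a prefix made up entirely of `0xff` bytes (falling back to `"\x00"`, meaning "all keys >= key") is an assumption based on the field description above.

```go
package main

import "fmt"

// prefixRangeEnd (hypothetical helper) returns a range_end that selects every
// key prefixed with prefix: the last byte below 0xff is incremented and any
// trailing 0xff bytes are dropped, e.g. "aa" -> "ab" and "a\xff" -> "b".
func prefixRangeEnd(prefix []byte) []byte {
	end := make([]byte, len(prefix))
	copy(end, prefix)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return end[:i+1]
		}
	}
	// No byte could be incremented (empty or all-0xff prefix): fall back to
	// "\x00", which the table above describes as "all keys >= key".
	return []byte{0}
}

func main() {
	fmt.Printf("%q\n", prefixRangeEnd([]byte("aa")))
	fmt.Printf("%q\n", prefixRangeEnd([]byte("a\xff")))
}
```

A request with `key` set to the prefix and `range_end` set to the returned bytes then covers exactly the keys that start with the prefix.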
##### message `RangeResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| kvs | kvs is the list of key-value pairs matched by the range request. kvs is empty when count is requested. | (slice of) mvccpb.KeyValue |
| more | more indicates if there are more keys to return in the requested range. | bool |
| count | count is set to the number of keys within the range when requested. | int64 |
##### message `RequestOp` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| request | request is a union of request types accepted by a transaction. | oneof |
| request_range | | RangeRequest |
| request_put | | PutRequest |
| request_delete_range | | DeleteRangeRequest |
##### message `ResponseHeader` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| cluster_id | cluster_id is the ID of the cluster which sent the response. | uint64 |
| member_id | member_id is the ID of the member which sent the response. | uint64 |
| revision | revision is the key-value store revision when the request was applied. | int64 |
| raft_term | raft_term is the raft term when the request was applied. | uint64 |
##### message `ResponseOp` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| response | response is a union of response types returned by a transaction. | oneof |
| response_range | | RangeResponse |
| response_put | | PutResponse |
| response_delete_range | | DeleteRangeResponse |
##### message `SnapshotRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `SnapshotResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | header has the current key-value store information. The first header in the snapshot stream indicates the point in time of the snapshot. | ResponseHeader |
| remaining_bytes | remaining_bytes is the number of blob bytes to be sent after this message. | uint64 |
| blob | blob contains the next chunk of the snapshot in the snapshot stream. | bytes |
##### message `StatusRequest` (etcdserver/etcdserverpb/rpc.proto)
Empty field.
##### message `StatusResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| version | version is the cluster protocol version used by the responding member. | string |
| dbSize | dbSize is the size of the backend database, in bytes, of the responding member. | int64 |
| leader | leader is the member ID which the responding member believes is the current leader. | uint64 |
| raftIndex | raftIndex is the current raft index of the responding member. | uint64 |
| raftTerm | raftTerm is the current raft term of the responding member. | uint64 |
##### message `TxnRequest` (etcdserver/etcdserverpb/rpc.proto)
From the Google paxosdb paper: Our implementation hinges around a powerful primitive which we call MultiOp. All other database operations except for iteration are implemented as a single call to MultiOp. A MultiOp is applied atomically and consists of three components: 1. A list of tests called guard. Each test in guard checks a single entry in the database. It may check for the absence or presence of a value, or compare with a given value. Two different tests in the guard may apply to the same or different entries in the database. All tests in the guard are applied and MultiOp returns the results. If all tests are true, MultiOp executes t op (see item 2 below), otherwise it executes f op (see item 3 below). 2. A list of database operations called t op. Each operation in the list is either an insert, delete, or lookup operation, and applies to a single database entry. Two different operations in the list may apply to the same or different entries in the database. These operations are executed if guard evaluates to true. 3. A list of database operations called f op. Like t op, but executed if guard evaluates to false.
| Field | Description | Type |
| ----- | ----------- | ---- |
| compare | compare is a list of predicates representing a conjunction of terms. If the comparisons succeed, then the success requests will be processed in order, and the response will contain their respective responses in order. If the comparisons fail, then the failure requests will be processed in order, and the response will contain their respective responses in order. | (slice of) Compare |
| success | success is a list of requests which will be applied when compare evaluates to true. | (slice of) RequestOp |
| failure | failure is a list of requests which will be applied when compare evaluates to false. | (slice of) RequestOp |
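The compare/success/failure flow described above can be sketched with an in-memory map. This is only an illustration of the semantics under simplifying assumptions: the `compare` and `put` types and the `txn` function are hypothetical, the only predicate is value equality, and the only operation is a put. The real messages support more comparison targets and request types.

```go
package main

import "fmt"

// compare is a hypothetical, simplified predicate: the value stored at key
// must equal want.
type compare struct {
	key, want string
}

// put is a hypothetical, simplified request op that stores value at key.
type put struct {
	key, value string
}

// txn evaluates every compare against store. If all of them hold (a
// conjunction), the success ops are applied in order; otherwise the failure
// ops are applied in order. The return value reports which branch ran.
func txn(store map[string]string, cmps []compare, success, failure []put) bool {
	succeeded := true
	for _, c := range cmps {
		if store[c.key] != c.want {
			succeeded = false
			break
		}
	}
	ops := failure
	if succeeded {
		ops = success
	}
	for _, op := range ops {
		store[op.key] = op.value
	}
	return succeeded
}

func main() {
	store := map[string]string{"owner": "alice"}
	guard := []compare{{"owner", "alice"}}
	onSuccess := []put{{"owner", "bob"}}
	onFailure := []put{{"conflict", "true"}}

	fmt.Println(txn(store, guard, onSuccess, onFailure), store)
	// The same guard no longer holds, so the failure branch runs.
	fmt.Println(txn(store, guard, onSuccess, onFailure), store)
}
```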
##### message `TxnResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| succeeded | succeeded is set to true if the compare evaluated to true or false otherwise. | bool |
| responses | responses is a list of responses corresponding to the results from applying success if succeeded is true or failure if succeeded is false. | (slice of) ResponseOp |
##### message `WatchCancelRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| watch_id | watch_id is the watcher id to cancel so that no more events are transmitted. | int64 |
##### message `WatchCreateRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the key to register for watching. | bytes |
| range_end | range_end is the end of the range [key, range_end) to watch. If range_end is not given, only the key argument is watched. If range_end is equal to '\0', all keys greater than or equal to the key argument are watched. If the range_end is one bit larger than the given key, then all keys with the prefix (the given key) will be watched. | bytes |
| start_revision | start_revision is an optional revision to watch from (inclusive). No start_revision is "now". | int64 |
| progress_notify | progress_notify is set so that the etcd server will periodically send a WatchResponse with no events to the new watcher if there are no recent events. It is useful when clients wish to recover a disconnected watcher starting from a recent known revision. The etcd server may decide how often it will send notifications based on current load. | bool |
| filters | filters filter the events on the server side before they are sent back to the watcher. | (slice of) FilterType |
| prev_kv | If prev_kv is set, the created watcher gets the previous KV before the event happens. If the previous KV is already compacted, nothing will be returned. | bool |
##### message `WatchRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| request_union | request_union is a request to either create a new watcher or cancel an existing watcher. | oneof |
| create_request | | WatchCreateRequest |
| cancel_request | | WatchCancelRequest |
##### message `WatchResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| watch_id | watch_id is the ID of the watcher that corresponds to the response. | int64 |
| created | created is set to true if the response is for a create watch request. The client should record the watch_id and expect to receive events for the created watcher from the same stream. All events sent to the created watcher will carry the same watch_id. | bool |
| canceled | canceled is set to true if the response is for a cancel watch request. No further events will be sent to the canceled watcher. | bool |
| compact_revision | compact_revision is set to the minimum index if a watcher tries to watch at a compacted index. This happens when creating a watcher at a compacted revision or the watcher cannot catch up with the progress of the key-value store. The client should treat the watcher as canceled and should not try to create any watcher with the same start_revision again. | int64 |
| cancel_reason | cancel_reason indicates the reason for canceling the watcher. | string |
| events | | (slice of) mvccpb.Event |
##### message `Event` (mvcc/mvccpb/kv.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| type | type is the kind of event. If type is a PUT, it indicates new data has been stored to the key. If type is a DELETE, it indicates the key was deleted. | EventType |
| kv | kv holds the KeyValue for the event. A PUT event contains the current kv pair. A PUT event with kv.Version=1 indicates the creation of a key. A DELETE/EXPIRE event contains the deleted key with its modification revision set to the revision of deletion. | KeyValue |
| prev_kv | prev_kv holds the key-value pair before the event happens. | KeyValue |
##### message `KeyValue` (mvcc/mvccpb/kv.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the key in bytes. An empty key is not allowed. | bytes |
| create_revision | create_revision is the revision of last creation on this key. | int64 |
| mod_revision | mod_revision is the revision of last modification on this key. | int64 |
| version | version is the version of the key. A deletion resets the version to zero and any modification of the key increases its version. | int64 |
| value | value is the value held by the key, in bytes. | bytes |
| lease | lease is the ID of the lease attached to the key. When the attached lease expires, the key will be deleted. If lease is 0, then no lease is attached to the key. | int64 |
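To make the relationship between `create_revision`, `mod_revision`, and `version` concrete, here is a small in-memory sketch. The `store` type and its methods are hypothetical, and the absolute revision numbers are illustrative only (the sketch starts counting at zero); it just follows the field descriptions in the table above.

```go
package main

import "fmt"

// keyValue mirrors the revision-related fields of the KeyValue message.
type keyValue struct {
	CreateRevision int64
	ModRevision    int64
	Version        int64
	Value          string
}

// store is a hypothetical store with a single, monotonically increasing
// revision counter shared by all keys.
type store struct {
	rev  int64
	keys map[string]*keyValue
}

func newStore() *store { return &store{keys: map[string]*keyValue{}} }

// put advances the store revision, records it as the key's mod_revision
// (and as create_revision when the key is new), and bumps the key's version.
func (s *store) put(key, value string) keyValue {
	s.rev++
	kv, ok := s.keys[key]
	if !ok {
		kv = &keyValue{CreateRevision: s.rev}
		s.keys[key] = kv
	}
	kv.ModRevision = s.rev
	kv.Version++
	kv.Value = value
	return *kv
}

// del removes the key, so its version is reset and the next put starts
// again at version 1 with a fresh create_revision.
func (s *store) del(key string) {
	if _, ok := s.keys[key]; ok {
		s.rev++
		delete(s.keys, key)
	}
}

func main() {
	s := newStore()
	fmt.Printf("%+v\n", s.put("foo", "a")) // first write: version 1
	fmt.Printf("%+v\n", s.put("foo", "b")) // update: version 2, same create revision
	s.del("foo")
	fmt.Printf("%+v\n", s.put("foo", "c")) // re-creation: version 1, new create revision
}
```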
##### message `Lease` (lease/leasepb/lease.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| ID | | int64 |
| TTL | | int64 |
##### message `LeaseInternalRequest` (lease/leasepb/lease.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| LeaseTimeToLiveRequest | | etcdserverpb.LeaseTimeToLiveRequest |
##### message `LeaseInternalResponse` (lease/leasepb/lease.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| LeaseTimeToLiveResponse | | etcdserverpb.LeaseTimeToLiveResponse |
##### message `Permission` (auth/authpb/auth.proto)
Permission is a single entity
| Field | Description | Type |
| ----- | ----------- | ---- |
| permType | | Type |
| key | | bytes |
| range_end | | bytes |
##### message `Role` (auth/authpb/auth.proto)
Role is a single entry in the bucket authRoles
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | | bytes |
| keyPermission | | (slice of) Permission |
##### message `User` (auth/authpb/auth.proto)
User is a single entry in the bucket authUsers
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | | bytes |
| password | | bytes |
| roles | | (slice of) string |


@ -0,0 +1,334 @@
{
"swagger": "2.0",
"info": {
"title": "etcdserver/api/v3election/v3electionpb/v3election.proto",
"version": "version not set"
},
"schemes": [
"http",
"https"
],
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"paths": {
"/v3alpha/election/campaign": {
"post": {
"summary": "Campaign waits to acquire leadership in an election, returning a LeaderKey\nrepresenting the leadership if successful. The LeaderKey can then be used\nto issue new values on the election, transactionally guard API requests on\nleadership still being held, and resign from the election.",
"operationId": "Campaign",
"responses": {
"200": {
"description": "",
"schema": {
"$ref": "#/definitions/v3electionpbCampaignResponse"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbCampaignRequest"
}
}
],
"tags": [
"Election"
]
}
},
"/v3alpha/election/leader": {
"post": {
"summary": "Leader returns the current election proclamation, if any.",
"operationId": "Leader",
"responses": {
"200": {
"description": "",
"schema": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbLeaderRequest"
}
}
],
"tags": [
"Election"
]
}
},
"/v3alpha/election/observe": {
"post": {
"summary": "Observe streams election proclamations in-order as made by the election's\nelected leaders.",
"operationId": "Observe",
"responses": {
"200": {
"description": "(streaming responses)",
"schema": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbLeaderRequest"
}
}
],
"tags": [
"Election"
]
}
},
"/v3alpha/election/proclaim": {
"post": {
"summary": "Proclaim updates the leader's posted value with a new value.",
"operationId": "Proclaim",
"responses": {
"200": {
"description": "",
"schema": {
"$ref": "#/definitions/v3electionpbProclaimResponse"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbProclaimRequest"
}
}
],
"tags": [
"Election"
]
}
},
"/v3alpha/election/resign": {
"post": {
"summary": "Resign releases election leadership so other campaigners may acquire\nleadership on the election.",
"operationId": "Resign",
"responses": {
"200": {
"description": "",
"schema": {
"$ref": "#/definitions/v3electionpbResignResponse"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbResignRequest"
}
}
],
"tags": [
"Election"
]
}
}
},
"definitions": {
"etcdserverpbResponseHeader": {
"type": "object",
"properties": {
"cluster_id": {
"type": "string",
"format": "uint64",
"description": "cluster_id is the ID of the cluster which sent the response."
},
"member_id": {
"type": "string",
"format": "uint64",
"description": "member_id is the ID of the member which sent the response."
},
"revision": {
"type": "string",
"format": "int64",
"description": "revision is the key-value store revision when the request was applied."
},
"raft_term": {
"type": "string",
"format": "uint64",
"description": "raft_term is the raft term when the request was applied."
}
}
},
"mvccpbKeyValue": {
"type": "object",
"properties": {
"key": {
"type": "string",
"format": "byte",
"description": "key is the key in bytes. An empty key is not allowed."
},
"create_revision": {
"type": "string",
"format": "int64",
"description": "create_revision is the revision of last creation on this key."
},
"mod_revision": {
"type": "string",
"format": "int64",
"description": "mod_revision is the revision of last modification on this key."
},
"version": {
"type": "string",
"format": "int64",
"description": "version is the version of the key. A deletion resets\nthe version to zero and any modification of the key\nincreases its version."
},
"value": {
"type": "string",
"format": "byte",
"description": "value is the value held by the key, in bytes."
},
"lease": {
"type": "string",
"format": "int64",
"description": "lease is the ID of the lease attached to the key.\nWhen the attached lease expires, the key will be deleted.\nIf lease is 0, then no lease is attached to the key."
}
}
},
"v3electionpbCampaignRequest": {
"type": "object",
"properties": {
"name": {
"type": "string",
"format": "byte",
"description": "name is the election's identifier for the campaign."
},
"lease": {
"type": "string",
"format": "int64",
"description": "lease is the ID of the lease attached to leadership of the election. If the\nlease expires or is revoked before resigning leadership, then the\nleadership is transferred to the next campaigner, if any."
},
"value": {
"type": "string",
"format": "byte",
"description": "value is the initial proclaimed value set when the campaigner wins the\nelection."
}
}
},
"v3electionpbCampaignResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
},
"leader": {
"$ref": "#/definitions/v3electionpbLeaderKey",
"description": "leader describes the resources used for holding leadership of the election."
}
}
},
"v3electionpbLeaderKey": {
"type": "object",
"properties": {
"name": {
"type": "string",
"format": "byte",
"description": "name is the election identifier that corresponds to the leadership key."
},
"key": {
"type": "string",
"format": "byte",
"description": "key is an opaque key representing the ownership of the election. If the key\nis deleted, then leadership is lost."
},
"rev": {
"type": "string",
"format": "int64",
"description": "rev is the creation revision of the key. It can be used to test for ownership\nof an election during transactions by testing the key's creation revision\nmatches rev."
},
"lease": {
"type": "string",
"format": "int64",
"description": "lease is the lease ID of the election leader."
}
}
},
"v3electionpbLeaderRequest": {
"type": "object",
"properties": {
"name": {
"type": "string",
"format": "byte",
"description": "name is the election identifier for the leadership information."
}
}
},
"v3electionpbLeaderResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
},
"kv": {
"$ref": "#/definitions/mvccpbKeyValue",
"description": "kv is the key-value pair representing the latest leader update."
}
}
},
"v3electionpbProclaimRequest": {
"type": "object",
"properties": {
"leader": {
"$ref": "#/definitions/v3electionpbLeaderKey",
"description": "leader is the leadership hold on the election."
},
"value": {
"type": "string",
"format": "byte",
"description": "value is an update meant to overwrite the leader's current value."
}
}
},
"v3electionpbProclaimResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
}
}
},
"v3electionpbResignRequest": {
"type": "object",
"properties": {
"leader": {
"$ref": "#/definitions/v3electionpbLeaderKey",
"description": "leader is the leadership to relinquish by resignation."
}
}
},
"v3electionpbResignResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
}
}
}
}
}
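
As a rough illustration of how a client might build the JSON body for `/v3alpha/election/campaign`, the Go sketch below mirrors `v3electionpbCampaignRequest`. It assumes the usual JSON mapping implied by the schema above: `"format": "byte"` fields are base64 strings and `"format": "int64"` fields are decimal strings. The `campaignRequest` type and `campaignBody` function are hypothetical names, and the election name, lease ID, and value are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// campaignRequest is a hypothetical Go mirror of v3electionpbCampaignRequest.
type campaignRequest struct {
	Name  []byte `json:"name"`         // []byte is marshaled as a base64 string
	Lease int64  `json:"lease,string"` // ",string" emits the int64 as a JSON string
	Value []byte `json:"value"`
}

// campaignBody returns the JSON request body for a campaign on the named
// election, using the given lease ID and initial proclaimed value.
func campaignBody(name string, lease int64, value string) ([]byte, error) {
	return json.Marshal(campaignRequest{
		Name:  []byte(name),
		Lease: lease,
		Value: []byte(value),
	})
}

func main() {
	// Placeholder inputs; a real lease ID would come from a lease grant.
	body, err := campaignBody("my-election", 1234, "node-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The resulting bytes would be sent as the body of a POST request with an `application/json` content type.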


@ -0,0 +1,146 @@
{
"swagger": "2.0",
"info": {
"title": "etcdserver/api/v3lock/v3lockpb/v3lock.proto",
"version": "version not set"
},
"schemes": [
"http",
"https"
],
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"paths": {
"/v3alpha/lock/lock": {
"post": {
"summary": "Lock acquires a distributed shared lock on a given named lock.\nOn success, it will return a unique key that exists so long as the\nlock is held by the caller. This key can be used in conjunction with\ntransactions to safely ensure updates to etcd only occur while holding\nlock ownership. The lock is held until Unlock is called on the key or the\nlease associated with the owner expires.",
"operationId": "Lock",
"responses": {
"200": {
"description": "",
"schema": {
"$ref": "#/definitions/v3lockpbLockResponse"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3lockpbLockRequest"
}
}
],
"tags": [
"Lock"
]
}
},
"/v3alpha/lock/unlock": {
"post": {
"summary": "Unlock takes a key returned by Lock and releases the hold on lock. The\nnext Lock caller waiting for the lock will then be woken up and given\nownership of the lock.",
"operationId": "Unlock",
"responses": {
"200": {
"description": "",
"schema": {
"$ref": "#/definitions/v3lockpbUnlockResponse"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3lockpbUnlockRequest"
}
}
],
"tags": [
"Lock"
]
}
}
},
"definitions": {
"etcdserverpbResponseHeader": {
"type": "object",
"properties": {
"cluster_id": {
"type": "string",
"format": "uint64",
"description": "cluster_id is the ID of the cluster which sent the response."
},
"member_id": {
"type": "string",
"format": "uint64",
"description": "member_id is the ID of the member which sent the response."
},
"revision": {
"type": "string",
"format": "int64",
"description": "revision is the key-value store revision when the request was applied."
},
"raft_term": {
"type": "string",
"format": "uint64",
"description": "raft_term is the raft term when the request was applied."
}
}
},
"v3lockpbLockRequest": {
"type": "object",
"properties": {
"name": {
"type": "string",
"format": "byte",
"description": "name is the identifier for the distributed shared lock to be acquired."
},
"lease": {
"type": "string",
"format": "int64",
"description": "lease is the ID of the lease that will be attached to ownership of the\nlock. If the lease expires or is revoked and currently holds the lock,\nthe lock is automatically released. Calls to Lock with the same lease will\nbe treated as a single acquisition; locking twice with the same lease is a\nno-op."
}
}
},
"v3lockpbLockResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
},
"key": {
"type": "string",
"format": "byte",
"description": "key is a key that will exist on etcd for the duration that the Lock caller\nowns the lock. Users should not modify this key or the lock may exhibit\nundefined behavior."
}
}
},
"v3lockpbUnlockRequest": {
"type": "object",
"properties": {
"key": {
"type": "string",
"format": "byte",
"description": "key is the lock ownership key granted by Lock."
}
}
},
"v3lockpbUnlockResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
}
}
}
}
}


@ -0,0 +1,11 @@
# Experimental APIs and features
For the most part, the etcd project is stable, but we are still moving fast! We believe in the release-fast philosophy. We want to get early feedback on features that are still in development and stabilizing. Thus, there are, and will be more, experimental features and APIs. Over the next few releases, we plan to improve these features based on early feedback from the community, or abandon them if there is little interest. Please do not rely on any experimental features or APIs in a production environment.
## The current experimental API/features are:
- [gateway][gateway]: beta, to be stable in the 3.2 release
- [gRPC proxy][grpc-proxy]: alpha, to be stable in the 3.2 release
[gateway]: ../op-guide/gateway.md
[grpc-proxy]: ../op-guide/grpc_proxy.md


@ -0,0 +1,65 @@
# gRPC naming and discovery
etcd provides a gRPC resolver to support an alternative name system that fetches endpoints from etcd for discovering gRPC services. The underlying mechanism is based on watching updates to keys prefixed with the service name.
## Using etcd discovery with go-grpc
The etcd client provides a gRPC resolver for resolving gRPC endpoints with an etcd backend. The resolver is initialized with an etcd client and given a target for resolution:
```go
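import (
	"github.com/coreos/etcd/clientv3"
	etcdnaming "github.com/coreos/etcd/clientv3/naming"

	"google.golang.org/grpc"
)

...

// Connect to etcd; check cerr before using cli.
cli, cerr := clientv3.NewFromURL("http://localhost:2379")
// Resolve gRPC targets through etcd and balance across the endpoints found.
r := &etcdnaming.GRPCResolver{Client: cli}
b := grpc.RoundRobin(r)
// Dial the service by its resolution target; check gerr before using conn.
conn, gerr := grpc.Dial("my-service", grpc.WithBalancer(b))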
```
## Managing service endpoints
The etcd resolver treats all keys under the prefix of the resolution target following a "/" (e.g., "my-service/") with JSON-encoded go-grpc `naming.Update` values as potential service endpoints. Endpoints are added to the service by creating new keys and removed from the service by deleting keys.
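The key and value layout described above can be sketched in Go using only the standard library. The `update` struct below is a hypothetical stand-in for go-grpc's `naming.Update`, reduced to the `Addr` and `Metadata` fields, and `endpointKV` is an illustrative helper, not part of the etcd client.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// update is a hypothetical stand-in for go-grpc's naming.Update, limited to
// the fields that identify and describe an endpoint.
type update struct {
	Addr     string
	Metadata interface{}
}

// endpointKV returns the etcd key and JSON-encoded value that would register
// addr under service. The key is the resolution target, a "/", and the address.
func endpointKV(service, addr string, metadata interface{}) (string, string, error) {
	val, err := json.Marshal(update{Addr: addr, Metadata: metadata})
	if err != nil {
		return "", "", err
	}
	return service + "/" + addr, string(val), nil
}

func main() {
	key, val, err := endpointKV("my-service", "1.2.3.4", "...")
	if err != nil {
		panic(err)
	}
	fmt.Println(key)
	fmt.Println(val)
}
```

Writing the returned value to the returned key adds the endpoint; deleting the key removes it.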
### Adding an endpoint
New endpoints can be added to the service through `etcdctl`:
```sh
ETCDCTL_API=3 etcdctl put my-service/1.2.3.4 '{"Addr":"1.2.3.4","Metadata":"..."}'
```
The etcd client's `GRPCResolver.Update` method can also register new endpoints with a key matching the `Addr`:
```go
r.Update(context.TODO(), "my-service", naming.Update{Op: naming.Add, Addr: "1.2.3.4", Metadata: "..."})
```
### Deleting an endpoint
Hosts can be deleted from the service through `etcdctl`:
```sh
ETCDCTL_API=3 etcdctl del my-service/1.2.3.4
```
The etcd client's `GRPCResolver.Update` method also supports deleting endpoints:
```go
r.Update(context.TODO(), "my-service", naming.Update{Op: naming.Delete, Addr: "1.2.3.4"})
```
### Registering an endpoint with a lease
Registering an endpoint with a lease ensures that if the host can't maintain a keepalive heartbeat (e.g., its machine fails), it will be removed from the service:
```sh
lease=$(ETCDCTL_API=3 etcdctl lease grant 5 | cut -f2 -d' ')
ETCDCTL_API=3 etcdctl put --lease=$lease my-service/1.2.3.4 '{"Addr":"1.2.3.4","Metadata":"..."}'
ETCDCTL_API=3 etcdctl lease keep-alive $lease
```


@ -0,0 +1,475 @@
# Interacting with etcd
Users mostly interact with etcd by putting or getting the value of a key. This section describes how to do that using etcdctl, a command line tool for interacting with the etcd server. The concepts described here should also apply to the gRPC APIs and client library APIs.
By default, etcdctl talks to the etcd server with the v2 API for backward compatibility. For etcdctl to speak to etcd using the v3 API, the API version must be set to version 3 via the `ETCDCTL_API` environment variable.
```bash
export ETCDCTL_API=3
```
## Find versions
etcdctl version and Server API version can be useful in finding the appropriate commands to be used for performing various operations on etcd.
Here is the command to find the versions:
```bash
$ etcdctl version
etcdctl version: 3.1.0-alpha.0+git
API version: 3.1
```
## Write a key
Applications store keys into the etcd cluster by writing to keys. Every stored key is replicated to all etcd cluster members through the Raft protocol to achieve consistency and reliability.
Here is the command to set the value of key `foo` to `bar`:
```bash
$ etcdctl put foo bar
OK
```
A key can also be set to expire after a specified interval by attaching a lease to it.
Here is the command to set the value of key `foo1` to `bar1` for 10s:
```bash
$ etcdctl put foo1 bar1 --lease=1234abcd
OK
```
Note: The lease ID `1234abcd` in the above command refers to the ID returned when the 10s lease was created. That ID can then be attached to the key.
## Read keys
Applications can read values of keys from an etcd cluster. Queries may read a single key, or a range of keys.
Suppose the etcd cluster has stored the following keys:
```bash
foo = bar
foo1 = bar1
foo2 = bar2
foo3 = bar3
```
Here is the command to read the value of key `foo`:
```bash
$ etcdctl get foo
foo
bar
```
Here is the command to read the value of key `foo` in hex format:
```bash
$ etcdctl get foo --hex
\x66\x6f\x6f # Key
\x62\x61\x72 # Value
```
Here is the command to read only the value of key `foo`:
```bash
$ etcdctl get foo --print-value-only
bar
```
Here is the command to range over the keys from `foo` to `foo3`:
```bash
$ etcdctl get foo foo3
foo
bar
foo1
bar1
foo2
bar2
```
Note that `foo3` is excluded since the range is over the half-open interval `[foo, foo3)`, excluding `foo3`.
Here is the command to range over all keys prefixed with `foo`:
```bash
$ etcdctl get --prefix foo
foo
bar
foo1
bar1
foo2
bar2
foo3
bar3
```
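The half-open range and prefix reads above can be modeled with plain byte-order comparisons. The following is a minimal sketch, not etcd's implementation: `rangeKeys` and `prefixRangeEnd` are hypothetical helpers, and using an empty string to mean "no upper bound" is this sketch's own convention rather than etcd's wire encoding.

```go
package main

import (
	"fmt"
	"sort"
)

// prefixRangeEnd returns the smallest key that sorts after every key
// starting with prefix, or "" when no such bound exists.
func prefixRangeEnd(prefix string) string {
	end := []byte(prefix)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return string(end[:i+1])
		}
	}
	return "" // empty or all-0xff prefix: treat as unbounded
}

// rangeKeys returns the keys in the half-open interval [start, end),
// in byte order. An empty end means "no upper bound" in this sketch.
func rangeKeys(keys []string, start, end string) []string {
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)
	var out []string
	for _, k := range sorted {
		if k >= start && (end == "" || k < end) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []string{"foo", "foo1", "foo2", "foo3"}
	// Comparable to: etcdctl get foo foo3
	fmt.Println(rangeKeys(keys, "foo", "foo3"))
	// Comparable to: etcdctl get --prefix foo
	fmt.Println(rangeKeys(keys, "foo", prefixRangeEnd("foo")))
}
```

With this model, a prefix read is just a range read whose end is the prefix with its last byte incremented, which is why `foo3` is included by `--prefix foo` but excluded by the range `foo foo3`.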
Here is the command to range over all keys prefixed with `foo`, limiting the number of results to 2:
```bash
$ etcdctl get --prefix --limit=2 foo
foo
bar
foo1
bar1
```
## Read past version of keys
Applications may want to read superseded versions of a key. For example, an application may wish to roll back to an old configuration by accessing an earlier version of a key. Alternatively, an application may want a consistent view over multiple keys through multiple requests by accessing key history.
Since every modification to the etcd cluster key-value store increments the global revision of an etcd cluster, an application can read superseded keys by providing an older etcd revision.
Suppose an etcd cluster already has the following keys:
```bash
foo = bar # revision = 2
foo1 = bar1 # revision = 3
foo = bar_new # revision = 4
foo1 = bar1_new # revision = 5
```
Here is an example of accessing past versions of keys:
```bash
$ etcdctl get --prefix foo # access the most recent versions of keys
foo
bar_new
foo1
bar1_new
$ etcdctl get --prefix --rev=4 foo # access the versions of keys at revision 4
foo
bar_new
foo1
bar1
$ etcdctl get --prefix --rev=3 foo # access the versions of keys at revision 3
foo
bar
foo1
bar1
$ etcdctl get --prefix --rev=2 foo # access the versions of keys at revision 2
foo
bar
$ etcdctl get --prefix --rev=1 foo # access the versions of keys at revision 1
```
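The revision behavior shown above can be reproduced with a toy multi-version store. This is a simplified sketch under stated assumptions: the `store` type and its methods are hypothetical, deletes and compaction are not modeled, and the revision counter starts at 1 so that the first write lands at revision 2, matching the example.

```go
package main

import (
	"fmt"
	"sort"
)

// version records the value a key received at a given global revision.
type version struct {
	rev   int64
	value string
}

// store is a toy multi-version map; it is not etcd's mvcc package.
type store struct {
	rev     int64
	history map[string][]version
}

func newStore() *store {
	return &store{rev: 1, history: map[string][]version{}}
}

// put bumps the global revision and appends a new version of key.
func (s *store) put(key, value string) int64 {
	s.rev++
	s.history[key] = append(s.history[key], version{s.rev, value})
	return s.rev
}

// getAt returns the value key had as of revision rev, if it existed then.
func (s *store) getAt(key string, rev int64) (string, bool) {
	vs := s.history[key]
	// Index of the first version written after the requested revision.
	i := sort.Search(len(vs), func(i int) bool { return vs[i].rev > rev })
	if i == 0 {
		return "", false
	}
	return vs[i-1].value, true
}

func main() {
	s := newStore()
	s.put("foo", "bar")
	s.put("foo1", "bar1")
	s.put("foo", "bar_new")
	s.put("foo1", "bar1_new")
	for rev := int64(5); rev >= 1; rev-- {
		v, ok := s.getAt("foo", rev)
		fmt.Println("rev", rev, "foo =", v, "found:", ok)
	}
}
```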
## Read keys which are greater than or equal to the byte value of the specified key
Applications may want to read keys which are greater than or equal to the byte value of the specified key.
Suppose an etcd cluster already has the following keys:
```bash
a = 123
b = 456
z = 789
```
Here is the command to read keys which are greater than or equal to the byte value of key `b` :
```bash
$ etcdctl get --from-key b
b
456
z
789
```
## Delete keys
Applications can delete a key or a range of keys from an etcd cluster.
Suppose an etcd cluster already has the following keys:
```bash
foo = bar
foo1 = bar1
foo3 = bar3
zoo = val
zoo1 = val1
zoo2 = val2
a = 123
b = 456
z = 789
```
Here is the command to delete key `foo`:
```bash
$ etcdctl del foo
1 # one key is deleted
```
Here is the command to delete keys ranging from `foo` to `foo9`:
```bash
$ etcdctl del foo foo9
2 # two keys are deleted
```
Here is the command to delete key `zoo` with the deleted key value pair returned:
```bash
$ etcdctl del --prev-kv zoo
1 # one key is deleted
zoo # deleted key
val # the value of the deleted key
```
Here is the command to delete keys having prefix as `zoo`:
```bash
$ etcdctl del --prefix zoo
2 # two keys are deleted
```
Here is the command to delete keys which are greater than or equal to the byte value of key `b` :
```bash
$ etcdctl del --from-key b
2 # two keys are deleted
```
## Watch key changes
Applications can watch on a key or a range of keys to monitor for any updates.
Here is the command to watch on key `foo`:
```bash
$ etcdctl watch foo
# in another terminal: etcdctl put foo bar
PUT
foo
bar
```
Here is the command to watch on key `foo` in hex format:
```bash
$ etcdctl watch foo --hex
# in another terminal: etcdctl put foo bar
PUT
\x66\x6f\x6f # Key
\x62\x61\x72 # Value
```
Here is the command to watch on a range of keys from `foo` to `foo9`:
```bash
$ etcdctl watch foo foo9
# in another terminal: etcdctl put foo bar
PUT
foo
bar
# in another terminal: etcdctl put foo1 bar1
PUT
foo1
bar1
```
Here is the command to watch on keys having prefix `foo`:
```bash
$ etcdctl watch --prefix foo
# in another terminal: etcdctl put foo bar
PUT
foo
bar
# in another terminal: etcdctl put fooz1 barz1
PUT
fooz1
barz1
```
Here is the command to watch on multiple keys `foo` and `zoo`:
```bash
$ etcdctl watch -i
$ watch foo
$ watch zoo
# in another terminal: etcdctl put foo bar
PUT
foo
bar
# in another terminal: etcdctl put zoo val
PUT
zoo
val
```
## Watch historical changes of keys
Applications may want to watch for historical changes of keys in etcd. For example, an application may wish to receive all the modifications of a key; if the application stays connected to etcd, then `watch` is good enough. However, if the application or etcd fails, a change may happen during the failure, and the application will not receive the update in real time. To guarantee the update is delivered, the application must be able to watch for historical changes to keys. To do this, an application can specify a historical revision on a watch, just like reading past version of keys.
Suppose we finished the following sequence of operations:
```bash
$ etcdctl put foo bar # revision = 2
OK
$ etcdctl put foo1 bar1 # revision = 3
OK
$ etcdctl put foo bar_new # revision = 4
OK
$ etcdctl put foo1 bar1_new # revision = 5
OK
```
Here is an example to watch the historical changes:
```bash
# watch for changes on key `foo` since revision 2
$ etcdctl watch --rev=2 foo
PUT
foo
bar
PUT
foo
bar_new
```
```bash
# watch for changes on key `foo` since revision 3
$ etcdctl watch --rev=3 foo
PUT
foo
bar_new
```
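Watching from a historical revision amounts to replaying every stored change at or after that revision before streaming new ones. The sketch below models only the replay step; the `event` type and `replayFrom` function are hypothetical and do not correspond to etcd client APIs.

```go
package main

import "fmt"

// event is one PUT recorded at a global revision.
type event struct {
	rev   int64
	key   string
	value string
}

// replayFrom returns the events on key whose revision is at least
// startRev, oldest first.
func replayFrom(history []event, key string, startRev int64) []event {
	var out []event
	for _, ev := range history {
		if ev.key == key && ev.rev >= startRev {
			out = append(out, ev)
		}
	}
	return out
}

func main() {
	history := []event{
		{2, "foo", "bar"},
		{3, "foo1", "bar1"},
		{4, "foo", "bar_new"},
		{5, "foo1", "bar1_new"},
	}
	// Comparable to: etcdctl watch --rev=2 foo
	for _, ev := range replayFrom(history, "foo", 2) {
		fmt.Printf("PUT\n%s\n%s\n", ev.key, ev.value)
	}
}
```

An application that remembers the last revision it processed can resume from the next revision after a failure and receive the changes it missed.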
Here is an example to watch only from the last historical change:
```bash
# watch for changes on key `foo` and return last revision value along with modified value
$ etcdctl watch --prev-kv foo
# in another terminal: etcdctl put foo bar_latest
PUT
foo # key
bar_new # last value of foo key before modification
foo # key
bar_latest # value of foo key after modification
```
## Compacted revisions
As we mentioned, etcd keeps revisions so that applications can read past versions of keys. However, to avoid accumulating an unbounded amount of history, it is important to compact past revisions. After compacting, etcd removes historical revisions, releasing resources for future use. All superseded data with revisions before the compacted revision will be unavailable.
Here is the command to compact the revisions:
```bash
$ etcdctl compact 5
compacted revision 5
# any revisions before the compacted one are not accessible
$ etcdctl get --rev=4 foo
Error: rpc error: code = 11 desc = etcdserver: mvcc: required revision has been compacted
```
Note: The current revision of the etcd server can be found by running the `get` command on any key (existent or non-existent) with JSON output. The example below uses `mykey`, which does not exist in the etcd server:
```bash
$ etcdctl get mykey -w=json
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":15,"raft_term":4}}
```
## Grant leases
Applications can grant leases for keys from an etcd cluster. When a key is attached to a lease, its lifetime is bound to the lease's lifetime which in turn is governed by a time-to-live (TTL). Each lease has a minimum time-to-live (TTL) value specified by the application at grant time. The lease's actual TTL value is at least the minimum TTL and is chosen by the etcd cluster. Once a lease's TTL elapses, the lease expires and all attached keys are deleted.
Here is the command to grant a lease:
```bash
# grant a lease with 10 second TTL
$ etcdctl lease grant 10
lease 32695410dcc0ca06 granted with TTL(10s)
# attach key foo to lease 32695410dcc0ca06
$ etcdctl put --lease=32695410dcc0ca06 foo bar
OK
```
## Revoke leases
Applications revoke leases by lease ID. Revoking a lease deletes all of its attached keys.
Suppose we finished the following sequence of operations:
```bash
$ etcdctl lease grant 10
lease 32695410dcc0ca06 granted with TTL(10s)
$ etcdctl put --lease=32695410dcc0ca06 foo bar
OK
```
Here is the command to revoke the same lease:
```bash
$ etcdctl lease revoke 32695410dcc0ca06
lease 32695410dcc0ca06 revoked
$ etcdctl get foo
# empty response since foo is deleted due to lease revocation
```
## Keep leases alive
Applications can keep a lease alive by refreshing its TTL so it does not expire.
Suppose we finished the following sequence of operations:
```bash
$ etcdctl lease grant 10
lease 32695410dcc0ca06 granted with TTL(10s)
```
Here is the command to keep the same lease alive:
```bash
$ etcdctl lease keep-alive 32695410dcc0ca06
lease 32695410dcc0ca06 keepalived with TTL(10)
lease 32695410dcc0ca06 keepalived with TTL(10)
lease 32695410dcc0ca06 keepalived with TTL(10)
...
```
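The relationship between granting, keeping alive, and expiry can be sketched with a toy lease that tracks a deadline. This is an illustration only: the `lease` type and its methods are hypothetical, time is modeled as plain seconds, and the real TTL is chosen by the etcd cluster rather than computed locally.

```go
package main

import "fmt"

// lease is a toy model: attached keys live until the deadline passes.
// Times are plain seconds since an arbitrary start.
type lease struct {
	ttl      int64
	deadline int64
	keys     map[string]string
}

// grant creates a lease that expires ttl seconds after now.
func grant(now, ttl int64) *lease {
	return &lease{ttl: ttl, deadline: now + ttl, keys: map[string]string{}}
}

// keepAlive refreshes the deadline, as a keep-alive heartbeat would.
func (l *lease) keepAlive(now int64) {
	l.deadline = now + l.ttl
}

// expire deletes all attached keys once the deadline has passed and
// reports whether the lease has expired.
func (l *lease) expire(now int64) bool {
	if now < l.deadline {
		return false
	}
	l.keys = map[string]string{}
	return true
}

func main() {
	l := grant(0, 10)
	l.keys["foo"] = "bar"
	l.keepAlive(9) // deadline moves from 10 to 19
	fmt.Println("t=15 expired:", l.expire(15), "keys:", len(l.keys))
	fmt.Println("t=20 expired:", l.expire(20), "keys:", len(l.keys))
}
```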
## Get lease information
Applications may want to query lease information so that they can renew a lease or check whether it still exists or has expired. Applications may also want to know which keys are attached to a particular lease.
Suppose we finished the following sequence of operations:
```bash
# grant a lease with 500 second TTL
$ etcdctl lease grant 500
lease 694d5765fc71500b granted with TTL(500s)
# attach key zoo1 to lease 694d5765fc71500b
$ etcdctl put zoo1 val1 --lease=694d5765fc71500b
OK
# attach key zoo2 to lease 694d5765fc71500b
$ etcdctl put zoo2 val2 --lease=694d5765fc71500b
OK
```
Here is the command to get information about the lease:
```bash
$ etcdctl lease timetolive 694d5765fc71500b
lease 694d5765fc71500b granted with TTL(500s), remaining(258s)
```
Here is the command to get information about the lease along with the keys attached with the lease:
```bash
$ etcdctl lease timetolive --keys 694d5765fc71500b
lease 694d5765fc71500b granted with TTL(500s), remaining(132s), attached keys([zoo2 zoo1])
# if the lease has expired or does not exist it will give the below response:
Error: etcdserver: requested lease not found
```


@ -0,0 +1,10 @@
# System limits
## Request size limit
etcd is designed to handle small key value pairs typical for metadata. Larger requests will work, but may increase the latency of other requests. For the time being, etcd guarantees to support RPC requests with up to 1MB of data. In the future, the size limit may be loosened or made configurable.
## Storage size limit
The default storage size limit is 2GB, configurable with the `--quota-backend-bytes` flag. Sizes up to 8GB are supported.


@ -0,0 +1,90 @@
# Setup a local cluster
For testing and development deployments, the quickest and easiest way is to set up a local cluster. For a production deployment, refer to the [clustering][clustering] section.
## Local standalone cluster
Deploying an etcd cluster as a standalone cluster is straightforward. Start it with just one command:
```
$ ./etcd
...
```
The started etcd member listens on `localhost:2379` for client requests.
To interact with the started cluster by using etcdctl:
```
# use API version 3
$ export ETCDCTL_API=3
$ ./etcdctl put foo bar
OK
$ ./etcdctl get foo
bar
```
## Local multi-member cluster
A `Procfile` at the base of this git repo is provided to easily set up a local multi-member cluster. To start a multi-member cluster go to the root of an etcd source tree and run:
```
# install goreman program to control Procfile-based applications.
$ go get github.com/mattn/goreman
$ goreman -f Procfile start
...
```
The started members listen on `localhost:2379`, `localhost:22379`, and `localhost:32379` for client requests respectively.
To interact with the started cluster by using etcdctl:
```
# use API version 3
$ export ETCDCTL_API=3
$ etcdctl --write-out=table --endpoints=localhost:2379 member list
+------------------+---------+--------+------------------------+------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+--------+------------------------+------------------------+
| 8211f1d0f64f3269 | started | infra1 | http://127.0.0.1:2380 | http://127.0.0.1:2379 |
| 91bc3c398fb3c146 | started | infra2 | http://127.0.0.1:22380 | http://127.0.0.1:22379 |
| fd422379fda50e48 | started | infra3 | http://127.0.0.1:32380 | http://127.0.0.1:32379 |
+------------------+---------+--------+------------------------+------------------------+
$ etcdctl put foo bar
OK
```
To exercise etcd's fault tolerance, kill a member:
```
# kill etcd2
$ goreman run stop etcd2
$ etcdctl put key hello
OK
$ etcdctl get key
hello
# try to get key from the killed member
$ etcdctl --endpoints=localhost:22379 get key
2016/04/18 23:07:35 grpc: Conn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 127.0.0.1:22379: getsockopt: connection refused"; Reconnecting to "localhost:22379"
Error: grpc: timed out trying to connect
# restart the killed member
$ goreman run restart etcd2
# get the key from restarted member
$ etcdctl --endpoints=localhost:22379 get key
hello
```
To learn more about interacting with etcd, read [interacting with etcd section][interacting].
[interacting]: ./interacting_v3.md
[clustering]: ../op-guide/clustering.md


@ -0,0 +1,113 @@
# Discovery service protocol
The discovery service protocol helps a new etcd member discover all other members during the cluster bootstrap phase by using a shared discovery URL.
The discovery service protocol is _only_ used in the cluster bootstrap phase, and cannot be used for runtime reconfiguration or cluster monitoring.
The protocol uses a new discovery token to bootstrap one _unique_ etcd cluster. Remember that one discovery token can represent only one etcd cluster. Once the discovery protocol has started on a token, even if it fails halfway, that token must not be used to bootstrap another etcd cluster.
The rest of this article will walk through the discovery process with examples that correspond to a self-hosted discovery cluster. The public discovery service, discovery.etcd.io, functions the same way, but with a layer of polish to abstract away ugly URLs, generate UUIDs automatically, and provide some protections against excessive requests. At its core, the public discovery service still uses an etcd cluster as the data store as described in this document.
## Protocol workflow
The idea of the discovery protocol is to use an internal etcd cluster to coordinate the bootstrap of a new cluster. First, all new members interact with the discovery service and help generate the expected member list. Then each new member bootstraps its server using this list, which serves the same function as the `-initial-cluster` flag.
In the following example workflow, we will list each step of the protocol in curl format for ease of understanding.
By convention the etcd discovery protocol uses the key prefix `_etcd/registry`. If `http://example.com` hosts an etcd cluster for discovery service, a full URL to discovery keyspace will be `http://example.com/v2/keys/_etcd/registry`. We will use this as the URL prefix in the example.
### Creating a new discovery token
Generate a unique token that will identify the new cluster. This will be used as a unique prefix in discovery keyspace in the following steps. An easy way to do this is to use `uuidgen`:
```
UUID=$(uuidgen)
```
### Specifying the expected cluster size
The discovery token expects a cluster size that must be specified. The size is used by the discovery service to know when it has found all members that will initially form the cluster.
```
curl -X PUT http://example.com/v2/keys/_etcd/registry/${UUID}/_config/size -d value=${cluster_size}
```
Usually the cluster size is 3, 5 or 7. Check [optimal cluster size][cluster-size] for more details.
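The preference for odd sizes follows from majority quorum arithmetic, which the sketch below spells out. It assumes the standard Raft rule that a strict majority of members must be available; the function names are hypothetical.

```go
package main

import "fmt"

// quorum is the number of members that must agree: a strict majority.
func quorum(n int) int { return n/2 + 1 }

// faultTolerance is how many members can fail while a quorum remains.
func faultTolerance(n int) int { return n - quorum(n) }

func main() {
	for _, n := range []int{3, 4, 5, 7} {
		fmt.Printf("size=%d quorum=%d tolerates=%d failures\n",
			n, quorum(n), faultTolerance(n))
	}
}
```

Under this rule a four-member cluster tolerates no more failures than a three-member one, so the extra member adds cost without adding fault tolerance.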
### Bringing up etcd processes
Given the discovery URL, pass it with the `-discovery` flag and bring up the etcd processes. Every etcd process follows the next few steps internally when given a `-discovery` flag.
### Registering itself
The first thing an etcd process does is register itself as a member at the discovery URL. This is done by creating its member ID as a key under the discovery URL.
```
curl -X PUT "http://example.com/v2/keys/_etcd/registry/${UUID}/${member_id}?prevExist=false" -d value="${member_name}=${member_peer_url_1}&${member_name}=${member_peer_url_2}"
```
### Checking the status
The member then checks the expected cluster size and the registration status at the discovery URL, and decides what to do next.
```
curl -X GET http://example.com/v2/keys/_etcd/registry/${UUID}/_config/size
curl -X GET http://example.com/v2/keys/_etcd/registry/${UUID}
```
If there are still not enough registered members, it waits for the remaining members to appear.
If the number of registered members is greater than the expected size N, it treats the first N registered members as the member list for the cluster. If the member itself is in the member list, the discovery procedure succeeds and it fetches all peers through the member list. If it is not in the member list, the discovery procedure fails because the cluster is already full.
In the etcd implementation, the member may check the cluster status even before registering itself, so it can fail quickly if the cluster is already full.
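The status check described above reduces to a small decision function. The following is a sketch of that logic only, not the etcd implementation: the `action` type, its constants, and `decide` are hypothetical names, and `registered` is assumed to be ordered by registration time with the oldest member first.

```go
package main

import "fmt"

type action int

const (
	waitForMore action = iota // fewer members registered than expected
	bootstrap                 // this member is among the first N
	clusterFull               // N other members registered first
)

// decide applies the check described above. registered must be in
// registration order (oldest first); size is the expected cluster size N.
func decide(size int, registered []string, self string) (action, []string) {
	if len(registered) < size {
		return waitForMore, nil
	}
	members := registered[:size]
	for _, m := range members {
		if m == self {
			return bootstrap, members
		}
	}
	return clusterFull, nil
}

func main() {
	reg := []string{"m1", "m2", "m3", "m4"}
	a, members := decide(3, reg, "m2")
	fmt.Println(a == bootstrap, members)
	a, _ = decide(3, reg, "m4")
	fmt.Println(a == clusterFull)
}
```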
### Waiting for all members
The wait process is described in detail in the [etcd API documentation][api].
```
curl -X GET "http://example.com/v2/keys/_etcd/registry/${UUID}?wait=true&waitIndex=${current_etcd_index}"
```
The member keeps waiting until all members have been found.
## Public discovery service
CoreOS Inc. hosts a public discovery service at https://discovery.etcd.io/ , which provides some nice features for ease of use.
### Mask key prefix
The public discovery service redirects `https://discovery.etcd.io/${UUID}` to the key under `/v2/keys/_etcd/registry` in the etcd cluster behind it. This masks the registry key prefix, giving a short and readable discovery URL.
### Get new token
```
GET /new
Sent query:
size=${cluster_size}
Possible status codes:
200 OK
400 Bad Request
200 Body:
generated discovery url
```
The generation process in the service follows the steps from [Creating a New Discovery Token][new-discovery-token] to [Specifying the Expected Cluster Size][expected-cluster-size].
### Check discovery status
```
GET /${UUID}
```
The status for this discovery token, including the machines that have been registered, can be checked by requesting the value of the UUID.
### Open-source repository
The repository is located at https://github.com/coreos/discovery.etcd.io. It could be used to build a custom discovery service.
[api]: ../v2/api.md#waiting-for-a-change
[cluster-size]: ../v2/admin_guide.md#optimal-cluster-size
[expected-cluster-size]: #specifying-the-expected-cluster-size
[new-discovery-token]: #creating-a-new-discovery-token


@ -0,0 +1,29 @@
# Logging conventions
etcd uses the [capnslog][capnslog] library for logging application output categorized into *levels*. A log message's level is determined according to these conventions:
* Error: Data has been lost, a request has failed for a bad reason, or a required resource has been lost
  * Examples:
    * A failure to allocate disk space for WAL
* Warning: (Hopefully) Temporary conditions that may cause errors, but may work fine. A replica disappearing (that may reconnect) is a warning.
  * Examples:
    * Failure to send raft message to a remote peer
    * Failure to receive heartbeat message within the configured election timeout
* Notice: Normal, but important (uncommon) log information.
  * Examples:
    * Add a new node into the cluster
    * Add a new user into auth subsystem
* Info: Normal, working log information, everything is fine, but helpful notices for auditing or common operations.
  * Examples:
    * Startup configuration
    * Start to do snapshot
* Debug: Everything is still fine, but even common operations may be logged, and less helpful but more quantity of notices.
  * Examples:
    * Send a normal message to a remote peer
    * Write a log entry to disk
[capnslog]: https://github.com/coreos/pkg/tree/master/capnslog


@ -0,0 +1,118 @@
# etcd release guide
This guide describes how to release a new version of etcd.
The procedure includes some manual steps for sanity checking, but it can probably be further scripted. Please keep this document up-to-date if making changes to the release process.
## Prepare release
Set the desired version as an environment variable for the following steps. Here is an example for releasing 2.3.0:
```
export VERSION=v2.3.0
export PREV_VERSION=v2.2.5
```
All release version numbers follow the format of [semantic versioning 2.0.0](http://semver.org/).
### Major, minor version release, or its pre-release
- Ensure the relevant milestone on GitHub is complete. All referenced issues should be closed, or moved elsewhere.
- Remove this release from [roadmap](https://github.com/coreos/etcd/blob/master/ROADMAP.md), if necessary.
- Ensure the latest upgrade documentation is available.
- Bump [hardcoded MinClusterVersion in the repository](https://github.com/coreos/etcd/blob/master/version/version.go#L29), if necessary.
- Add feature capability maps for the new version, if necessary.
### Patch version release
- Discuss which commits should be backported to the patch release. The commits should not include merge commits.
- Cherry-pick these commits, starting from the oldest one, into the stable branch.
## Write release note
- Write an introduction for the new release: for example, which major bugs are fixed, which new features are introduced, or which performance improvements are made.
- Put `[GH XXXX]` at the head of each change line to reference the pull request that introduces the change, and link it to that pull request.
- Find PRs with the `release-note` label and explain them in the `NEWS` file, as a straightforward summary of changes for end-users.
## Tag version
- Bump [hardcoded Version in the repository](https://github.com/coreos/etcd/blob/master/version/version.go#L30) to the latest version `${VERSION}`.
- Ensure all tests on the CI system pass.
- Manually check that etcd builds on Linux, Darwin and Windows.
- Manually check that upgrading an etcd cluster from the previous minor version works well.
- Manually check that new features work well.
- Add a signed tag through `git tag -s ${VERSION}`.
- Sanity check tag correctness through `git show tags/$VERSION`.
- Push the tag to GitHub through `git push origin tags/$VERSION`. This assumes `origin` corresponds to "https://github.com/coreos/etcd".
## Build release binaries and images
- Ensure `acbuild` is available.
- Ensure `docker` is available.
Run release script in root directory:
```
./scripts/release.sh ${VERSION}
```
It generates all release binaries and images under directory ./release.
## Sign binaries, images, and source code
The etcd project key must be used to sign the generated binaries and images. `$SUBKEYID` is the key ID of the etcd project Yubikey. Connect the key and run `gpg2 --card-status` to get the ID.
The following commands are used for public release sign:
```
cd release
for i in etcd-*{.zip,.tar.gz}; do gpg2 --default-key $SUBKEYID --armor --output ${i}.asc --detach-sign ${i}; done
for i in etcd-*{.zip,.tar.gz}; do gpg2 --verify ${i}.asc ${i}; done
# sign zipped source code files
wget https://github.com/coreos/etcd/archive/${VERSION}.zip
gpg2 --armor --default-key $SUBKEYID --output ${VERSION}.zip.asc --detach-sign ${VERSION}.zip
gpg2 --verify ${VERSION}.zip.asc ${VERSION}.zip
wget https://github.com/coreos/etcd/archive/${VERSION}.tar.gz
gpg2 --armor --default-key $SUBKEYID --output ${VERSION}.tar.gz.asc --detach-sign ${VERSION}.tar.gz
gpg2 --verify ${VERSION}.tar.gz.asc ${VERSION}.tar.gz
```
The public key for GPG signing can be found at [CoreOS Application Signing Key](https://coreos.com/security/app-signing-key)
## Publish release page in GitHub
- Set release title as the version name.
- Follow the format of previous release pages.
- Attach the generated binaries, aci image and signatures.
- Select whether it is a pre-release.
- Publish the release!
## Publish docker image in Quay.io
- Push docker image:
```
docker login quay.io
docker push quay.io/coreos/etcd:${VERSION}
```
- Add `latest` tag to the new image on [quay.io](https://quay.io/repository/coreos/etcd?tag=latest&tab=tags) if this is a stable release.
## Announce to the etcd-dev Googlegroup
- Follow the format of [previous release emails](https://groups.google.com/forum/#!forum/etcd-dev).
- Make sure to include a list of authors that contributed since the previous release - something like the following might be handy:
```
git log ...${PREV_VERSION} --pretty=format:"%an" | sort | uniq | tr '\n' ',' | sed -e 's#,#, #g' -e 's#, $##'
```
- Send email to etcd-dev@googlegroups.com
## Post release
- Create new stable branch through `git push origin ${VERSION_MAJOR}.${VERSION_MINOR}` if this is a major stable release. This assumes `origin` corresponds to "https://github.com/coreos/etcd".
- Bump [hardcoded Version in the repository](https://github.com/coreos/etcd/blob/master/version/version.go#L30) to the version `${VERSION}+git`.

Documentation/dl_build.md Normal file

@ -0,0 +1,67 @@
# Download and build
## System requirements
The etcd performance benchmarks run etcd on 8 vCPU, 16GB RAM, 50GB SSD GCE instances, but any relatively modern machine with low latency storage and a few gigabytes of memory should suffice for most use cases. Applications with large v2 data stores will require more memory than those with large v3 data stores, since v2 data is kept in anonymous memory instead of memory mapped from a file. For running etcd on a cloud provider, we suggest at least a medium instance on AWS or a standard-1 instance on GCE.
## Download the pre-built binary
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, appc, and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
## Build the latest version
For those wanting to try the very latest version, build etcd from the `master` branch. [Go](https://golang.org/) version 1.8+ is required to build the latest version of etcd. To ensure etcd is built against well-tested libraries, etcd vendors its dependencies for official release binaries. However, etcd's vendoring is also optional to avoid potential import conflicts when embedding the etcd server or using the etcd client.
To build `etcd` from the `master` branch without a `GOPATH` using the official `build` script:
```sh
$ git clone https://github.com/coreos/etcd.git
$ cd etcd
$ ./build
$ ./bin/etcd
```
To build a vendored `etcd` from the `master` branch via `go get`:
```sh
# GOPATH should be set
$ echo $GOPATH
/Users/example/go
$ go get github.com/coreos/etcd/cmd/etcd
$ $GOPATH/bin/etcd
```
To build `etcd` from the `master` branch without vendoring (may not build due to upstream conflicts):
```sh
# GOPATH should be set
$ echo $GOPATH
/Users/example/go
$ go get github.com/coreos/etcd
$ $GOPATH/bin/etcd
```
## Test the installation
Check the etcd binary is built correctly by starting etcd and setting a key.
Start etcd:
```
$ ./bin/etcd
```
Set a key:
```
$ ETCDCTL_API=3 ./bin/etcdctl put foo bar
OK
```
If OK is printed, then etcd is working!
[github-release]: https://github.com/coreos/etcd/releases/
[go]: https://golang.org/doc/install
[build-script]: ../build
[cmd-directory]: ../cmd

Documentation/docs.md Normal file

@ -0,0 +1,113 @@
# Documentation
etcd is a distributed key-value store designed to reliably and quickly preserve and provide access to critical data. It enables reliable distributed coordination through distributed locking, leader elections, and write barriers. An etcd cluster is intended for high availability and permanent data storage and retrieval.
## Getting started
New etcd users and developers should get started by [downloading and building][download_build] etcd. After getting etcd, follow this [quick demo][demo] to see the basics of creating and working with an etcd cluster.
## Developing with etcd
The easiest way to get started using etcd as a distributed key-value store is to [set up a local cluster][local_cluster].
- [Setting up local clusters][local_cluster]
- [Interacting with etcd][interacting]
- gRPC [etcd core][api_ref] and [etcd concurrency][api_concurrency_ref] API references
- [HTTP JSON API through the gRPC gateway][api_grpc_gateway]
- [gRPC naming and discovery][grpc_naming]
- [Client][namespace_client] and [proxy][namespace_proxy] namespacing
- [Embedding etcd][embed_etcd]
- [Experimental features and APIs][experimental]
- [System limits][system-limit]
## Operating etcd clusters
Administrators who need to create reliable and scalable key-value stores for the developers they support should begin with a [cluster on multiple machines][clustering].
- [Setting up etcd clusters][clustering]
- [Setting up etcd gateways][gateway]
- [Setting up etcd gRPC proxy][grpc_proxy]
- [Hardware recommendations][hardware]
- [Configuration][conf]
- [Security][security]
- [Authentication][authentication]
- [Monitoring][monitoring]
- [Maintenance][maintenance]
- [Understand failures][failures]
- [Disaster recovery][recovery]
- [Performance][performance]
- [Versioning][versioning]
### Platform guides
- [Supported systems][supported_platforms]
- [Docker container][container_docker]
- [Container Linux, systemd][container_linux_platform]
- [rkt container][container_rkt]
- [Amazon Web Services][aws_platform]
- [FreeBSD][freebsd_platform]
### Upgrading and compatibility
- [Migrate applications from using API v2 to API v3][v2_migration]
- [Upgrading a v2.3 cluster to v3.0][v3_upgrade]
- [Upgrading a v3.0 cluster to v3.1][v31_upgrade]
- [Upgrading a v3.1 cluster to v3.2][v32_upgrade]
## Learning
To learn more about the concepts and internals behind etcd, read the following pages:
- [Why etcd?][why]
- [Understand data model][data_model]
- [Understand APIs][understand_apis]
- [Glossary][glossary]
- Internals
- [Auth subsystem][auth_design]
## Frequently Asked Questions (FAQ)
Answers to [common questions] about etcd.
[api_ref]: dev-guide/api_reference_v3.md
[api_concurrency_ref]: dev-guide/api_concurrency_reference_v3.md
[api_grpc_gateway]: dev-guide/api_grpc_gateway.md
[clustering]: op-guide/clustering.md
[conf]: op-guide/configuration.md
[system-limit]: dev-guide/limit.md
[common questions]: faq.md
[why]: learning/why.md
[data_model]: learning/data_model.md
[demo]: demo.md
[download_build]: dl_build.md
[embed_etcd]: https://godoc.org/github.com/coreos/etcd/embed
[grpc_naming]: dev-guide/grpc_naming.md
[failures]: op-guide/failures.md
[gateway]: op-guide/gateway.md
[glossary]: learning/glossary.md
[namespace_client]: https://godoc.org/github.com/coreos/etcd/clientv3/namespace
[namespace_proxy]: op-guide/grpc_proxy.md#namespacing
[grpc_proxy]: op-guide/grpc_proxy.md
[hardware]: op-guide/hardware.md
[interacting]: dev-guide/interacting_v3.md
[local_cluster]: dev-guide/local_cluster.md
[performance]: op-guide/performance.md
[recovery]: op-guide/recovery.md
[maintenance]: op-guide/maintenance.md
[security]: op-guide/security.md
[monitoring]: op-guide/monitoring.md
[v2_migration]: op-guide/v2-migration.md
[container_rkt]: op-guide/container.md#rkt
[container_docker]: op-guide/container.md#docker
[understand_apis]: learning/api.md
[versioning]: op-guide/versioning.md
[supported_platforms]: op-guide/supported-platform.md
[container_linux_platform]: platforms/container-linux-systemd.md
[freebsd_platform]: platforms/freebsd.md
[aws_platform]: platforms/aws.md
[experimental]: dev-guide/experimental_apis.md
[v3_upgrade]: upgrades/upgrade_3_0.md
[v31_upgrade]: upgrades/upgrade_3_1.md
[v32_upgrade]: upgrades/upgrade_3_2.md
[authentication]: op-guide/authentication.md
[auth_design]: learning/auth_design.md


@ -1,83 +1,151 @@
# FAQ
## 1) How come I can read an old version of the data when a majority of the members are down?
## Frequently Asked Questions (FAQ)
In situations where a client connects to a minority, etcd
favors by default availability over consistency. This means that even though
data might be “out of date”, it is still better to return something versus
nothing.
### etcd, general
In order to confirm that a read is up to date with a majority of the cluster,
the client can use the `quorum=true` parameter on reads of keys. This means
that a majority of the cluster is checked on reads before returning the data,
otherwise the read will timeout and fail.
#### Do clients have to send requests to the etcd leader?
## 2) With quorum=false, doesn't this mean that if my client switched the member it was connected to, it could experience a logical ordering where the cluster goes backwards in time?
[Raft][raft] is leader-based; the leader handles all client requests which need cluster consensus. However, the client does not need to know which node is the leader. Any request that requires consensus sent to a follower is automatically forwarded to the leader. Requests that do not require consensus (e.g., serialized reads) can be processed by any cluster member.
Yes, but this could be handled at the etcd client implementation via
remembering the last seen index. The “index” is the cluster's single
irrevocable sequence of the entire modification history. The client could
remember the last seen index, and determine via comparing the index returned on
the GET whether or not the state of the key-value pair is before or after its
last seen state.
### Configuration
## 3) What happens if a watch is registered on a minority member?
#### What is the difference between advertise-urls and listen-urls?
The watch will stay untriggered, even as modifications are occurring in the
majority quorum. This is an open issue, and is being addressed in v3. There are
multiple ways to work around the watch trigger not firing.
`listen-urls` specifies the local addresses etcd server binds to for accepting incoming connections. To listen on a port for all interfaces, specify `0.0.0.0` as the listen IP address.
1) build a signaling mechanism independent of etcd. This could be as simple as
a “pulse” to the client to reissue a GET with quorum=true for the most recent
version of the data.
2) poll on the `/v2/keys` endpoint and check that the raft-index is increasing every
timeout.
`advertise-urls` specifies the addresses etcd clients or other etcd members should use to contact the etcd server. The advertise addresses must be reachable from the remote machines. Do not advertise addresses like `localhost` or `0.0.0.0` for a production setup since these addresses are unreachable from remote machines.
## 4) What is a proxy used for?
### Deployment
A proxy is a redirection server to the etcd cluster. The proxy handles the
redirection of a client to the current configuration of the etcd cluster. A
typical use case is to start a proxy on a machine, and on first boot up of the
proxy specify both the `--proxy` flag and the `--initial-cluster` flag.
#### System requirements
From there, any etcdctl client that starts up automatically speaks to the local
proxy and the proxy redirects operations to the current configuration of the
cluster it was originally paired with.
Since etcd writes data to disk, SSD is highly recommended. To prevent performance degradation or unintentionally overloading the key-value store, etcd enforces a 2GB default storage size quota, configurable up to 8GB. To avoid swapping or running out of memory, the machine should have at least enough RAM to cover the quota. At CoreOS, an etcd cluster is usually deployed on dedicated CoreOS Container Linux machines with dual-core processors, 2GB of RAM, and 80GB of SSD *at the very least*. **Note that performance is intrinsically workload dependent; please test before production deployment**. See [hardware][hardware-setup] for more recommendations.
In the v2 spec of etcd, proxies cannot be promoted to members of the cluster.
They also cannot be promoted to followers or at any point become part of the
replication of the etcd cluster itself.
The most stable production environment is the Linux operating system on the amd64 architecture; see [supported platform][supported-platform] for more.
## 5) How is cluster membership and health handled in etcd v2?
#### Why an odd number of cluster members?
The design goal of etcd is that reconfiguration is simply an API, and health
monitoring and addition/removal of members is up to the individual application
and their integration with the reconfiguration API.
An etcd cluster needs a majority of nodes, a quorum, to agree on updates to the cluster state. For a cluster with n members, quorum is (n/2)+1, with n/2 rounded down. For any odd-sized cluster, adding one node will always increase the number of nodes necessary for quorum. Although adding a node to an odd-sized cluster appears better since there are more machines, the fault tolerance is worse: exactly the same number of nodes may fail without losing quorum, but there are more nodes that can fail. If the cluster is in a state where it can't tolerate any more failures, adding a node before removing nodes is dangerous because if the new node fails to register with the cluster (e.g., the address is misconfigured), quorum will be permanently lost.
Thus, a member that is down, even infinitely, will never be automatically
removed from the etcd cluster member list.
#### What is the maximum cluster size?
This makes sense because it's usually an application level / administrative
action to determine whether a reconfiguration should happen based on health.
Theoretically, there is no hard limit. However, an etcd cluster probably should have no more than seven nodes. [Google Chubby lock service][chubby], similar to etcd and widely deployed within Google for many years, suggests running five nodes. A 5-member etcd cluster can tolerate two member failures, which is enough in most cases. Although larger clusters provide better fault tolerance, the write performance suffers because data must be replicated across more machines.
For more information, refer to the [runtime reconfiguration design document][runtime-reconf-design].
#### What is failure tolerance?
## 6) how does --endpoint work with etcdctl?
An etcd cluster operates so long as a member quorum can be established. If quorum is lost through transient network failures (e.g., partitions), etcd automatically and safely resumes once the network recovers and restores quorum; Raft enforces cluster consistency. For power loss, etcd persists the Raft log to disk; etcd replays the log to the point of failure and resumes cluster participation. For permanent hardware failure, the node may be removed from the cluster through [runtime reconfiguration][runtime reconfiguration].
The `--endpoint` flag can specify any number of etcd cluster members in a comma
separated list. This list might be a subset, equal to, or more than the actual
etcd cluster member list itself.
It is recommended to have an odd number of members in a cluster. An odd-size cluster tolerates the same number of failures as an even-size cluster but with fewer nodes. The difference can be seen by comparing even and odd sized clusters:
If only one peer is specified via the `--endpoint` flag, the etcdctl discovers the
rest of the cluster via the member list of that one peer, and then it randomly
chooses a member to use. Again, the client can use the `quorum=true` flag on
reads, which will always fail when using a member in the minority.
| Cluster Size | Majority | Failure Tolerance |
|:-:|:-:|:-:|
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | 2 |
| 6 | 4 | 2 |
| 7 | 4 | 3 |
| 8 | 5 | 3 |
| 9 | 5 | 4 |
If peers from multiple clusters are specified via the `--endpoint` flag, etcdctl
will randomly choose a peer, and the request will simply get routed to one of
the clusters. This is probably not what you want.
Adding a member to bring the size of cluster up to an even number doesn't buy additional fault tolerance. Likewise, during a network partition, an odd number of members guarantees that there will always be a majority partition that can continue to operate and be the source of truth when the partition ends.
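The table above follows directly from the quorum formula. As a minimal sketch (the function names `majority` and `failure_tolerance` are illustrative and not part of etcd), the same values can be computed like this:

```python
def majority(cluster_size: int) -> int:
    """Smallest number of members that forms a quorum: (n/2)+1, rounded down."""
    if cluster_size < 1:
        raise ValueError("cluster size must be at least 1")
    return cluster_size // 2 + 1


def failure_tolerance(cluster_size: int) -> int:
    """Number of members that may fail while a quorum can still be formed."""
    return cluster_size - majority(cluster_size)


if __name__ == "__main__":
    # Reproduce the cluster size table for sizes 1 through 9.
    print("| Cluster Size | Majority | Failure Tolerance |")
    print("|:-:|:-:|:-:|")
    for n in range(1, 10):
        print(f"| {n} | {majority(n)} | {failure_tolerance(n)} |")
```

Going from 3 to 4 members raises the majority from 2 to 3 while the failure tolerance stays at 1, which is why an even-sized cluster buys no extra fault tolerance.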
Note: --peers flag is now deprecated and --endpoint should be used instead,
as it might confuse users to give etcdctl a peerURL.
#### Does etcd work in cross-region or cross data center deployments?
[runtime-reconf-design]: runtime-reconf-design.md
Deploying etcd across regions improves etcd's fault tolerance since members are in separate failure domains. The cost is higher consensus request latency from crossing data center boundaries. Since etcd relies on a member quorum for consensus, the latency from crossing data centers will be somewhat pronounced because at least a majority of cluster members must respond to consensus requests. Additionally, cluster data must be replicated across all peers, so there will be bandwidth cost as well.
With longer latencies, the default etcd configuration may cause frequent elections or heartbeat timeouts. See [tuning] for adjusting timeouts for high latency deployments.
### Operation
#### How to back up an etcd cluster?
etcdctl provides a `snapshot` command to create backups. See [backup][backup] for more details.
#### Should I add a member before removing an unhealthy member?
When replacing an etcd node, it's important to remove the member first and then add its replacement.
etcd employs distributed consensus based on a quorum model; (n/2)+1 members (with n/2 rounded down), a majority, must agree on a proposal before it can be committed to the cluster. These proposals include key-value updates and membership changes. This model totally avoids any possibility of split brain inconsistency. The downside is that permanent quorum loss is catastrophic.
How this applies to membership: If a 3-member cluster has 1 downed member, it can still make forward progress because the quorum is 2 and 2 members are still live. However, adding a new member to a 3-member cluster will increase the quorum to 3 because 3 votes are required for a majority of 4 members. Since the quorum increased, this extra member buys nothing in terms of fault tolerance; the cluster is still one node failure away from being unrecoverable.
Additionally, that new member is risky because it may turn out to be misconfigured or incapable of joining the cluster. In that case, there's no way to recover quorum because the cluster has two members down and two members up, but needs three votes to change membership to undo the botched membership addition. etcd will by default reject member add attempts that could take down the cluster in this manner.
On the other hand, if the downed member is removed from cluster membership first, the number of members becomes 2 and the quorum remains at 2. Following that removal by adding a new member will also keep the quorum steady at 2. So, even if the new node can't be brought up, it's still possible to remove the new member through quorum on the remaining live members.
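The two orderings can be compared with a small model. This is only a sketch of the arithmetic described above, not etcd code: `has_quorum` and `survives_failed_replacement` are hypothetical names, and the model deliberately ignores the default check that rejects risky member additions.

```python
def has_quorum(members: int, live: int) -> bool:
    """True if the live members form a majority of the registered members."""
    return live >= members // 2 + 1


def survives_failed_replacement(members: int, live: int, remove_first: bool) -> bool:
    """Model replacing a downed member when the replacement never comes up.

    Returns True if the cluster still has quorum afterwards, so the botched
    addition could still be undone through a membership change.
    """
    if remove_first:
        members -= 1  # the downed member leaves the membership list
        if not has_quorum(members, live):
            return False
    members += 1  # the new member is registered but never starts
    return has_quorum(members, live)


if __name__ == "__main__":
    # 3-member cluster with 1 member down, as in the text above.
    print("remove first:", survives_failed_replacement(3, 2, remove_first=True))
    print("add first:   ", survives_failed_replacement(3, 2, remove_first=False))
```

With removal first, membership goes 3, 2, 3 and two live members always satisfy the quorum of 2. With addition first, membership becomes 4, the quorum rises to 3, and two live members can no longer commit anything.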
#### Why won't etcd accept my membership changes?
etcd sets `strict-reconfig-check` in order to reject reconfiguration requests that would cause quorum loss. Abandoning quorum is really risky (especially when the cluster is already unhealthy). Although it may be tempting to disable quorum checking if there's quorum loss to add a new member, this could lead to full-fledged cluster inconsistency. For many applications, this will make the problem even worse ("disk geometry corruption" being a candidate for most terrifying).
#### Why does etcd lose its leader from disk latency spikes?
This is intentional; disk latency is part of leader liveness. Suppose the cluster leader takes a minute to fsync a raft log update to disk, but the etcd cluster has a one second election timeout. Even though the leader can process network messages within the election interval (e.g., send heartbeats), it's effectively unavailable because it can't commit any new proposals; it's waiting on the slow disk. If the cluster frequently loses its leader due to disk latencies, try [tuning][tuning] the disk settings or etcd time parameters.
#### What does the etcd warning "request ignored (cluster ID mismatch)" mean?
Every new etcd cluster generates a new cluster ID based on the initial cluster configuration and a user-provided unique `initial-cluster-token` value. By having unique cluster IDs, etcd is protected from cross-cluster interaction which could corrupt the cluster.
Usually this warning happens after tearing down an old cluster, then reusing some of the peer addresses for the new cluster. If any etcd process from the old cluster is still running it will try to contact the new cluster. The new cluster will recognize a cluster ID mismatch, then ignore the request and emit this warning. This warning is often cleared by ensuring peer addresses among distinct clusters are disjoint.
#### What does "mvcc: database space exceeded" mean and how do I fix it?
The [multi-version concurrency control][api-mvcc] data model in etcd keeps an exact history of the keyspace. Without periodically compacting this history (e.g., by setting `--auto-compaction-retention`), etcd will eventually exhaust its storage space. If etcd runs low on storage space, it raises a space quota alarm to protect the cluster from further writes. So long as the alarm is raised, etcd responds to write requests with the error `mvcc: database space exceeded`.
To recover from the low space quota alarm:
1. [Compact][maintenance-compact] etcd's history.
2. [Defragment][maintenance-defragment] every etcd endpoint.
3. [Disarm][maintenance-disarm] the alarm.
### Performance
#### How should I benchmark etcd?
Try the [benchmark] tool. Current [benchmark results][benchmark-result] are available for comparison.
#### What does the etcd warning "apply entries took too long" mean?
After a majority of etcd members agree to commit a request, each etcd server applies the request to its data store and persists the result to disk. Even with a slow mechanical disk or a virtualized network disk, such as Amazon's EBS or Google's PD, applying a request should normally take fewer than 50 milliseconds. If the average apply duration exceeds 100 milliseconds, etcd will warn that entries are taking too long to apply.
Usually this issue is caused by a slow disk. The disk could be experiencing contention among etcd and other applications, or the disk is simply too slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [backend_commit_duration_seconds][backend_commit_metrics] (p99 duration should be less than 25ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using a faster disk will typically solve the problem.
The second most common cause is CPU starvation. If monitoring of the machine's CPU usage shows heavy utilization, there may not be enough compute capacity for etcd. Moving etcd to a dedicated machine, increasing process resource isolation with cgroups, or renicing the etcd server process into a higher priority can usually solve the problem.
Expensive user requests which access too many keys (e.g., fetching the entire keyspace) can also cause long apply latencies. Accessing fewer than several hundred keys per request, however, should always be performant.
If none of the above suggestions clear the warnings, please [open an issue][new_issue] with detailed logging, monitoring, metrics and optionally workload information.
#### What does the etcd warning "failed to send out heartbeat on time" mean?
etcd uses a leader-based consensus protocol for consistent data replication and log execution. Cluster members elect a single leader; all other members become followers. The elected leader must periodically send heartbeats to its followers to maintain its leadership. Followers infer leader failure if no heartbeats are received within an election interval and trigger an election. If a leader doesn't send its heartbeats in time but is still running, the election is spurious and likely caused by insufficient resources. To catch these soft failures, if the leader skips two heartbeat intervals, etcd will warn it failed to send a heartbeat on time.
Usually this issue is caused by a slow disk. Before the leader sends heartbeats attached with metadata, it may need to persist the metadata to disk. The disk could be experiencing contention among etcd and other applications, or the disk is simply too slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [wal_fsync_duration_seconds][wal_fsync_duration_seconds] (p99 duration should be less than 10ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using a faster disk will typically solve the problem.
The second most common cause is CPU starvation. If monitoring of the machine's CPU usage shows heavy utilization, there may not be enough compute capacity for etcd. Moving etcd to a dedicated machine, increasing process resource isolation with cgroups, or renicing the etcd server process into a higher priority can usually solve the problem.
A slow network can also cause this issue. If network metrics among the etcd machines show long latencies or a high drop rate, there may not be enough network capacity for etcd. Moving etcd members to a less congested network will typically solve the problem. However, if the etcd cluster is deployed across data centers, long latency between members is expected. For such deployments, tune the `heartbeat-interval` configuration to roughly match the round trip time between the machines, and the `election-timeout` configuration to be at least 5 * `heartbeat-interval`. See [tuning documentation][tuning] for detailed information.
If none of the above suggestions clear the warnings, please [open an issue][new_issue] with detailed logging, monitoring, metrics and optionally workload information.
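For cross data center deployments, the tuning rule above (heartbeat interval close to the round trip time, election timeout at least five times the heartbeat interval) can be sketched as a small helper. The function name `suggest_timeouts` is hypothetical, and rounding up to whole milliseconds is an assumption of this sketch; treat the output as a starting point to validate against the tuning documentation, not as an official recommendation.

```python
import math


def suggest_timeouts(round_trip_ms: float, election_multiplier: int = 5) -> dict:
    """Derive candidate values for heartbeat-interval and election-timeout.

    round_trip_ms is the measured round trip time between members in
    milliseconds. The election timeout is at least 5x the heartbeat interval.
    """
    if round_trip_ms <= 0:
        raise ValueError("round trip time must be positive")
    if election_multiplier < 5:
        raise ValueError("election timeout should be at least 5x the heartbeat interval")
    heartbeat = math.ceil(round_trip_ms)  # roughly match the round trip time
    return {
        "heartbeat-interval": heartbeat,
        "election-timeout": heartbeat * election_multiplier,
    }


if __name__ == "__main__":
    # Example: members separated by a 120ms round trip.
    print(suggest_timeouts(120))
```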
#### What does the etcd warning "snapshotting is taking more than x seconds to finish ..." mean?
etcd sends a snapshot of its complete key-value store to refresh slow followers and for [backups][backup]. Slow snapshot transfer times increase MTTR; if the cluster is ingesting data with high throughput, slow followers may livelock by needing a new snapshot before finishing receiving a snapshot. To catch slow snapshot performance, etcd warns when sending a snapshot takes more than thirty seconds and exceeds the expected transfer time for a 1Gbps connection.
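The warning condition can be read as two checks that must both hold. The sketch below is one interpretation of that sentence, not etcd's actual implementation; the constant and function names are hypothetical.

```python
WARN_FLOOR_SECONDS = 30.0             # "more than thirty seconds"
LINK_BITS_PER_SECOND = 1_000_000_000  # reference 1Gbps connection


def expected_transfer_seconds(snapshot_bytes: int) -> float:
    """Time to move the snapshot over an otherwise idle 1Gbps link."""
    return snapshot_bytes * 8 / LINK_BITS_PER_SECOND


def should_warn(snapshot_bytes: int, elapsed_seconds: float) -> bool:
    """Warn only if the send is slow in absolute terms AND slower than 1Gbps."""
    return (
        elapsed_seconds > WARN_FLOOR_SECONDS
        and elapsed_seconds > expected_transfer_seconds(snapshot_bytes)
    )


if __name__ == "__main__":
    one_gb = 1_000_000_000
    print(expected_transfer_seconds(one_gb))  # 8 seconds at 1Gbps
    print(should_warn(one_gb, 31.0))          # slow for its size
    print(should_warn(8 * one_gb, 40.0))      # large, but within 1Gbps pace
```

Under this reading, a large snapshot that takes longer than thirty seconds does not trigger the warning as long as it keeps up with the 1Gbps reference rate.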
[hardware-setup]: ./op-guide/hardware.md
[supported-platform]: ./op-guide/supported-platform.md
[wal_fsync_duration_seconds]: ./metrics.md#disk
[tuning]: ./tuning.md
[new_issue]: https://github.com/coreos/etcd/issues/new
[backend_commit_metrics]: ./metrics.md#disk
[raft]: https://raft.github.io/raft.pdf
[backup]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/recovery.md#snapshotting-the-keyspace
[chubby]: http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf
[runtime reconfiguration]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/runtime-configuration.md
[benchmark]: https://github.com/coreos/etcd/tree/master/tools/benchmark
[benchmark-result]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/performance.md
[api-mvcc]: learning/api.md#revisions
[maintenance-compact]: op-guide/maintenance.md#history-compaction
[maintenance-defragment]: op-guide/maintenance.md#defragmentation
[maintenance-disarm]: ../etcdctl/README.md#alarm-disarm


@ -0,0 +1,158 @@
# Libraries and tools
**Tools**
- [etcdctl](https://github.com/coreos/etcd/tree/master/etcdctl) - A command line client for etcd
- [etcd-backup](https://github.com/fanhattan/etcd-backup) - A powerful command line utility for dumping/restoring etcd - Supports v2
- [etcd-dump](https://npmjs.org/package/etcd-dump) - Command line utility for dumping/restoring etcd.
- [etcd-fs](https://github.com/xetorthio/etcd-fs) - FUSE filesystem for etcd
- [etcddir](https://github.com/rekby/etcddir) - Real-time sync between etcd and a local directory. Works with Windows and Linux.
- [etcd-browser](https://github.com/henszey/etcd-browser) - A web-based key/value editor for etcd using AngularJS
- [etcd-lock](https://github.com/datawisesystems/etcd-lock) - Master election & distributed r/w lock implementation using etcd - Supports v2
- [etcd-console](https://github.com/matishsiao/etcd-console) - A web-based key/value editor for etcd using PHP
- [etcd-viewer](https://github.com/nikfoundas/etcd-viewer) - An etcd key-value store editor/viewer written in Java
- [etcdtool](https://github.com/mickep76/etcdtool) - Export/Import/Edit etcd directory as JSON/YAML/TOML and Validate directory using JSON schema
- [etcd-rest](https://github.com/mickep76/etcd-rest) - Create generic REST API in Go using etcd as a backend with validation using JSON schema
- [etcdsh](https://github.com/kamilhark/etcdsh) - A command line client with support of command history and tab completion. Supports v2
- [etcdloadtest](https://github.com/sinsharat/etcdloadtest) - A command line load test client for etcd version 3.0 and above.
**Go libraries**
- [etcd/clientv3](https://github.com/coreos/etcd/blob/master/clientv3) - the officially maintained Go client for v3
- [etcd/client](https://github.com/coreos/etcd/blob/master/client) - the officially maintained Go client for v2
- [go-etcd](https://github.com/coreos/go-etcd) - the deprecated official client. May be useful for older (<2.0.0) versions of etcd.
- [encWrapper](https://github.com/lumjjb/etcd/tree/enc_wrapper/clientwrap/encwrapper) - encWrapper is an encryption wrapper for the etcd client Keys API/KV.
**Java libraries**
- [coreos/jetcd](https://github.com/coreos/jetcd) - Supports v3
- [boonproject/etcd](https://github.com/boonproject/boon/blob/master/etcd/README.md) - Supports v2, Async/Sync and waits
- [justinsb/jetcd](https://github.com/justinsb/jetcd)
- [diwakergupta/jetcd](https://github.com/diwakergupta/jetcd) - Supports v2
- [jurmous/etcd4j](https://github.com/jurmous/etcd4j) - Supports v2, Async/Sync, waits and SSL
- [AdoHe/etcd4j](http://github.com/AdoHe/etcd4j) - Supports v2 (enhanced for real production clusters)
- [cdancy/etcd-rest](https://github.com/cdancy/etcd-rest) - Uses jclouds to provide a complete implementation of v2 API.
**Scala libraries**
- [maciej/etcd-client](https://github.com/maciej/etcd-client) - Supports v2. Akka HTTP-based fully async client
- [eiipii/etcdhttpclient](https://bitbucket.org/eiipii/etcdhttpclient) - Supports v2. Async HTTP client based on Netty and Scala Futures.
**Python libraries**
- [kragniz/python-etcd3](https://github.com/kragniz/python-etcd3) - Work in progress client for v3
- [jplana/python-etcd](https://github.com/jplana/python-etcd) - Supports v2
- [russellhaering/txetcd](https://github.com/russellhaering/txetcd) - a Twisted Python library
- [cholcombe973/autodock](https://github.com/cholcombe973/autodock) - A docker deployment automation tool
- [lisael/aioetcd](https://github.com/lisael/aioetcd) - (Python 3.4+) Asyncio coroutines client (Supports v2)
- [txaio-etcd](https://github.com/crossbario/txaio-etcd) - Asynchronous etcd v3-only client library for Twisted (today) and asyncio (future)
- [dims/etcd3-gateway](https://github.com/dims/etcd3-gateway) - etcd v3 API library using the HTTP grpc gateway
**Node libraries**
- [stianeikeland/node-etcd](https://github.com/stianeikeland/node-etcd) - Supports v2 (w Coffeescript)
- [lavagetto/nodejs-etcd](https://github.com/lavagetto/nodejs-etcd) - Supports v2
- [deedubs/node-etcd-config](https://github.com/deedubs/node-etcd-config) - Supports v2
**Ruby libraries**
- [iconara/etcd-rb](https://github.com/iconara/etcd-rb)
- [jpfuentes2/etcd-ruby](https://github.com/jpfuentes2/etcd-ruby)
- [ranjib/etcd-ruby](https://github.com/ranjib/etcd-ruby) - Supports v2
- [davissp14/etcdv3-ruby](https://github.com/davissp14/etcdv3-ruby) - Supports v3
**C libraries**
- [apache/celix/etcdlib](https://github.com/apache/celix/tree/develop/etcdlib) - Supports v2
- [jdarcy/etcd-api](https://github.com/jdarcy/etcd-api) - Supports v2
- [shafreeck/cetcd](https://github.com/shafreeck/cetcd) - Supports v2
**C++ libraries**
- [edwardcapriolo/etcdcpp](https://github.com/edwardcapriolo/etcdcpp) - Supports v2
- [suryanathan/etcdcpp](https://github.com/suryanathan/etcdcpp) - Supports v2 (with waits)
- [nokia/etcd-cpp-api](https://github.com/nokia/etcd-cpp-api) - Supports v2
- [nokia/etcd-cpp-apiv3](https://github.com/nokia/etcd-cpp-apiv3) - Supports v3
**Clojure libraries**
- [aterreno/etcd-clojure](https://github.com/aterreno/etcd-clojure)
- [dwwoelfel/cetcd](https://github.com/dwwoelfel/cetcd) - Supports v2
- [rthomas/clj-etcd](https://github.com/rthomas/clj-etcd) - Supports v2
**Erlang libraries**
- [marshall-lee/etcd.erl](https://github.com/marshall-lee/etcd.erl)
**.Net Libraries**
- [wangjia184/etcdnet](https://github.com/wangjia184/etcdnet) - Supports v2
- [drusellers/etcetera](https://github.com/drusellers/etcetera)
**PHP Libraries**
- [linkorb/etcd-php](https://github.com/linkorb/etcd-php)
- [activecollab/etcd](https://github.com/activecollab/etcd)
**Haskell libraries**
- [wereHamster/etcd-hs](https://github.com/wereHamster/etcd-hs)
**R libraries**
- [ropensci/etseed](https://github.com/ropensci/etseed)
**Nim libraries**
- [etcd_client](https://github.com/FedericoCeratto/nim-etcd-client)
**Tcl libraries**
- [efrecon/etcd-tcl](https://github.com/efrecon/etcd-tcl) - Supports v2, except wait.
**Rust libraries**
- [jimmycuadra/rust-etcd](https://github.com/jimmycuadra/rust-etcd) - Supports v2
**Gradle Plugins**
- [gradle-etcd-rest-plugin](https://github.com/cdancy/gradle-etcd-rest-plugin) - Supports v2
**Chef Integration**
- [coderanger/etcd-chef](https://github.com/coderanger/etcd-chef)
**Chef Cookbook**
- [spheromak/etcd-cookbook](https://github.com/spheromak/etcd-cookbook)
**BOSH Releases**
- [cloudfoundry-community/etcd-boshrelease](https://github.com/cloudfoundry-community/etcd-boshrelease)
- [cloudfoundry/cf-release](https://github.com/cloudfoundry/cf-release/tree/master/jobs/etcd)
**Projects using etcd**
- [apache/celix](https://github.com/apache/celix) - an implementation of the OSGi specification adapted to C and C++
- [binocarlos/yoda](https://github.com/binocarlos/yoda) - etcd + ZeroMQ
- [blox/blox](https://github.com/blox/blox) - a collection of open source projects for container management and orchestration with AWS ECS
- [calavera/active-proxy](https://github.com/calavera/active-proxy) - HTTP Proxy configured with etcd
- [chain/chain](https://github.com/chain/chain) - software designed to operate and connect to highly scalable permissioned blockchain networks
- [derekchiang/etcdplus](https://github.com/derekchiang/etcdplus) - A set of distributed synchronization primitives built upon etcd
- [go-discover](https://github.com/flynn/go-discover) - service discovery in Go
- [gleicon/goreman](https://github.com/gleicon/goreman/tree/etcd) - Branch of the Go Foreman clone with etcd support
- [garethr/hiera-etcd](https://github.com/garethr/hiera-etcd) - Puppet hiera backend using etcd
- [mattn/etcd-vim](https://github.com/mattn/etcd-vim) - SET and GET keys from inside vim
- [mattn/etcdenv](https://github.com/mattn/etcdenv) - "env" shebang with etcd integration
- [kelseyhightower/confd](https://github.com/kelseyhightower/confd) - Manage local app config files using templates and data from etcd
- [configdb](https://git.autistici.org/ai/configdb/tree/master) - A REST relational abstraction on top of arbitrary database backends, aimed at storing configs and inventories.
- [fleet](https://github.com/coreos/fleet) - Distributed init system
- [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes) - Container cluster manager introduced by Google.
- [mailgun/vulcand](https://github.com/mailgun/vulcand) - HTTP proxy that uses etcd as a configuration backend.
- [duedil-ltd/discodns](https://github.com/duedil-ltd/discodns) - Simple DNS nameserver using etcd as a database for names and records.
- [skynetservices/skydns](https://github.com/skynetservices/skydns) - RFC compliant DNS server
- [xordataexchange/crypt](https://github.com/xordataexchange/crypt) - Securely store values in etcd using GPG encryption
- [spf13/viper](https://github.com/spf13/viper) - Go configuration library, reads values from ENV, pflags, files, and etcd with optional encryption
- [lytics/metafora](https://github.com/lytics/metafora) - Go distributed task library
- [ryandoyle/nss-etcd](https://github.com/ryandoyle/nss-etcd) - A GNU libc NSS module for resolving names from etcd.
- [Gru](https://github.com/dnaeon/gru) - Orchestration made easy with Go
- [Vitess](http://vitess.io/) - Vitess is a database clustering system for horizontal scaling of MySQL.


@ -0,0 +1,481 @@
# etcd3 API
This document is meant to give an overview of the etcd3 API's central design. It is by no means all-encompassing, but is intended to focus on the basic ideas needed to understand etcd without the distraction of less common API calls. All etcd3 APIs are defined in [gRPC services][grpc-service], which categorize remote procedure calls (RPCs) understood by the etcd server. A full listing of all etcd RPCs is documented in markdown in the [gRPC API listing][grpc-api].
## gRPC Services
Every API request sent to an etcd server is a gRPC remote procedure call. RPCs in etcd3 are categorized based on functionality into services.
Services important for dealing with etcd's key space include:
* KV - Creates, updates, fetches, and deletes key-value pairs.
* Watch - Monitors changes to keys.
* Lease - Primitives for consuming client keep-alive messages.
Services which manage the cluster itself include:
* Auth - Role based authentication mechanism for authenticating users.
* Cluster - Provides membership information and configuration facilities.
* Maintenance - Takes recovery snapshots, defragments the store, and returns per-member status information.
### Requests and Responses
All RPCs in etcd3 follow the same format. Each RPC has a function `Name` which takes `NameRequest` as an argument and returns `NameResponse` as a response. For example, here is the `Range` RPC description:
```protobuf
service KV {
  rpc Range(RangeRequest) returns (RangeResponse) {}
  ...
}
```
### Response header
All responses from the etcd API have an attached response header which includes cluster metadata for the response:
```protobuf
message ResponseHeader {
  uint64 cluster_id = 1;
  uint64 member_id = 2;
  int64 revision = 3;
  uint64 raft_term = 4;
}
```
* Cluster_ID - the ID of the cluster generating the response.
* Member_ID - the ID of the member generating the response.
* Revision - the revision of the key-value store when generating the response.
* Raft_Term - the Raft term of the member when generating the response.
An application may read the Cluster_ID (Member_ID) field to ensure it is communicating with the intended cluster (member).
Applications can use the `Revision` to know the latest revision of the key-value store. This is especially useful when an application specifies a historical revision to make a time travel query and wishes to know the latest revision at the time of the request.
Applications can use `Raft_Term` to detect when the cluster completes a new leader election.
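The three uses of the response header described above can be sketched on the client side as follows. This is a minimal illustration, not part of any etcd client library: `HeaderTracker` and its method names are hypothetical, and the dataclass only mirrors the four header fields shown above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResponseHeader:
    """Mirrors the four ResponseHeader fields shown above."""
    cluster_id: int
    member_id: int
    revision: int
    raft_term: int


class HeaderTracker:
    """Hypothetical client-side helper; not an etcd API."""

    def __init__(self, expected_cluster_id: int) -> None:
        self.expected_cluster_id = expected_cluster_id
        self.latest_revision = 0
        self.last_term: Optional[int] = None

    def observe(self, header: ResponseHeader) -> bool:
        """Record a header; return True if a newer Raft term was observed."""
        if header.cluster_id != self.expected_cluster_id:
            raise ValueError("response came from an unexpected cluster")
        # Responses may arrive from members at different revisions,
        # so remember the largest revision seen so far.
        self.latest_revision = max(self.latest_revision, header.revision)
        term_changed = self.last_term is not None and header.raft_term > self.last_term
        if self.last_term is None or header.raft_term > self.last_term:
            self.last_term = header.raft_term
        return term_changed
```

A caller would pass every response header to `observe` and treat a `True` result as a hint that a leader election completed since the previous response.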
## Key-Value API
The Key-Value API manipulates key-value pairs stored inside etcd. The majority of requests made to etcd are usually key-value requests.
### System primitives
#### Key-Value pair
A key-value pair is the smallest unit that the key-value API can manipulate. Each key-value pair has a number of fields, defined in [protobuf format][kv-proto]:
```protobuf
message KeyValue {
  bytes key = 1;
  int64 create_revision = 2;
  int64 mod_revision = 3;
  int64 version = 4;
  bytes value = 5;
  int64 lease = 6;
}
```
* Key - key in bytes. An empty key is not allowed.
* Value - value in bytes.
* Version - version is the version of the key. A deletion resets the version to zero and any modification of the key increases its version.
* Create_Revision - revision of the last creation on the key.
* Mod_Revision - revision of the last modification on the key.
* Lease - the ID of the lease attached to the key. If lease is 0, then no lease is attached to the key.
In addition to just the key and value, etcd attaches additional revision metadata as part of the key message. This revision information orders keys by time of creation and modification, which is useful for managing concurrency for distributed synchronization. The etcd client's [distributed shared locks][locks] use the creation revision to wait for lock ownership. Similarly, the modification revision is used for detecting [software transactional memory][STM] read set conflicts and waiting on [leader election][elections] updates.
#### Revisions
etcd maintains a 64-bit cluster-wide counter, the store revision, that is incremented each time the key space is modified. The revision serves as a global logical clock, sequentially ordering all updates to the store. The change represented by a new revision is incremental; the data associated with a revision is the data that changed the store. Internally, a new revision means writing the changes to the backend's B+tree, keyed by the incremented revision.
Revisions become more valuable when considering etcd3's [multi-version concurrency control][mvcc] backend. The MVCC model means that the key-value store can be viewed from past revisions since historical key revisions are retained. The retention policy for this history can be configured by cluster administrators for fine-grained storage management; usually etcd3 discards old revisions of keys on a timer. A typical etcd3 cluster retains superseded key data for hours. This also buys reliable handling for long client disconnection, not just transient network disruptions: watchers simply resume from the last observed historical revision. Similarly, to read from the store at a particular point-in-time, read requests can be tagged with a revision to return keys from a view of the key space at the point in time that revision was committed.
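The relationship between the store revision, the per-key revision fields, and historical reads can be illustrated with a small in-memory model. This is only a sketch: `ToyStore` is a hypothetical name, the starting revision of 1 and the behavior of deleting a missing key are assumptions of the toy, and etcd's real backend works very differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class KeyValue:
    key: bytes
    value: bytes
    create_revision: int
    mod_revision: int
    version: int


class ToyStore:
    """In-memory illustration only; not etcd's storage engine."""

    def __init__(self) -> None:
        self.revision = 1   # assumption: the toy store starts at revision 1
        self.compacted = 0  # revisions below this are no longer readable
        # Per-key history of (revision, KeyValue or None); None marks a deletion.
        self.history: Dict[bytes, List[Tuple[int, Optional[KeyValue]]]] = {}

    def _latest(self, key: bytes) -> Optional[KeyValue]:
        entries = self.history.get(key, [])
        return entries[-1][1] if entries else None

    def put(self, key: bytes, value: bytes) -> int:
        self.revision += 1  # every modification gets a new store revision
        prev = self._latest(key)
        kv = KeyValue(
            key=key,
            value=value,
            create_revision=prev.create_revision if prev else self.revision,
            mod_revision=self.revision,
            version=prev.version + 1 if prev else 1,
        )
        self.history.setdefault(key, []).append((self.revision, kv))
        return self.revision

    def delete(self, key: bytes) -> int:
        if self._latest(key) is None:
            return self.revision  # toy assumption: deleting nothing changes nothing
        self.revision += 1
        self.history[key].append((self.revision, None))
        return self.revision

    def get(self, key: bytes, revision: int = 0) -> Optional[KeyValue]:
        """Read `key` as of `revision`; revision <= 0 means the latest."""
        rev = self.revision if revision <= 0 else revision
        if rev < self.compacted:
            raise LookupError("requested revision has been compacted")
        result = None
        for entry_rev, kv in self.history.get(key, []):
            if entry_rev > rev:
                break
            result = kv
        return result

    def compact(self, revision: int) -> None:
        """Discard history entries superseded at or before `revision`."""
        self.compacted = revision
        for key, entries in list(self.history.items()):
            older = [e for e in entries if e[0] <= revision]
            newer = [e for e in entries if e[0] > revision]
            self.history[key] = older[-1:] + newer
```

In this model, two puts to the same key followed by a read at the first put's revision return the first value, and the same read fails once the store is compacted past that revision.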
#### Key ranges
The etcd3 data model indexes all keys over a flat binary key space. This differs from other key-value store systems that use a hierarchical system of organizing keys into directories. Instead of listing keys by directory, keys are listed by key intervals `[a, b)`.
These intervals are often referred to as "ranges" in etcd3. Operations over ranges are more powerful than operations on directories. Like a hierarchical store, intervals support single key lookups via `[a, a+1)` (e.g., ['a', 'a\x00') looks up 'a') and directory lookups by encoding keys by directory depth. In addition to those operations, intervals can also encode prefixes; for example the interval `['a', 'b')` looks up all keys prefixed by the string 'a'.
By convention, ranges for a Request are denoted by the fields `key` and `range_end`. The `key` field is the first key of the range and should be non-empty. The `range_end` is the key following the last key of the range. If `range_end` is not given or empty, the range is defined to contain only the key argument. If `range_end` is `key` plus one (e.g., "aa"+1 == "ab", "a\xff"+1 == "b"), then the range represents all keys prefixed with key. If both `key` and `range_end` are '\0', then range represents all keys. If `range_end` is '\0', the range is all keys greater than or equal to the key argument.
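The `key` / `range_end` conventions above can be sketched in a few lines. The helper names `prefix_range_end` and `in_range` are hypothetical and chosen for illustration; the logic follows the rules stated in the preceding paragraph, including the carry when the prefix ends in `\xff`.

```python
def prefix_range_end(prefix: bytes) -> bytes:
    """Return the range_end that makes [prefix, range_end) cover every key
    starting with `prefix` (the "key plus one" rule described above)."""
    end = bytearray(prefix)
    # Trailing 0xff bytes cannot be incremented, so drop them and carry.
    while end and end[-1] == 0xFF:
        end.pop()
    if not end:
        # Empty or all-0xff prefix: there is no finite upper bound, so fall
        # back to '\0', which means "all keys greater than or equal to key".
        return b"\x00"
    end[-1] += 1
    return bytes(end)


def in_range(candidate: bytes, key: bytes, range_end: bytes = b"") -> bool:
    """Apply the key/range_end conventions to a single candidate key."""
    if not range_end:
        return candidate == key            # single-key lookup
    if range_end == b"\x00":
        return candidate >= key            # open-ended range
    return key <= candidate < range_end    # half-open interval [key, range_end)
```

For example, `prefix_range_end(b"aa")` yields `b"ab"`, so the interval `[b"aa", b"ab")` matches `b"aa"`, `b"aab"`, and `b"aa\xff"` but not `b"ab"`.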
### Range
Keys are fetched from the key-value store using the `Range` API call, which takes a `RangeRequest`:
```protobuf
message RangeRequest {
  enum SortOrder {
    NONE = 0; // default, no sorting
    ASCEND = 1; // lowest target value first
    DESCEND = 2; // highest target value first
  }
  enum SortTarget {
    KEY = 0;
    VERSION = 1;
    CREATE = 2;
    MOD = 3;
    VALUE = 4;
  }
  bytes key = 1;
  bytes range_end = 2;
  int64 limit = 3;
  int64 revision = 4;
  SortOrder sort_order = 5;
  SortTarget sort_target = 6;
  bool serializable = 7;
  bool keys_only = 8;
  bool count_only = 9;
  int64 min_mod_revision = 10;
  int64 max_mod_revision = 11;
  int64 min_create_revision = 12;
  int64 max_create_revision = 13;
}
```
* Key, Range_End - The key range to fetch.
* Limit - the maximum number of keys returned for the request. When limit is set to 0, it is treated as no limit.
* Revision - the point-in-time of the key-value store to use for the range. If revision is less than or equal to zero, the range is over the latest key-value store. If the revision has been compacted, ErrCompacted is returned as a response.
* Sort_Order - the ordering for sorted requests.
* Sort_Target - the key-value field to sort.
* Serializable - sets the range request to use serializable member-local reads. By default, Range is linearizable; it reflects the current consensus of the cluster. For better performance and availability, in exchange for possible stale reads, a serializable range request is served locally without needing to reach consensus with other nodes in the cluster.
* Keys_Only - return only the keys and not the values.
* Count_Only - return only the count of the keys in the range.
* Min_Mod_Revision - the lower bound for key mod revisions; filters out lesser mod revisions.
* Max_Mod_Revision - the upper bound for key mod revisions; filters out greater mod revisions.
* Min_Create_Revision - the lower bound for key create revisions; filters out lesser create revisions.
* Max_Create_Revision - the upper bound for key create revisions; filters out greater create revisions.
The client receives a `RangeResponse` message from the `Range` call:
```protobuf
message RangeResponse {
  ResponseHeader header = 1;
  repeated mvccpb.KeyValue kvs = 2;
  bool more = 3;
  int64 count = 4;
}
```
* Kvs - the list of key-value pairs matched by the range request. When `Count_Only` is set, `Kvs` is empty.
* More - indicates if there are more keys to return in the requested range if `limit` is set.
* Count - the total number of keys satisfying the range request.
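How `limit`, `More`, and `Count` interact can be sketched with a toy range function over a sorted list of keys. The functions `range_page` and `list_all` are hypothetical, and the sketch assumes results come back in ascending key order; it only illustrates the paging pattern a client might build on these fields.

```python
from typing import List, Tuple


def range_page(
    sorted_keys: List[bytes], key: bytes, range_end: bytes, limit: int = 0
) -> Tuple[List[bytes], bool, int]:
    """Return (keys, more, count) for the interval [key, range_end)."""
    matched = [k for k in sorted_keys if key <= k < range_end]
    count = len(matched)  # total matches, independent of the limit
    if limit > 0 and count > limit:
        return matched[:limit], True, count
    return matched, False, count  # limit == 0 is treated as no limit


def list_all(
    sorted_keys: List[bytes], key: bytes, range_end: bytes, page_size: int
) -> List[bytes]:
    """Fetch the whole interval one page at a time."""
    out: List[bytes] = []
    while True:
        keys, more, _ = range_page(sorted_keys, key, range_end, page_size)
        out.extend(keys)
        if not more:
            return out
        # Continue from the smallest key strictly greater than the last one seen.
        key = keys[-1] + b"\x00"
```

Against a live cluster, a paging client would typically also pin every page to the revision from the first response header so that all pages see the same view of the key space.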
### Put
Keys are saved into the key-value store by issuing a `Put` call, which takes a `PutRequest`:
```protobuf
message PutRequest {
  bytes key = 1;
  bytes value = 2;
  int64 lease = 3;
  bool prev_kv = 4;
  bool ignore_value = 5;
  bool ignore_lease = 6;
}
```
* Key - the name of the key to put into the key-value store.
* Value - the value, in bytes, to associate with the key in the key-value store.
* Lease - the lease ID to associate with the key in the key-value store. A lease value of 0 indicates no lease.
* Prev_Kv - when set, responds with the key-value pair data before the update from this `Put` request.
* Ignore_Value - when set, update the key without changing its current value. Returns an error if the key does not exist.
* Ignore_Lease - when set, update the key without changing its current lease. Returns an error if the key does not exist.
The client receives a `PutResponse` message from the `Put` call:
```protobuf
message PutResponse {
  ResponseHeader header = 1;
  mvccpb.KeyValue prev_kv = 2;
}
```
* Prev_Kv - the key-value pair overwritten by the `Put`, if `Prev_Kv` was set in the `PutRequest`.
### Delete Range
Ranges of keys are deleted using the `DeleteRange` call, which takes a `DeleteRangeRequest`:
```protobuf
message DeleteRangeRequest {
  bytes key = 1;
  bytes range_end = 2;
  bool prev_kv = 3;
}
```
* Key, Range_End - The key range to delete.
* Prev_Kv - when set, return the contents of the deleted key-value pairs.
The client receives a `DeleteRangeResponse` message from the `DeleteRange` call:
```protobuf
message DeleteRangeResponse {
  ResponseHeader header = 1;
  int64 deleted = 2;
  repeated mvccpb.KeyValue prev_kvs = 3;
}
```
* Deleted - number of keys deleted.
* Prev_Kvs - a list of all key-value pairs deleted by the `DeleteRange` operation.
### Transaction
A transaction is an atomic If/Then/Else construct over the key-value store. It provides a primitive for grouping requests together in atomic blocks (i.e., then/else) whose execution is guarded (i.e., if) based on the contents of the key-value store. Transactions can be used for protecting keys from unintended concurrent updates, building compare-and-swap operations, and developing higher-level concurrency control.
A transaction can atomically process multiple requests in a single request. For modifications to the key-value store, this means the store's revision is incremented only once for the transaction and all events generated by the transaction will have the same revision. However, modifications to the same key multiple times within a single transaction are forbidden.
All transactions are guarded by a conjunction of comparisons, similar to an "If" statement. Each comparison checks a single key in the store. It may check for the absence or presence of a value, compare with a given value, or check a key's revision or version. Two different comparisons may apply to the same or different keys. All comparisons are applied atomically; if all comparisons are true, the transaction is said to succeed and etcd applies the transaction's then / `success` request block, otherwise it is said to fail and applies the else / `failure` request block.
Each comparison is encoded as a `Compare` message:
```protobuf
message Compare {
  enum CompareResult {
    EQUAL = 0;
    GREATER = 1;
    LESS = 2;
    NOT_EQUAL = 3;
  }
  enum CompareTarget {
    VERSION = 0;
    CREATE = 1;
    MOD = 2;
    VALUE = 3;
  }
  CompareResult result = 1;
  // target is the key-value field to inspect for the comparison.
  CompareTarget target = 2;
  // key is the subject key for the comparison operation.
  bytes key = 3;
  oneof target_union {
    int64 version = 4;
    int64 create_revision = 5;
    int64 mod_revision = 6;
    bytes value = 7;
  }
}
```
* Result - the kind of logical comparison operation (e.g., equal, less than, etc.).
* Target - the key-value field to be compared. Either the key's version, create revision, modification revision, or value.
* Key - the key for the comparison.
* Target_Union - the user-specified data for the comparison.
After processing the comparison block, the transaction applies a block of requests. A block is a list of `RequestOp` messages:
```protobuf
message RequestOp {
  // request is a union of request types accepted by a transaction.
  oneof request {
    RangeRequest request_range = 1;
    PutRequest request_put = 2;
    DeleteRangeRequest request_delete_range = 3;
  }
}
```
* Request_Range - a `RangeRequest`.
* Request_Put - a `PutRequest`. The keys must be unique. It may not share keys with any other Puts or Deletes.
* Request_Delete_Range - a `DeleteRangeRequest`. It may not share keys with any Put or Delete requests.
All together, a transaction is issued with a `Txn` API call, which takes a `TxnRequest`:
```protobuf
message TxnRequest {
  repeated Compare compare = 1;
  repeated RequestOp success = 2;
  repeated RequestOp failure = 3;
}
```
* Compare - A list of predicates representing a conjunction of terms for guarding the transaction.
* Success - A list of requests to process if all compare tests evaluate to true.
* Failure - A list of requests to process if any compare test evaluates to false.
The client receives a `TxnResponse` message from the `Txn` call:
```protobuf
message TxnResponse {
  ResponseHeader header = 1;
  bool succeeded = 2;
  repeated ResponseOp responses = 3;
}
```
* Succeeded - Whether `Compare` evaluated to true or false.
* Responses - A list of responses corresponding to the results from applying the `Success` block if succeeded is true or the `Failure` block if succeeded is false.
The `Responses` list corresponds to the results from the applied `RequestOp` list, with each response encoded as a `ResponseOp`:
```protobuf
message ResponseOp {
  oneof response {
    RangeResponse response_range = 1;
    PutResponse response_put = 2;
    DeleteRangeResponse response_delete_range = 3;
  }
}
```
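The If/Then/Else behavior of a transaction, including the single revision increment and the ban on touching one key twice, can be modeled with a toy function. This is a sketch under simplifying assumptions: the store is a plain dictionary, comparisons only look at the modification revision, request blocks contain only puts, and a missing key is treated as having modification revision 0. The name `txn` and the tuple shapes are hypothetical.

```python
from typing import Dict, List, Tuple

# Toy store: key -> (value, mod_revision).
Store = Dict[bytes, Tuple[bytes, int]]
Compare = Tuple[bytes, str, int]  # (key, "=" | "!=" | "<" | ">", mod_revision)
PutOp = Tuple[bytes, bytes]       # (key, value)


def _holds(store: Store, cmp: Compare) -> bool:
    key, op, want = cmp
    # Assumption of this sketch: a missing key compares as mod_revision 0.
    have = store.get(key, (b"", 0))[1]
    return {"=": have == want, "!=": have != want,
            "<": have < want, ">": have > want}[op]


def txn(
    store: Store,
    revision: int,
    compare: List[Compare],
    success: List[PutOp],
    failure: List[PutOp],
) -> Tuple[bool, int]:
    """Apply one toy transaction; return (succeeded, new_store_revision)."""
    # The guard is a conjunction: every comparison must hold.
    succeeded = all(_holds(store, c) for c in compare)
    ops = success if succeeded else failure
    keys = [k for k, _ in ops]
    if len(keys) != len(set(keys)):
        raise ValueError("a key may be modified only once per transaction")
    if ops:
        revision += 1  # one revision for the whole block
        for key, value in ops:
            store[key] = (value, revision)
    return succeeded, revision
```

A compare-and-swap is then a transaction whose guard names the modification revision the caller last observed: it succeeds only if nobody has written the key since.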
## Watch API
The Watch API provides an event-based interface for asynchronously monitoring changes to keys. An etcd3 watch waits for changes to keys by continuously watching from a given revision, either current or historical, and streams key updates back to the client.
### Events
Every change to every key is represented with `Event` messages. An `Event` message provides both the update's data and the type of update:
```protobuf
message Event {
  enum EventType {
    PUT = 0;
    DELETE = 1;
  }
  EventType type = 1;
  KeyValue kv = 2;
  KeyValue prev_kv = 3;
}
```
* Type - The kind of event. A PUT type indicates new data has been stored to the key. A DELETE indicates the key was deleted.
* KV - The KeyValue associated with the event. A PUT event contains the current kv pair. A PUT event with kv.Version=1 indicates the creation of a key. A DELETE event contains the deleted key with its modification revision set to the revision of deletion.
* Prev_KV - The key-value pair for the key from the revision immediately before the event. To save bandwidth, it is only filled out if the watch has explicitly enabled it.
### Watch streams
Watches are long-running requests and use gRPC streams to stream event data. A watch stream is bi-directional; the client writes to the stream to establish watches and reads to receive watch events. A single watch stream can multiplex many distinct watches by tagging events with per-watch identifiers. This multiplexing helps reduce the memory footprint and connection overhead on the core etcd cluster.
Watches make three guarantees about events:
* Ordered - events are ordered by revision; an event will never appear on a watch if it precedes an event in time that has already been posted.
* Reliable - a sequence of events will never drop any subsequence of events; if there are events ordered in time as a < b < c, then if the watch receives events a and c, it is guaranteed to receive b.
* Atomic - a list of events is guaranteed to encompass complete revisions; updates in the same revision over multiple keys will not be split over several lists of events.
A client creates a watch by sending a `WatchCreateRequest` over a stream returned by `Watch`:
```protobuf
message WatchCreateRequest {
  bytes key = 1;
  bytes range_end = 2;
  int64 start_revision = 3;
  bool progress_notify = 4;
  enum FilterType {
    NOPUT = 0;
    NODELETE = 1;
  }
  repeated FilterType filters = 5;
  bool prev_kv = 6;
}
```
* Key, Range_End - The key range to watch.
* Start_Revision - An optional revision for where to inclusively begin watching. If not given, it will stream events following the revision of the watch creation response header revision. The entire available event history can be watched starting from the last compaction revision.
* Progress_Notify - When set, the watch will periodically receive a WatchResponse with no events, if there are no recent events. It is useful when clients wish to recover a disconnected watcher starting from a recent known revision. The etcd server decides how often to send notifications based on current server load.
* Filters - A list of event types to filter away at server side.
* Prev_Kv - When set, the watch receives the key-value data from before the event happens. This is useful for knowing what data has been overwritten.
In response to a `WatchCreateRequest` or if there is a new event for some established watch, the client receives a `WatchResponse`:
```protobuf
message WatchResponse {
  ResponseHeader header = 1;
  int64 watch_id = 2;
  bool created = 3;
  bool canceled = 4;
  int64 compact_revision = 5;
  repeated mvccpb.Event events = 11;
}
```
* Watch_ID - the ID of the watch that corresponds to the response.
* Created - set to true if the response is for a create watch request. The client should record ID and expect to receive events for the watch on the stream. All events sent to the created watcher will have the same watch_id.
* Canceled - set to true if the response is for a cancel watch request. No further events will be sent to the canceled watcher.
* Compact_Revision - set to the minimum historical revision available to etcd if a watcher tries watching at a compacted revision. This happens when creating a watcher at a compacted revision or the watcher cannot catch up with the progress of the key-value store. The watcher will be canceled; creating new watches with the same start_revision will fail.
* Events - a list of new events in sequence corresponding to the given watch ID.
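One way a client might use these fields to resume a watch after a disconnect is to track the next revision it still needs. The sketch below is hypothetical bookkeeping, not an etcd client API: `WatchCursor` is an invented name, the dataclasses only mirror a subset of the message fields, and it assumes that an empty, non-create response is a progress notification meaning no events are pending up to the header revision.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    type: str          # "PUT" or "DELETE"
    key: bytes
    mod_revision: int  # revision at which the event happened


@dataclass
class WatchResponse:
    watch_id: int
    header_revision: int
    events: List[Event] = field(default_factory=list)
    created: bool = False
    canceled: bool = False
    compact_revision: int = 0


class WatchCursor:
    """Tracks the start_revision to use if the watch must be re-created."""

    def __init__(self, start_revision: int) -> None:
        self.next_revision = start_revision

    def handle(self, resp: WatchResponse) -> List[Event]:
        if resp.compact_revision > 0:
            # History before compact_revision is gone. The caller would need
            # to re-read current state and watch again from this revision.
            self.next_revision = resp.compact_revision
            return []
        if resp.created:
            # The create acknowledgment carries no progress information.
            return []
        if resp.events:
            # start_revision is inclusive and responses hold whole revisions,
            # so resume just after the last event delivered.
            self.next_revision = resp.events[-1].mod_revision + 1
        else:
            # Assumed progress notification: nothing pending up to the header.
            self.next_revision = max(self.next_revision, resp.header_revision + 1)
        return resp.events
```

After a disconnect, the client would send a new `WatchCreateRequest` with `start_revision` set to `cursor.next_revision`.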
If the client wishes to stop receiving events for a watch, it issues a `WatchCancelRequest`:
```protobuf
message WatchCancelRequest {
  int64 watch_id = 1;
}
```
* Watch_ID - the ID of the watch to cancel so that no more events are transmitted.
## Lease API
Leases are a mechanism for detecting client liveness. The cluster grants leases with a time-to-live. A lease expires if the etcd cluster does not receive a keepAlive within a given TTL period.
To tie leases into the key-value store, each key may be attached to at most one lease. When a lease expires or is revoked, all keys attached to that lease will be deleted. Each expired key generates a delete event in the event history.
### Obtaining leases
Leases are obtained through the `LeaseGrant` API call, which takes a `LeaseGrantRequest`:
```protobuf
message LeaseGrantRequest {
  int64 TTL = 1;
  int64 ID = 2;
}
```
* TTL - the advisory time-to-live, in seconds.
* ID - the requested ID for the lease. If ID is set to 0, etcd will choose an ID.
The client receives a `LeaseGrantResponse` from the `LeaseGrant` call:
```protobuf
message LeaseGrantResponse {
  ResponseHeader header = 1;
  int64 ID = 2;
  int64 TTL = 3;
}
```
* ID - the lease ID for the granted lease.
* TTL - the server-selected time-to-live, in seconds, for the lease.
Leases are revoked through the `LeaseRevoke` API call, which takes a `LeaseRevokeRequest`:
```protobuf
message LeaseRevokeRequest {
  int64 ID = 1;
}
```
* ID - the lease ID to revoke. When the lease is revoked, all attached keys are deleted.
### Keep alives
Leases are refreshed using a bi-directional stream created with the `LeaseKeepAlive` API call. When the client wishes to refresh a lease, it sends a `LeaseKeepAliveRequest` over the stream:
```protobuf
message LeaseKeepAliveRequest {
  int64 ID = 1;
}
```
* ID - the lease ID for the lease to keep alive.
The keep alive stream responds with a `LeaseKeepAliveResponse`:
```protobuf
message LeaseKeepAliveResponse {
  ResponseHeader header = 1;
  int64 ID = 2;
  int64 TTL = 3;
}
```
* ID - the lease that was refreshed with a new TTL.
* TTL - the new time-to-live, in seconds, that the lease has remaining.
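The lease lifecycle described in this section (grant, attach keys, keep alive, revoke or expire) can be modeled with a small in-memory class. `ToyLessor` is a hypothetical name; the sketch takes the current time as an explicit argument rather than reading a clock, and it says nothing about how etcd implements leases internally.

```python
from typing import Dict, List, Set


class ToyLessor:
    """Illustrative model of leases attached to keys; not etcd's lessor."""

    def __init__(self) -> None:
        self.next_id = 1
        self.ttl: Dict[int, int] = {}          # lease ID -> granted TTL (seconds)
        self.expiry: Dict[int, float] = {}     # lease ID -> expiry time
        self.keys: Dict[int, Set[bytes]] = {}  # lease ID -> attached keys
        self.store: Dict[bytes, bytes] = {}

    def grant(self, ttl: int, now: float, lease_id: int = 0) -> int:
        if lease_id == 0:
            # ID 0 asks the lessor to choose an unused ID.
            while self.next_id in self.ttl:
                self.next_id += 1
            lease_id = self.next_id
        elif lease_id in self.ttl:
            raise ValueError("lease already exists")
        self.ttl[lease_id] = ttl
        self.expiry[lease_id] = now + ttl
        self.keys[lease_id] = set()
        return lease_id

    def put(self, key: bytes, value: bytes, lease_id: int = 0) -> None:
        if lease_id != 0 and lease_id not in self.keys:
            raise KeyError("unknown lease")
        # A key is attached to at most one lease, so detach it first.
        for attached in self.keys.values():
            attached.discard(key)
        if lease_id != 0:
            self.keys[lease_id].add(key)
        self.store[key] = value

    def keep_alive(self, lease_id: int, now: float) -> int:
        """Refresh the lease and return its TTL."""
        self.expiry[lease_id] = now + self.ttl[lease_id]
        return self.ttl[lease_id]

    def revoke(self, lease_id: int) -> List[bytes]:
        """Remove the lease and delete every key attached to it."""
        deleted = sorted(self.keys.pop(lease_id))
        for key in deleted:
            del self.store[key]
        del self.ttl[lease_id]
        del self.expiry[lease_id]
        return deleted

    def expire(self, now: float) -> List[bytes]:
        """Revoke every lease whose TTL has elapsed; return the deleted keys."""
        deleted: List[bytes] = []
        for lease_id in [l for l, t in self.expiry.items() if t <= now]:
            deleted.extend(self.revoke(lease_id))
        return deleted
```

A client that stops sending keep-alives simply lets `expire` catch up with it: its keys disappear, which is what makes leases usable for liveness detection.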
[elections]: https://github.com/coreos/etcd/blob/master/clientv3/concurrency/election.go
[kv-proto]: https://github.com/coreos/etcd/blob/master/mvcc/mvccpb/kv.proto
[grpc-api]: ../dev-guide/api_reference_v3.md
[grpc-service]: https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto
[locks]: https://github.com/coreos/etcd/blob/master/clientv3/concurrency/mutex.go
[mvcc]: https://en.wikipedia.org/wiki/Multiversion_concurrency_control
[stm]: https://github.com/coreos/etcd/blob/master/clientv3/concurrency/stm.go


@ -0,0 +1,64 @@
# KV API guarantees
etcd is a consistent and durable key value store with [mini-transaction][txn] support. The key value store is exposed through the KV APIs. etcd tries to ensure the strongest consistency and durability guarantees for a distributed system. This specification enumerates the KV API guarantees made by etcd.
### APIs to consider
* Read APIs
    * range
    * watch
* Write APIs
    * put
    * delete
* Combination (read-modify-write) APIs
    * txn
### etcd specific definitions
#### Operation completed
An etcd operation is considered complete when it is committed through consensus, and therefore “executed” -- permanently stored -- by the etcd storage engine. The client knows an operation is completed when it receives a response from the etcd server. Note that the client may be uncertain about the status of an operation if it times out, or there is a network disruption between the client and the etcd member. etcd may also abort operations when there is a leader election. etcd does not send `abort` responses to clients' outstanding requests in this event.
#### Revision
An etcd operation that modifies the key value store is assigned a single increasing revision. A transaction operation might modify the key value store multiple times, but only one revision is assigned. The revision attribute of a key value pair that was modified by the operation has the same value as the revision of the operation. The revision can be used as a logical clock for the key value store. A key value pair that has a larger revision was modified after a key value pair with a smaller revision. Two key value pairs that have the same revision were modified by an operation "concurrently".
### Guarantees provided
#### Atomicity
All API requests are atomic; an operation either completes entirely or not at all. For watch requests, all events generated by one operation will be in one watch response. Watch never observes partial events for a single operation.
#### Consistency
All API calls ensure [sequential consistency][seq_consistency], the strongest consistency guarantee available from distributed systems. No matter which etcd member server a client makes requests to, a client reads the same events in the same order. If two members complete the same number of operations, the state of the two members is consistent.
For watch operations, etcd guarantees to return the same value for the same key across all members for the same revision. For range operations, etcd has a similar guarantee for [linearized][Linearizability] access; serialized access may be behind the quorum state, so that the later revision is not yet available.
As with all distributed systems, it is impossible for etcd to ensure [strict consistency][strict_consistency]. etcd does not guarantee that it will return to a read the “most recent” value (as measured by a wall clock when a request is completed) available on any cluster member.
#### Isolation
etcd ensures [serializable isolation][serializable_isolation], which is the highest isolation level available in distributed systems. Read operations will never observe any intermediate data.
#### Durability
Any completed operations are durable. All accessible data is also durable data. A read will never return data that has not been made durable.
#### Linearizability
Linearizability (also known as Atomic Consistency or External Consistency) is a consistency level between strict consistency and sequential consistency.
For linearizability, suppose each operation receives a timestamp from a loosely synchronized global clock. Operations are linearized if and only if they always complete as though they were executed in a sequential order and each operation appears to complete in the order specified by the program. Likewise, if an operation's timestamp precedes another, that operation must also precede the other operation in the sequence.
For example, consider a client completing a write at time point 1 (*t1*). A client issuing a read at *t2* (for *t2* > *t1*) should receive a value at least as recent as the previous write, completed at *t1*. However, the read might actually complete only by *t3*, and the returned value, current at *t2* when the read began, might be "stale" by *t3*.
etcd does not ensure linearizability for watch operations. Users are expected to verify the revision of watch responses to ensure correct ordering.
etcd ensures linearizability for all other operations by default. Linearizability comes with a cost, however, because linearized requests must go through the Raft consensus process. To obtain lower latencies and higher throughput for read requests, clients can configure a request's consistency mode to `serializable`, which may access stale data with respect to quorum, but removes the performance penalty of linearized accesses' reliance on live consensus.
[seq_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Sequential_consistency
[strict_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Strict_consistency
[serializable_isolation]: https://en.wikipedia.org/wiki/Isolation_(database_systems)#Serializable
[Linearizability]: #linearizability
[txn]: api.md#transaction


@ -0,0 +1,77 @@
# etcd v3 authentication design
## Why not reuse the v2 auth system?
The v3 protocol uses gRPC as its transport instead of a RESTful interface like v2. This new protocol provides an opportunity to iterate on and improve the v2 design. For example, v3 auth has connection based authentication, rather than v2's slower per-request authentication. Additionally, v2 auth's semantics tend to be unwieldy in practice with respect to reasoning about consistency, which will be described in the next sections. For v3, there is a well-defined description and implementation of the authentication mechanism which fixes the deficiencies in the v2 auth system.
### Functionality requirements
* Per connection authentication, not per request
* User ID + password based authentication implemented for the gRPC API
* Authentication must be refreshed after auth policy changes
* Its functionality should be as simple and useful as v2
* v3 provides a flat key space, unlike the directory structure of v2. Permission checking will be provided as interval matching.
* It should have stronger consistency guarantees than v2 auth
### Main required changes
* A client must create a dedicated connection only for authentication before sending authenticated requests
* Add permission information (user ID and authorized revision) to the Raft commands (`etcdserverpb.InternalRaftRequest`)
* Every request is permission checked in the state machine layer, rather than API layer
### Permission metadata consistency
The metadata for auth should also be stored and managed in the storage controlled by etcd's Raft protocol, like other data stored in etcd. This is required so as not to sacrifice the availability and consistency of the entire etcd cluster. If reading or writing the metadata (e.g. permission information) needs the agreement of every node (more than a quorum), a single node failure can stop the entire cluster. Requiring all nodes to agree at once means that checking ordinary read/write requests cannot be completed if any cluster member is down, even if the cluster has an available quorum. This unanimous scheme ultimately degrades cluster availability; quorum based consensus from Raft should suffice since agreement follows from consistent ordering.
The authentication mechanism in the etcd v2 protocol has a tricky part because the metadata consistency should work as in the above, but does not: each permission check is processed by the etcd member that receives the client request (etcdserver/api/v2http/client.go), including follower members. Therefore, it's possible the check may be based on stale metadata.
This staleness means that auth configuration cannot be reflected as soon as operators execute etcdctl. Therefore there is no way to know how long the stale metadata is active. Practically, the configuration change is reflected immediately after the command execution. However, in some cases of heavy load, the inconsistent state can be prolonged and it might result in counter-intuitive situations for users and developers. It requires a workaround like this: https://github.com/coreos/etcd/pull/4317#issuecomment-179037582
### Inconsistent permissions are unsafe for linearized requests
Inconsistent authentication state is most serious for writes. Even if an operator disables write on a user, if the write is only ordered with respect to the key value store but not the authentication system, it's possible the write will complete successfully. Without ordering on both the auth store and the key-value store, the system will be susceptible to stale permission attacks.
Therefore, the permission checking logic should be added to the state machine of etcd. Each state machine should check the requests based on its permission information in the apply phase (so the auth information must not be stale).
## Design and implementation
### Authentication
At first, a client must create a gRPC connection only to authenticate its user ID and password. An etcd server will respond with an authentication reply. The response will be an authentication token on success or an error on failure. The client can use its authentication token to present its credentials to etcd when making API requests.
The client connection used to request the authentication token is typically thrown away; it cannot carry the new token's credentials. This is because gRPC doesn't provide a way for adding per RPC credential after creation of the connection (calling `grpc.Dial()`). Therefore, a client cannot assign a token to its connection that is obtained through the connection. The client needs a new connection for using the token.
#### Notes on the implementation of `Authenticate()` RPC
The `Authenticate()` RPC generates an authentication token based on a given user name and password. etcd saves and checks a configured password and a given password using Go's `bcrypt` package. By design, `bcrypt`'s password checking mechanism is computationally expensive, taking nearly 100ms on an ordinary x64 server. Therefore, performing this check in the state machine apply phase would cause performance trouble: since applies are processed one at a time, the entire etcd cluster could serve only about 10 `Authenticate()` requests per second.
For good performance, the v3 auth mechanism checks passwords in etcd's API layer, where it can be parallelized outside of raft. However, this can lead to potential time-of-check/time-of-use (TOCTOU) permission lapses:
1. client A sends a request `Authenticate()`
1. the API layer processes the password checking part of `Authenticate()`
1. another client B sends a request of `ChangePassword()` and the server completes it
1. the state machine layer processes the part of getting a revision number for the `Authenticate()` from A
1. the server returns a success to A
1. now A is authenticated on an obsolete password
For avoiding such a situation, the API layer performs *version number validation* based on the revision number of the auth store. During password checking, the API layer saves the revision number of auth store. After successful password checking, the API layer compares the saved revision number and the latest revision number. If the numbers differ, it means someone else updated the auth metadata. So it retries the checking. With this mechanism, the successful password checking based on the obsolete password can be avoided.
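The version number validation described above is an optimistic check-then-verify loop, which can be sketched as follows. Everything here is hypothetical and simplified for illustration: `ToyAuthStore` and `authenticate` are invented names, the `slow_check` callback stands in for the expensive `bcrypt` comparison, and the toy store keeps plain strings where a real system would keep password hashes.

```python
import threading
from typing import Callable, Dict, Optional, Tuple


class ToyAuthStore:
    """Auth metadata with a revision that is bumped on every change."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.revision = 1
        self._passwords: Dict[str, str] = {}

    def set_password(self, user: str, password: str) -> None:
        with self._lock:
            self._passwords[user] = password
            self.revision += 1

    def snapshot(self, user: str) -> Tuple[int, Optional[str]]:
        """Return the current revision and the stored credential together."""
        with self._lock:
            return self.revision, self._passwords.get(user)


def authenticate(
    store: ToyAuthStore,
    user: str,
    password: str,
    slow_check: Callable[[str, str], bool],
    max_retries: int = 3,
) -> Optional[int]:
    """Return the auth revision the check is valid for, or None if rejected."""
    for _ in range(max_retries):
        saved_rev, stored = store.snapshot(user)
        # The expensive comparison runs without holding any lock.
        ok = stored is not None and slow_check(stored, password)
        latest_rev, _ = store.snapshot(user)
        if latest_rev != saved_rev:
            continue  # auth metadata changed during the check, so retry
        return saved_rev if ok else None
    raise RuntimeError("auth store kept changing; giving up")
```

The key property is that a success is only reported when the revision observed before the slow check still matches the revision observed after it.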
### Resolving a token in the API layer
After authenticating with `Authenticate()`, a client can create a gRPC connection as it would without auth. In addition to the existing initialization process, the client must associate the token with the newly created connection. `grpc.WithPerRPCCredentials()` provides the functionality for this purpose.
Every authenticated request from the client has a token. The token can be obtained with `grpc.metadata.FromContext()` in the server side. The server can obtain who is issuing the request and when the user was authorized. The information will be filled by the API layer in the header (`etcdserverpb.RequestHeader.Username` and `etcdserverpb.RequestHeader.AuthRevision`) of a raft log entry (`etcdserverpb.InternalRaftRequest`).
### Checking permission in the state machine
The auth info in `etcdserverpb.RequestHeader` is checked in the apply phase of the state machine. This step checks that the user has been granted permission to the requested keys at the latest revision of the auth store.
### Two types of tokens: simple and JWT
There are two token types: simple and JWT. The simple token isn't designed for production use cases. Its tokens aren't cryptographically signed and servers must statefully track token-user correspondence; it is meant for development testing. JWT tokens should be used for production deployments since they are cryptographically signed and verified. From the implementation perspective, JWT is stateless. A JWT token can carry metadata such as the username and revision, so servers don't need to remember the correspondence between tokens and the metadata.
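To illustrate why a signed token lets the server stay stateless, here is a minimal sketch. It is deliberately not the JWT format: it uses a plain HMAC-SHA256 signature over a `username:revision` payload, and `signToken` and `verifyToken` are hypothetical helper names. The point is only that the server can recover the username and revision from the token itself.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

var errBadToken = errors.New("invalid token")

// sign returns the base64url-encoded HMAC-SHA256 of payload.
func sign(key []byte, payload string) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(payload))
	return mac.Sum(nil)
}

// signToken embeds the username and auth revision in the token itself.
func signToken(key []byte, username string, revision uint64) string {
	raw := username + ":" + strconv.FormatUint(revision, 10)
	payload := base64.RawURLEncoding.EncodeToString([]byte(raw))
	sig := base64.RawURLEncoding.EncodeToString(sign(key, payload))
	return payload + "." + sig
}

// verifyToken checks the signature and recovers the metadata without
// consulting any server-side token table.
func verifyToken(key []byte, token string) (string, uint64, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 2 {
		return "", 0, errBadToken
	}
	got, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil || !hmac.Equal(got, sign(key, parts[0])) {
		return "", 0, errBadToken
	}
	raw, err := base64.RawURLEncoding.DecodeString(parts[0])
	if err != nil {
		return "", 0, errBadToken
	}
	// The revision follows the last colon, so usernames may contain colons.
	i := strings.LastIndex(string(raw), ":")
	if i < 0 {
		return "", 0, errBadToken
	}
	rev, err := strconv.ParseUint(string(raw[i+1:]), 10, 64)
	if err != nil {
		return "", 0, errBadToken
	}
	return string(raw[:i]), rev, nil
}

func main() {
	key := []byte("demo-signing-key") // hypothetical key for illustration only
	token := signToken(key, "alice", 7)
	user, rev, err := verifyToken(key, token)
	fmt.Println(user, rev, err)
}
```

A simple token, by contrast, would be an opaque random string that the server must look up in its own table to find the user.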
## Notes on the difference between KVS models and file system models
etcd v3 is a KVS, not a file system. Permissions are therefore granted to users in the form of an exact key name or a key range like `["start key", "end key")`. This means that granting a permission on a nonexistent key is possible, so users should take care to avoid unintended permission grants. In a file-system-like system (e.g. Chubby or ZooKeeper), an inode-like data structure can hold the permission information, so granting permission on a nonexistent key is not possible (except in the case of sticky bits).
Unlike file-system-like systems, the etcd v3 model requires multiple lookups of the metadata. The worst-case lookup cost is the sum of the user's granted keys and intervals. This cost cannot be avoided because v3's flat key space is completely different from Unix's file system model (where every inode includes permission metadata). In practice the cost won't be a serious problem because the metadata is small enough to benefit from caching.
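The half-open range check and its linear worst-case cost can be sketched as follows. The `keyRange` type and `permitted` function are hypothetical names chosen for illustration; the actual auth store may organize its grants differently.

```go
package main

import (
	"bytes"
	"fmt"
)

// keyRange is a hypothetical permission entry: an exact key when end is nil,
// otherwise the half-open interval [start, end).
type keyRange struct {
	start, end []byte
}

func (r keyRange) contains(key []byte) bool {
	if r.end == nil {
		return bytes.Equal(key, r.start)
	}
	return bytes.Compare(r.start, key) <= 0 && bytes.Compare(key, r.end) < 0
}

// permitted scans every grant, so the worst case is linear in the number of
// granted keys and intervals.
func permitted(grants []keyRange, key []byte) bool {
	for _, g := range grants {
		if g.contains(key) {
			return true
		}
	}
	return false
}

func main() {
	grants := []keyRange{
		{start: []byte("config")},                   // exact key
		{start: []byte("app/"), end: []byte("app0")}, // every key with prefix "app/"
	}
	// The key need not exist for the permission to apply.
	fmt.Println(permitted(grants, []byte("app/not-created-yet")))
	fmt.Println(permitted(grants, []byte("app0"))) // end key is excluded
}
```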


@ -0,0 +1,25 @@
# Data model
etcd is designed to reliably store infrequently updated data and provide reliable watch queries. etcd exposes previous versions of key-value pairs to support inexpensive snapshots and watch history events (“time travel queries”). A persistent, multi-version, concurrency-control data model is a good fit for these use cases.
etcd stores data in a multiversion [persistent][persistent-ds] key-value store. The persistent key-value store preserves the previous version of a key-value pair when its value is superseded with new data. The key-value store is effectively immutable; its operations do not update the structure in-place, but instead always generate a new updated structure. All past versions of keys are still accessible and watchable after modification. To prevent the data store from growing indefinitely over time from maintaining old versions, the store may be compacted to shed the oldest versions of superseded data.
### Logical view
The store's logical view is a flat binary key space. The key space has a lexically sorted index on byte string keys so range queries are inexpensive.
The key space maintains multiple revisions. Each atomic mutative operation (e.g., a transaction operation may contain multiple operations) creates a new revision on the key space. All data held by previous revisions remains unchanged. Old versions of a key can still be accessed through previous revisions. Likewise, revisions are indexed as well; ranging over revisions with watchers is efficient. If the store is compacted to recover space, revisions before the compact revision will be removed.
A key's lifetime spans a generation. Each key may have one or multiple generations. Creating a key increments the generation of that key, starting at 1 if the key never existed. Deleting a key generates a key tombstone, concluding the key's current generation. Each modification of a key creates a new version of the key. Once a compaction happens, any generation that ended before the given revision will be removed, and values set before the compaction revision, except the latest one, will be removed.
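The relationship between revisions, versions, and tombstones can be made concrete with a toy in-memory store. This is a sketch under simplifying assumptions: the `store` type and its methods are hypothetical, the revision counter starts at 0, and compaction is left out.

```go
package main

import "fmt"

// entry records one modification of a key at a store revision.
type entry struct {
	rev       int64
	value     string
	tombstone bool
}

// store is a toy multi-version store: every mutation gets a new revision
// and old entries are never overwritten.
type store struct {
	rev     int64
	history map[string][]entry
	version map[string]int64 // writes since the key's current generation began
}

func newStore() *store {
	return &store{history: make(map[string][]entry), version: make(map[string]int64)}
}

func (s *store) put(key, value string) int64 {
	s.rev++
	s.history[key] = append(s.history[key], entry{rev: s.rev, value: value})
	s.version[key]++
	return s.rev
}

// del writes a tombstone, ending the key's current generation.
func (s *store) del(key string) int64 {
	if s.version[key] == 0 {
		return s.rev // nothing to delete; no new revision in this sketch
	}
	s.rev++
	s.history[key] = append(s.history[key], entry{rev: s.rev, tombstone: true})
	s.version[key] = 0
	return s.rev
}

// getAt reads the key as it was at revision at.
func (s *store) getAt(key string, at int64) (string, bool) {
	hist := s.history[key]
	for i := len(hist) - 1; i >= 0; i-- {
		if hist[i].rev <= at {
			if hist[i].tombstone {
				return "", false
			}
			return hist[i].value, true
		}
	}
	return "", false
}

func main() {
	s := newStore()
	r1 := s.put("foo", "a")
	r2 := s.put("foo", "b")
	r3 := s.del("foo")
	r4 := s.put("foo", "c") // starts a new generation; version resets to 1

	for _, r := range []int64{r1, r2, r3, r4} {
		v, ok := s.getAt("foo", r)
		fmt.Printf("revision %d: value=%q exists=%v\n", r, v, ok)
	}
	fmt.Println("current version of foo:", s.version["foo"])
}
```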
### Physical view
etcd stores the physical data as key-value pairs in a persistent [b+tree][b+tree]. To be efficient, each revision of the store's state only contains the delta from its previous revision. A single revision may correspond to multiple keys in the tree.
The key of a key-value pair is a 3-tuple (major, sub, type). Major is the store revision holding the key. Sub differentiates among keys within the same revision. Type is an optional suffix for special values (e.g., `t` if the value contains a tombstone). The value of the key-value pair contains the modification from the previous revision, that is, one delta from the previous revision. The b+tree is ordered by key in lexical byte-order. Ranged lookups over revision deltas are fast; this enables quickly finding modifications from one specific revision to another. Compaction removes out-of-date key-value pairs.
etcd also keeps a secondary in-memory [btree][btree] index to speed up range queries over keys. The keys in the btree index are the keys of the store exposed to the user. The value is a pointer to the modification in the persistent b+tree. Compaction removes dead pointers.
[persistent-ds]: https://en.wikipedia.org/wiki/Persistent_data_structure
[btree]: https://en.wikipedia.org/wiki/B-tree
[b+tree]: https://en.wikipedia.org/wiki/B%2B_tree


@ -0,0 +1,97 @@
# Glossary
This document defines the various terms used in etcd documentation, command line and source code.
## Alarm
The etcd server raises an alarm whenever the cluster needs operator intervention to remain reliable.
## Authentication
Authentication manages user access permissions for etcd resources.
## Client
A client connects to the etcd cluster to issue service requests such as fetching key-value pairs, writing data, or watching for updates.
## Cluster
A cluster consists of several members.
The node in each member follows the raft consensus protocol to replicate logs. The cluster receives proposals from members, commits them, and applies them to the local store.
## Compaction
Compaction discards all etcd event history and superseded keys prior to a given revision. It is used to reclaim storage space in the etcd backend database.
## Election
The etcd cluster holds elections among its members to choose a leader as part of the raft consensus protocol.
## Endpoint
A URL pointing to an etcd service or resource.
## Key
A user-defined identifier for storing and retrieving user-defined values in etcd.
## Key range
A set of keys containing either an individual key, a lexical interval for all x such that a <= x < b, or all keys greater than or equal to a given key.
## Keyspace
The set of all keys in an etcd cluster.
## Lease
A short-lived renewable contract that deletes keys associated with it on its expiry.
## Member
A logical etcd server that participates in serving an etcd cluster.
## Modification Revision
The first revision to hold the last write to a given key.
## Peer
A peer is another member of the same cluster.
## Proposal
A proposal is a request (for example a write request or a configuration change request) that needs to go through the raft protocol.
## Quorum
The number of active members needed for consensus to modify the cluster state. etcd requires a member majority to reach quorum.
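As a worked example of the majority rule, the arithmetic can be sketched as below; `quorum` and `faultTolerance` are hypothetical helper names.

```go
package main

import "fmt"

// quorum returns the size of a majority in a cluster of n members.
func quorum(n int) int {
	return n/2 + 1
}

// faultTolerance returns how many members may fail while a quorum remains.
func faultTolerance(n int) int {
	return n - quorum(n)
}

func main() {
	for _, n := range []int{1, 3, 4, 5} {
		fmt.Printf("members=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
}
```

Note that a 4-member cluster tolerates no more failures than a 3-member one, which is why odd cluster sizes are the usual choice.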
## Revision
A 64-bit cluster-wide counter that is incremented each time the keyspace is modified.
## Role
A unit of permissions over a set of key ranges which may be granted to a set of users for access control.
## Snapshot
A point-in-time backup of the etcd cluster state.
## Store
The physical storage backing the cluster keyspace.
## Transaction
An atomically executed set of operations. All modified keys in a transaction share the same modification revision.
## Key Version
The number of writes to a key since it was created, starting at 1. The version of a nonexistent or deleted key is 0.
## Watcher
A client opens a watcher to observe updates on a given key range.


@ -0,0 +1,116 @@
# Why etcd
The name "etcd" originated from two ideas, the Unix "/etc" folder and "d"istributed systems. The "/etc" folder is a place to store configuration data for a single system whereas etcd stores configuration information for large scale distributed systems. Hence, a "d"istributed "/etc" is "etcd".
etcd stores metadata in a consistent and fault-tolerant way. Distributed systems use etcd as a consistent key-value store for configuration management, service discovery, and coordinating distributed work. Common distributed patterns using etcd include [leader election][etcd-etcdctl-elect], [distributed locks][etcd-etcdctl-lock], and monitoring machine liveness.
## Use cases
- Container Linux by CoreOS: Applications running on [Container Linux][container-linux] get automatic, zero-downtime Linux kernel updates. Container Linux uses [locksmith] to coordinate updates. locksmith implements a distributed semaphore over etcd to ensure only a subset of a cluster is rebooting at any given time.
- [Kubernetes][kubernetes] stores configuration data into etcd for service discovery and cluster management; etcd's consistency is crucial for correctly scheduling and operating services. The Kubernetes API server persists cluster state into etcd. It uses etcd's watch API to monitor the cluster and roll out critical configuration changes.
## etcd versus other key-value stores
When deciding whether to use etcd as a key-value store, it's worth keeping in mind etcd's main goal. Namely, etcd is designed as a general substrate for large scale distributed systems. These are systems that will never tolerate split-brain operation and are willing to sacrifice availability to achieve this end. An etcd cluster is meant to provide consistent key-value storage with best of class stability, reliability, scalability and performance. The upshot of this focus is that many [organizations][production-users] already use etcd to implement production systems such as container schedulers, service discovery services, distributed data storage, and more.
Perhaps etcd already seems like a good fit, but as with all technological decisions, proceed with caution. Please note this documentation is written by the etcd team. Although the ideal is a disinterested comparison of technology and features, the authors' expertise and biases obviously favor etcd. Use only as directed.
The table below is a handy quick reference for spotting the differences among etcd and its most popular alternatives at a glance. Further commentary and details for each column are in the sections following the table.
| | etcd | ZooKeeper | Consul | NewSQL (Cloud Spanner, CockroachDB, TiDB) |
| --- | --- | --- | --- | --- |
| Concurrency Primitives | [Lock RPCs][etcd-v3lock], [Election RPCs][etcd-v3election], [command line locks][etcd-etcdctl-lock], [command line elections][etcd-etcdctl-elect], [recipes][etcd-recipe] in Go | External [curator recipes][curator] in Java | [Native lock API][consul-lock] | [Rare][newsql-leader], if any |
| Linearizable Reads | [Yes][etcd-linread] | No | [Yes][consul-linread] | Sometimes |
| Multi-version Concurrency Control | [Yes][etcd-mvcc] | No | No | Sometimes |
| Transactions | [Field compares, Read, Write][etcd-txn] | [Version checks, Write][zk-txn] | [Field compare, Lock, Read, Write][consul-txn] | SQL-style |
| Change Notification | [Historical and current key intervals][etcd-watch] | [Current keys and directories][zk-watch] | [Current keys and prefixes][consul-watch] | Triggers (sometimes) |
| User permissions | [Role based][etcd-rbac] | [ACLs][zk-acl] | [ACLs][consul-acl] | Varies (per-table [GRANT][cockroach-grant], per-database [roles][spanner-roles]) |
| HTTP/JSON API | [Yes][etcd-json] | No | [Yes][consul-json] | Rarely |
| Membership Reconfiguration | [Yes][etcd-reconfig] | [>3.5.0][zk-reconfig] | [Yes][consul-reconfig] | Yes |
| Maximum reliable database size | Several gigabytes | Hundreds of megabytes (sometimes several gigabytes) | Hundreds of MBs | Terabytes+ |
| Minimum read linearization latency | Network RTT | No read linearization | RTT + fsync | Clock barriers (atomic, NTP) |
### ZooKeeper
ZooKeeper solves the same problem as etcd: distributed system coordination and metadata storage. However, etcd has the luxury of hindsight taken from engineering and operational experience with ZooKeeper's design and implementation. The lessons learned from ZooKeeper certainly informed etcd's design, helping it support large scale systems like Kubernetes. The improvements etcd made over ZooKeeper include:
* Dynamic cluster membership reconfiguration
* Stable read/write under high load
* A multi-version concurrency control data model
* Reliable key monitoring which never silently drops events
* Lease primitives decoupling connections from sessions
* APIs for safe distributed shared locks
Furthermore, etcd supports a wide range of languages and frameworks out of the box. Whereas ZooKeeper has its own custom Jute RPC protocol, which is totally unique to ZooKeeper and limits its [supported language bindings][zk-bindings], etcd's client protocol is built from [gRPC][grpc], a popular RPC framework with language bindings for Go, C++, Java, and more. Likewise, gRPC can be serialized into JSON over HTTP, so even general command line utilities like `curl` can talk to it. Since systems can select from a variety of choices, they are built on etcd with native tooling rather than around etcd with a single fixed set of technologies.
When considering features, support, and stability, new applications planning to use ZooKeeper for a consistent key value store would do well to choose etcd instead.
### Consul
Consul bills itself as an end-to-end service discovery framework. To wit, it includes services such as health checking, failure detection, and DNS. Incidentally, Consul also exposes a key value store with mediocre performance and an intricate API. As it stands in Consul 0.7, the storage system does not scale well; systems requiring millions of keys will suffer from high latencies and memory pressure. The key value API is missing, most notably, multi-version keys, conditional transactions, and reliable streaming watches.
etcd and Consul solve different problems. If looking for a distributed consistent key value store, etcd is a better choice over Consul. If looking for end-to-end cluster service discovery, etcd will not have enough features; choose Kubernetes, Consul, or SmartStack.
### NewSQL (Cloud Spanner, CockroachDB, TiDB)
Both etcd and NewSQL databases (e.g., [Cockroach][cockroach], [TiDB][tidb], [Google Spanner][spanner]) provide strong data consistency guarantees with high availability. However, the significantly different system design parameters lead to significantly different client APIs and performance characteristics.
NewSQL databases are meant to horizontally scale across data centers. These systems typically partition data across multiple consistent replication groups (shards), potentially distant, storing data sets on the order of terabytes and above. This sort of scaling makes them poor candidates for distributed coordination as they have long latencies from waiting on clocks and expect updates with mostly localized dependency graphs. The data is organized into tables, including SQL-style query facilities with richer semantics than etcd, but at the cost of additional complexity for processing, planning, and optimizing queries.
In short, choose etcd for storing metadata or coordinating distributed applications. If storing more than a few GB of data or if full SQL queries are needed, choose a NewSQL database.
## Using etcd for metadata
etcd replicates all data within a single consistent replication group. For storing up to a few GB of data with consistent ordering, this is the most efficient approach. Each modification of cluster state, which may change multiple keys, is assigned a globally unique ID, called a revision in etcd, from a monotonically increasing counter for reasoning over ordering. Since there's only a single replication group, the modification request only needs to go through the raft protocol to commit. By limiting consensus to one replication group, etcd gets distributed consistency with a simple protocol while achieving low latency and high throughput.
The replication behind etcd cannot horizontally scale because it lacks data sharding. In contrast, NewSQL databases usually shard data across multiple consistent replication groups, storing data sets on the order of terabytes and above. However, to assign each modification a globally unique and increasing ID, each request must go through an additional coordination protocol among replication groups. This extra coordination step may potentially conflict on the global ID, forcing ordered requests to retry. The result is a more complicated approach with typically worse performance than etcd for strict ordering.
If an application reasons primarily about metadata or metadata ordering, such as to coordinate processes, choose etcd. If the application needs a large data store spanning multiple data centers and does not heavily depend on strong global ordering properties, choose a NewSQL database.
## Using etcd for distributed coordination
etcd has distributed coordination primitives such as event watches, leases, elections, and distributed shared locks out of the box. These primitives are both maintained and supported by the etcd developers; leaving these primitives to external libraries shirks the responsibility of developing foundational distributed software, essentially leaving the system incomplete. NewSQL databases usually expect these distributed coordination primitives to be authored by third parties. Likewise, ZooKeeper famously has a separate and independent [library][curator] of coordination recipes. Consul, which provides a native locking API, goes so far as to apologize that it's “[not a bulletproof method][consul-bulletproof]”.
In theory, it's possible to build these primitives atop any storage system providing strong consistency. However, the algorithms tend to be subtle; it is easy to develop a locking algorithm that appears to work, only to have it suddenly break due to thundering herd and timing skew. Furthermore, other primitives supported by etcd, such as transactional memory, depend on etcd's MVCC data model; simple strong consistency is not enough.
For distributed coordination, choosing etcd can help prevent operational headaches and save engineering effort.
[production-users]: ../production-users.md
[grpc]: http://www.grpc.io
[consul-bulletproof]: https://www.consul.io/docs/internals/sessions.html
[curator]: http://curator.apache.org/
[cockroach]: https://github.com/cockroachdb/cockroach
[spanner]: https://cloud.google.com/spanner/
[tidb]: https://github.com/pingcap/tidb
[etcd-v3lock]: https://godoc.org/github.com/coreos/etcd/etcdserver/api/v3lock/v3lockpb
[etcd-v3election]: https://godoc.org/github.com/coreos/etcd/etcdserver/api/v3election/v3electionpb
[etcd-etcdctl-lock]: ../../etcdctl/README.md#lock-lockname
[etcd-etcdctl-elect]: ../../etcdctl/README.md#elect-options-election-name-proposal
[etcd-mvcc]: data_model.md
[etcd-recipe]: https://godoc.org/github.com/coreos/etcd/contrib/recipes
[consul-lock]: https://www.consul.io/docs/commands/lock.html
[newsql-leader]: http://dl.acm.org/citation.cfm?id=2960999
[etcd-reconfig]: ../op-guide/runtime-configuration.md
[zk-reconfig]: https://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html
[consul-reconfig]: https://www.consul.io/docs/guides/servers.html
[etcd-linread]: api_guarantees.md#linearizability
[consul-linread]: https://www.consul.io/docs/agent/http.html#consistency
[etcd-json]: ../dev-guide/api_grpc_gateway.md
[consul-json]: https://www.consul.io/docs/agent/http.html#formatted-json-output
[etcd-txn]: api.md#transaction
[zk-txn]: https://zookeeper.apache.org/doc/r3.4.3/api/org/apache/zookeeper/ZooKeeper.html#multi(java.lang.Iterable)
[consul-txn]: https://www.consul.io/docs/agent/http/kv.html#txn
[etcd-watch]: api.md#watch-streams
[zk-watch]: https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkWatches
[consul-watch]: https://www.consul.io/docs/agent/watches.html
[etcd-commonname]: ../op-guide/authentication.md#using-tls-common-name
[etcd-rbac]: ../op-guide/authentication.md#working-with-roles
[zk-acl]: https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl
[consul-acl]: https://www.consul.io/docs/internals/acl.html
[cockroach-grant]: https://www.cockroachlabs.com/docs/grant.html
[spanner-roles]: https://cloud.google.com/spanner/docs/iam#roles
[zk-bindings]: https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#ch_bindings
[container-linux]: https://coreos.com/why
[locksmith]: https://github.com/coreos/locksmith
[kubernetes]: http://kubernetes.io/docs/whatisk8s


@ -1,134 +1,116 @@
# Metrics
**NOTE: The metrics feature is considered experimental. We may add/change/remove metrics without warning in future releases.**
etcd uses [Prometheus][prometheus] for metrics reporting. The metrics can be used for real-time monitoring and debugging. etcd does not persist its metrics; if a member restarts, the metrics will be reset.
etcd uses [Prometheus][prometheus] for metrics reporting in the server. The metrics can be used for real-time monitoring and debugging.
etcd only stores these data in memory. If a member restarts, metrics will reset.
The simplest way to see the available metrics is to cURL the metrics endpoint `/metrics` of etcd. The format is described [here](http://prometheus.io/docs/instrumenting/exposition_formats/).
The simplest way to see the available metrics is to cURL the metrics endpoint `/metrics`. The format is described [here](http://prometheus.io/docs/instrumenting/exposition_formats/).
Follow the [Prometheus getting started doc][prometheus-getting-started] to spin up a Prometheus server to collect etcd metrics.
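For ad-hoc inspection without a Prometheus server, the text returned by the `/metrics` endpoint can be filtered with a few lines of code. The sketch below is a simplified reader, not a full parser of the exposition format: `parseSamples` is a hypothetical helper, it assumes each sample line ends with the value (no trailing timestamp), and the sample text in `main` is made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSamples returns the samples whose metric name starts with prefix,
// keyed by the full "name{labels}" string.
func parseSamples(text, prefix string) map[string]float64 {
	out := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip blank lines, "# HELP" / "# TYPE" comments, and other namespaces.
		if line == "" || strings.HasPrefix(line, "#") || !strings.HasPrefix(line, prefix) {
			continue
		}
		i := strings.LastIndex(line, " ")
		if i < 0 {
			continue
		}
		v, err := strconv.ParseFloat(line[i+1:], 64)
		if err != nil {
			continue
		}
		out[strings.TrimSpace(line[:i])] = v
	}
	return out
}

func main() {
	// Made-up sample; in practice this would be the body of an HTTP GET
	// to a member's /metrics endpoint.
	body := `# HELP etcd_server_has_leader Whether or not a leader exists.
# TYPE etcd_server_has_leader gauge
etcd_server_has_leader 1
etcd_server_proposals_pending 0
go_goroutines 42
`
	for name, value := range parseSamples(body, "etcd_") {
		fmt.Println(name, value)
	}
}
```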
The naming of metrics follows the suggested [best practice of Prometheus][prometheus-naming]. A metric name has an `etcd` prefix as its namespace and a subsystem prefix (for example `wal` and `etcdserver`).
The naming of metrics follows the suggested [Prometheus best practices][prometheus-naming]. A metric name has an `etcd` or `etcd_debugging` prefix as its namespace and a subsystem prefix (for example `wal` and `etcdserver`).
etcd now exposes the following metrics:
## etcd namespace metrics
## etcdserver
The metrics under the `etcd` prefix are for monitoring and alerting. They are stable high level metrics. If there is any change of these metrics, it will be included in release notes.
| Name | Description | Type |
|-----------------------------------------|--------------------------------------------------|-----------|
| file_descriptors_used_total | The total number of file descriptors used | Gauge |
| proposal_durations_seconds | The latency distributions of committing proposal | Histogram |
| pending_proposal_total | The total number of pending proposals | Gauge |
| proposal_failed_total | The total number of failed proposals | Counter |
Metrics that are etcd2 related are documented [v2 metrics guide][v2-http-metrics].
High file descriptor (`file_descriptors_used_total`) usage (near the file descriptor limit of the process) indicates a potential file descriptor exhaustion issue. That might cause etcd to fail to create new WAL files and panic.
### Server
[Proposal][glossary-proposal] durations (`proposal_durations_seconds`) provides a histogram about the proposal commit latency. Latency can be introduced into this process by network and disk IO.
These metrics describe the status of the etcd server. In order to detect outages or problems for troubleshooting, the server metrics of every production etcd cluster should be closely monitored.
Pending proposals (`pending_proposal_total`) gives you an idea of how many proposals are in the queue and waiting for commit. An increasing pending number indicates a high client load or an unstable cluster.
All these metrics are prefixed with `etcd_server_`
Failed proposals (`proposal_failed_total`) are normally related to two issues: temporary failures related to a leader election or longer duration downtime caused by a loss of quorum in the cluster.
| Name | Description | Type |
|---------------------------|----------------------------------------------------------|---------|
| has_leader | Whether or not a leader exists. 1 is existence, 0 is not.| Gauge |
| leader_changes_seen_total | The number of leader changes seen. | Counter |
| proposals_committed_total | The total number of consensus proposals committed. | Gauge |
| proposals_applied_total | The total number of consensus proposals applied. | Gauge |
| proposals_pending | The current number of pending proposals. | Gauge |
| proposals_failed_total | The total number of failed proposals seen. | Counter |
## wal
`has_leader` indicates whether the member has a leader. If a member does not have a leader, it is
totally unavailable. If all the members in the cluster do not have any leader, the entire cluster
is totally unavailable.
| Name | Description | Type |
|------------------------------------|--------------------------------------------------|-----------|
| fsync_durations_seconds | The latency distributions of fsync called by wal | Histogram |
| last_index_saved | The index of the last entry saved by wal | Gauge |
`leader_changes_seen_total` counts the number of leader changes the member has seen since its start. Rapid leadership changes impact the performance of etcd significantly. It also signals that the leader is unstable, perhaps due to network connectivity issues or excessive load hitting the etcd cluster.
Abnormally high fsync duration (`fsync_durations_seconds`) indicates disk issues and might cause the cluster to be unstable.
`proposals_committed_total` records the total number of consensus proposals committed. This gauge should increase over time if the cluster is healthy. Several healthy members of an etcd cluster may have different total committed proposals at once. This discrepancy may be due to recovering from peers after starting, lagging behind the leader, or being the leader and therefore having the most commits. It is important to monitor this metric across all the members in the cluster; a consistently large lag between a single member and its leader indicates that member is slow or unhealthy.
`proposals_applied_total` records the total number of consensus proposals applied. The etcd server applies every committed proposal asynchronously. The difference between `proposals_committed_total` and `proposals_applied_total` should usually be small (within a few thousands even under high load). If the difference between them continues to rise, it indicates that the etcd server is overloaded. This might happen when applying expensive queries like heavy range queries or large txn operations.
`proposals_pending` indicates how many proposals are queued to commit. Rising pending proposals suggests there is a high client load or the member cannot commit proposals.
Failures counted by `proposals_failed_total` are normally related to two issues: temporary failures related to a leader election or longer downtime caused by a loss of quorum in the cluster.
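The guidance above on the server metrics can be turned into a simple per-member check. This is a sketch only: `serverSample` and `diagnose` are hypothetical names, and the thresholds passed in `main` are placeholders rather than recommended values.

```go
package main

import "fmt"

// serverSample holds values read from one member's etcd_server_ metrics.
type serverSample struct {
	hasLeader float64 // has_leader
	committed float64 // proposals_committed_total
	applied   float64 // proposals_applied_total
	pending   float64 // proposals_pending
}

// diagnose returns human-readable warnings for one member. maxApplyLag and
// maxPending are caller-chosen thresholds.
func diagnose(s serverSample, maxApplyLag, maxPending float64) []string {
	var warnings []string
	if s.hasLeader != 1 {
		warnings = append(warnings, "member has no leader")
	}
	if lag := s.committed - s.applied; lag > maxApplyLag {
		warnings = append(warnings, fmt.Sprintf("apply lag of %.0f proposals", lag))
	}
	if s.pending > maxPending {
		warnings = append(warnings, fmt.Sprintf("%.0f proposals pending", s.pending))
	}
	return warnings
}

func main() {
	healthy := serverSample{hasLeader: 1, committed: 120000, applied: 119950, pending: 2}
	overloaded := serverSample{hasLeader: 1, committed: 120000, applied: 100000, pending: 2}

	fmt.Println(diagnose(healthy, 5000, 100))
	fmt.Println(diagnose(overloaded, 5000, 100))
}
```

Because a single snapshot can be misleading, a real alert would look at whether the lag keeps rising over time rather than at one reading.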
### Disk
These metrics describe the status of the disk operations.
All these metrics are prefixed with `etcd_disk_`.
| Name | Description | Type |
|------------------------------------|-------------------------------------------------------|-----------|
| wal_fsync_duration_seconds | The latency distributions of fsync called by wal | Histogram |
| backend_commit_duration_seconds | The latency distributions of commit called by backend.| Histogram |
A `wal_fsync` is called when etcd persists its log entries to disk before applying them.
A `backend_commit` is called when etcd commits an incremental snapshot of its most recent changes to disk.
High disk operation latencies (`wal_fsync_duration_seconds` or `backend_commit_duration_seconds`) often indicate disk issues. It may cause high request latency or make the cluster unstable.
### Network
These metrics describe the status of the network.
All these metrics are prefixed with `etcd_network_`
| Name | Description | Type |
|---------------------------|--------------------------------------------------------------------|---------------|
| peer_sent_bytes_total | The total number of bytes sent to the peer with ID `To`. | Counter(To) |
| peer_received_bytes_total | The total number of bytes received from the peer with ID `From`. | Counter(From) |
| peer_sent_failures_total | The total number of send failures from the peer with ID `To`. | Counter(To) |
| peer_received_failures_total | The total number of receive failures from the peer with ID `From`. | Counter(From) |
| peer_round_trip_time_seconds | Round-Trip-Time histogram between peers. | Histogram(To) |
| client_grpc_sent_bytes_total | The total number of bytes sent to grpc clients. | Counter |
| client_grpc_received_bytes_total| The total number of bytes received from grpc clients. | Counter |
`peer_sent_bytes_total` counts the total number of bytes sent to a specific peer. Usually the leader member sends more data than other members since it is responsible for transmitting replicated data.
`peer_received_bytes_total` counts the total number of bytes received from a specific peer. Usually follower members receive data only from the leader member.
### gRPC requests
These metrics are exposed via [go-grpc-prometheus][go-grpc-prometheus].
## etcd_debugging namespace metrics
The metrics under the `etcd_debugging` prefix are for debugging. They are very implementation dependent and volatile. They might be changed or removed without any warning in new etcd releases. Some of the metrics might be moved to the `etcd` prefix when they become more stable.
## http requests
These metrics describe the serving of requests (non-watch events) served by etcd members in non-proxy mode: total
incoming requests, request failures and processing latency (inc. raft rounds for storage). They are useful for tracking
user-generated traffic hitting the etcd cluster.
All these metrics are prefixed with `etcd_http_`
| Name | Description | Type |
|--------------------------------|-----------------------------------------------------------------------------------------|--------------------|
| received_total | Total number of events after parsing and auth. | Counter(method) |
| failed_total | Total number of failed events.   | Counter(method,error) |
| successful_duration_second | Bucketed handling times of the requests, including raft rounds for writes. | Histogram(method) |
Example Prometheus queries that may be useful from these metrics (across all etcd members):
* `sum(rate(etcd_http_failed_total{job="etcd"}[1m])) by (method) / sum(rate(etcd_http_received_total{job="etcd"}[1m])) by (method)`
Shows the fraction of events that failed by HTTP method across all members, across a time window of `1m`.
* `sum(rate(etcd_http_received_total{job="etcd",method="GET"}[1m])) by (method)`
`sum(rate(etcd_http_received_total{job="etcd",method!="GET"}[1m])) by (method)`
Shows the rate of successful readonly/write queries across all servers, across a time window of `1m`.
* `histogram_quantile(0.9, sum(increase(etcd_http_successful_processing_seconds{job="etcd",method="GET"}[5m]) ) by (le))`
`histogram_quantile(0.9, sum(increase(etcd_http_successful_processing_seconds{job="etcd",method!="GET"}[5m]) ) by (le))`
Show the 0.90-tile latency (in seconds) of read/write (respectively) event handling across all members, with a window of `5m`.
### Snapshot
| Name | Description | Type |
|--------------------------------------------|------------------------------------------------------------|-----------|
| snapshot_save_total_duration_seconds | The total latency distributions of save called by snapshot | Histogram |
Abnormally high snapshot duration (`snapshot_save_total_duration_seconds`) indicates disk issues and might cause the cluster to be unstable.
## rafthttp
| Name | Description | Type | Labels |
|-----------------------------------|--------------------------------------------|--------------|--------------------------------|
| message_sent_latency_seconds | The latency distributions of messages sent | HistogramVec | sendingType, msgType, remoteID |
| message_sent_failed_total | The total number of failed messages sent | Summary | sendingType, msgType, remoteID |
Abnormally high message duration (`message_sent_latency_seconds`) indicates network issues and might cause the cluster to be unstable.
An increase in message failures (`message_sent_failed_total`) indicates more severe network issues and might cause the cluster to be unstable.
Label `sendingType` is the connection type used to send messages. `message`, `msgapp` and `msgappv2` use HTTP streaming, while `pipeline` issues an HTTP request for each message.
Label `msgType` is the type of raft message. `MsgApp` is a log replication message; `MsgSnap` is a snapshot install message; `MsgProp` is a proposal forward message; the others are used to maintain raft internal status. If you have a large snapshot, you would expect a long `MsgSnap` sending latency. For other types of messages, you would expect low latency, comparable to your ping latency if you have enough network bandwidth.
Label `remoteID` is the member ID of the message destination.
## Prometheus supplied metrics
The Prometheus client library provides a number of metrics under the `go` and `process` namespaces. There are a few that are particularly interesting.
| Name | Description | Type |
|-----------------------------------|--------------------------------------------|--------------|
| process_open_fds | Number of open file descriptors. | Gauge |
| process_max_fds | Maximum number of open file descriptors. | Gauge |
Heavy file descriptor (`process_open_fds`) usage (i.e., near the process's file descriptor limit, `process_max_fds`) indicates a potential file descriptor exhaustion issue. If the file descriptors are exhausted, etcd may panic because it cannot create new WAL files.
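To judge how close a member is to file descriptor exhaustion, `process_open_fds` can be compared against `process_max_fds`. The sketch below reads Prometheus text-format metrics on standard input and prints usage as a whole-number percentage. The function name `fd_usage_percent` is hypothetical, and the metrics URL in the usage comment is an assumption about where the member serves its metrics.

```shell
# Print file descriptor usage as a whole-number percentage, reading
# Prometheus text-format metrics on stdin. Exits non-zero if the limit
# metric is missing. fd_usage_percent is a hypothetical helper.
fd_usage_percent() {
  awk '
    $1 == "process_open_fds" { used = $2 }
    $1 == "process_max_fds"  { limit = $2 }
    END {
      if (limit > 0) { printf "%d\n", (used * 100) / limit } else { exit 1 }
    }'
}

# Usage (assumes the member serves metrics on its local client URL):
#   curl -s http://127.0.0.1:2379/metrics | fd_usage_percent
```

An alerting threshold is a local policy decision; the helper only reports the ratio.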
## proxy
etcd members operating in proxy mode do not do store operations. They forward all requests
to cluster instances.
Tracking the rate of requests coming from a proxy allows one to pin down which machine is performing most reads/writes.
All these metrics are prefixed with `etcd_proxy_`
| Name | Description | Type |
|---------------------------|-----------------------------------------------------------------------------------------|--------------------|
| requests_total | Total number of requests by this proxy instance. | Counter(method) |
| handled_total | Total number of fully handled requests, with responses from etcd members. | Counter(method) |
| dropped_total | Total number of dropped requests due to forwarding errors to etcd members.  | Counter(method,error) |
| handling_duration_seconds | Bucketed handling times by HTTP method, including round trip to member instances. | Histogram(method) |
Example Prometheus queries that may be useful from these metrics (across all etcd servers):
* `sum(rate(etcd_proxy_handled_total{job="etcd"}[1m])) by (method)`
Rate of requests (by HTTP method) handled by all proxies, across a window of `1m`.
* `histogram_quantile(0.9, sum(increase(etcd_proxy_handling_duration_seconds_bucket{job="etcd",method="GET"}[5m])) by (le))`
`histogram_quantile(0.9, sum(increase(etcd_proxy_handling_duration_seconds_bucket{job="etcd",method!="GET"}[5m])) by (le))`
Shows the 90th-percentile latency (in seconds) of handling user requests across all proxy machines, with a window of `5m`.
* `sum(rate(etcd_proxy_dropped_total{job="etcd"}[1m])) by (proxying_error)`
Rate of failed requests on the proxy. This should be 0; spikes here indicate connectivity issues to the etcd cluster.
[glossary-proposal]: glossary.md#proposal
[glossary-proposal]: learning/glossary.md#proposal
[prometheus]: http://prometheus.io/
[prometheus-getting-started]: http://prometheus.io/docs/introduction/getting_started/
[prometheus-naming]: http://prometheus.io/docs/practices/naming/
[v2-http-metrics]: v2/metrics.md#http-requests
[go-grpc-prometheus]: https://github.com/grpc-ecosystem/go-grpc-prometheus

@ -0,0 +1,164 @@
# Authentication Guide
## Overview
Authentication was added in etcd 2.1. The etcd v3 API slightly modified the authentication feature's API and user interface to better fit the new data model. This guide is intended to help users set up basic authentication in etcd v3.
## Special users and roles
There is one special user, `root`, and one special role, `root`.
### User `root`
The `root` user, which has full access to etcd, must be created before activating authentication. The `root` user exists for administrative purposes: managing roles and ordinary users. It must have the `root` role and is allowed to change anything inside etcd.
### Role `root`
The role `root` may be granted to any user, in addition to the root user. A user with the `root` role has both global read-write access and permission to update the cluster's authentication configuration. Furthermore, the `root` role grants privileges for general cluster maintenance, including modifying cluster membership, defragmenting the store, and taking snapshots.
## Working with users
The `user` subcommand for `etcdctl` handles all things having to do with user accounts.
A listing of users can be found with:
```
$ etcdctl user list
```
Creating a user is as easy as
```
$ etcdctl user add myusername
```
Creating a new user will prompt for a new password. The password can be supplied from standard input when the `--interactive=false` option is given.
Roles can be granted and revoked for a user with:
```
$ etcdctl user grant-role myusername foo
$ etcdctl user revoke-role myusername bar
```
The user's settings can be inspected with:
```
$ etcdctl user get myusername
```
And the password for a user can be changed with
```
$ etcdctl user passwd myusername
```
Changing the password will prompt again for a new password. The password can be supplied from standard input when the `--interactive=false` option is given.
Delete an account with:
```
$ etcdctl user delete myusername
```
## Working with roles
The `role` subcommand for `etcdctl` handles everything related to the access controls of particular roles, which are in turn granted to individual users.
List roles with:
```
$ etcdctl role list
```
Create a new role with:
```
$ etcdctl role add myrolename
```
A role has no password; it merely defines a new set of access rights.
Roles are granted access to a single key or a range of keys.
The range can be specified as an interval [start-key, end-key), where start-key should be lexically less than end-key.
Access can be granted as either read, write, or both, as in the following examples:
```
# Give read access to a key /foo
$ etcdctl role grant-permission myrolename read /foo
# Give read access to keys with a prefix /foo/. The prefix is equal to the range [/foo/, /foo0)
$ etcdctl role grant-permission myrolename --prefix=true read /foo/
# Give write-only access to the key at /foo/bar
$ etcdctl role grant-permission myrolename write /foo/bar
# Give full access to keys in a range of [key1, key5)
$ etcdctl role grant-permission myrolename readwrite key1 key5
# Give full access to keys with a prefix /pub/
$ etcdctl role grant-permission myrolename --prefix=true readwrite /pub/
```
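The comment above notes that the prefix `/foo/` is equal to the range [/foo/, /foo0): the end key is the prefix with its last byte incremented (`/` is followed by `0` in ASCII). The following sketch computes that end key. The function name `prefix_range_end` is hypothetical, and the code assumes a non-empty prefix whose last character is plain ASCII below 0x7f; it is an illustration of the rule, not etcd's own implementation.

```shell
# Compute the exclusive end key for a prefix by incrementing its last byte.
# Sketch only: assumes a non-empty prefix ending in an ASCII character
# below 0x7f. prefix_range_end is a hypothetical helper.
prefix_range_end() {
  prefix="$1"
  head="${prefix%?}"              # everything except the last character
  last="${prefix#"$head"}"        # the last character
  code=$(printf '%d' "'$last")    # numeric value of that character
  printf '%s' "$head"
  printf "\\$(printf '%03o' $((code + 1)))\n"   # emit the next byte via an octal escape
}
```

Granting a permission on `/pub/` with `--prefix=true` therefore covers the same keys as granting it on the explicit range from `/pub/` to the value this helper prints for `/pub/`.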
To see what's granted, we can look at the role at any time:
```
$ etcdctl role get myrolename
```
Revocation of permissions is done the same logical way:
```
$ etcdctl role revoke-permission myrolename /foo/bar
```
As is removing a role entirely:
```
$ etcdctl role delete myrolename
```
## Enabling authentication
The minimal steps to enabling auth are as follows. The administrator can set up users and roles before or after enabling authentication, as a matter of preference.
Make sure the root user is created:
```
$ etcdctl user add root
Password of root:
```
Enable authentication:
```
$ etcdctl auth enable
```
After this, etcd is running with authentication enabled. To disable it for any reason, use the reciprocal command:
```
$ etcdctl --user root:rootpw auth disable
```
## Using `etcdctl` to authenticate
`etcdctl` supports a flag similar to `curl`'s for authentication.
```
$ etcdctl --user user:password get foo
```
The password can be taken from a prompt:
```
$ etcdctl --user user get foo
```
Otherwise, all `etcdctl` commands remain the same. Users and roles can still be created and modified, but require authentication by a user with the root role.
## Using TLS Common Name
If an etcd server is launched with the option `--client-cert-auth=true`, the Common Name (CN) field in the client's TLS cert will be used as the etcd user. In this case, the common name authenticates the user and the client does not need a password.

@ -0,0 +1,479 @@
# Clustering Guide
## Overview
Starting an etcd cluster statically requires that each member know the other members of the cluster. In a number of cases, the IPs of the cluster members may be unknown ahead of time. In these cases, the etcd cluster can be bootstrapped with the help of a discovery service.
Once an etcd cluster is up and running, adding or removing members is done via [runtime reconfiguration][runtime-conf]. To better understand the design behind runtime reconfiguration, we suggest reading [the runtime configuration design document][runtime-reconf-design].
This guide will cover the following mechanisms for bootstrapping an etcd cluster:
* [Static](#static)
* [etcd Discovery](#etcd-discovery)
* [DNS Discovery](#dns-discovery)
Each of the bootstrapping mechanisms will be used to create a three machine etcd cluster with the following details:
|Name|Address|Hostname|
|------|---------|------------------|
|infra0|10.0.1.10|infra0.example.com|
|infra1|10.0.1.11|infra1.example.com|
|infra2|10.0.1.12|infra2.example.com|
## Static
As we know the cluster members, their addresses and the size of the cluster before starting, we can use an offline bootstrap configuration by setting the `initial-cluster` flag. Each machine will get either the following environment variables or command line:
```
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380"
ETCD_INITIAL_CLUSTER_STATE=new
```
```
--initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
--initial-cluster-state new
```
Note that the URLs specified in `initial-cluster` are the _advertised peer URLs_, i.e. they should match the value of `initial-advertise-peer-urls` on the respective nodes.
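Because every member must be given the same `initial-cluster` value, it can help to generate the string once rather than type it per machine. The sketch below builds it from `name=ip` pairs. The function name `initial_cluster` is hypothetical, and the code assumes plain HTTP peer URLs on port 2380, as in the example cluster.

```shell
# Build an --initial-cluster value from "name=ip" pairs.
# Assumes plain HTTP peer URLs on port 2380. initial_cluster is a
# hypothetical helper, not an etcd command.
initial_cluster() {
  result=""
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first "="
    ip="${pair#*=}"      # text after the first "="
    result="${result:+$result,}${name}=http://${ip}:2380"
  done
  printf '%s\n' "$result"
}

# Usage:
#   export ETCD_INITIAL_CLUSTER="$(initial_cluster infra0=10.0.1.10 infra1=10.0.1.11 infra2=10.0.1.12)"
```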
If spinning up multiple clusters (or creating and destroying a single cluster) with the same configuration for testing purposes, it is highly recommended that each cluster be given a unique `initial-cluster-token`. By doing this, etcd can generate unique cluster IDs and member IDs for the clusters even if they otherwise have the exact same configuration. This can protect etcd from cross-cluster interaction, which might corrupt the clusters.
etcd listens on [`listen-client-urls`][conf-listen-client] to accept client traffic. An etcd member advertises the URLs specified in [`advertise-client-urls`][conf-adv-client] to other members, proxies, and clients. Please make sure the `advertise-client-urls` are reachable from the intended clients. A common mistake is setting `advertise-client-urls` to localhost, or leaving it at the default, when remote clients need to reach etcd.
On each machine, start etcd with these flags:
```
$ etcd --name infra0 --initial-advertise-peer-urls http://10.0.1.10:2380 \
--listen-peer-urls http://10.0.1.10:2380 \
--listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.10:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
--initial-cluster-state new
```
```
$ etcd --name infra1 --initial-advertise-peer-urls http://10.0.1.11:2380 \
--listen-peer-urls http://10.0.1.11:2380 \
--listen-client-urls http://10.0.1.11:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.11:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
--initial-cluster-state new
```
```
$ etcd --name infra2 --initial-advertise-peer-urls http://10.0.1.12:2380 \
--listen-peer-urls http://10.0.1.12:2380 \
--listen-client-urls http://10.0.1.12:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.12:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
--initial-cluster-state new
```
The command line parameters starting with `--initial-cluster` will be ignored on subsequent runs of etcd. Feel free to remove the environment variables or command line flags after the initial bootstrap process. If the configuration needs changes later (for example, adding or removing members to/from the cluster), see the [runtime configuration][runtime-conf] guide.
### TLS
etcd supports encrypted communication through the TLS protocol. TLS channels can be used for encrypted internal cluster communication between peers as well as encrypted client traffic. This section provides examples for setting up a cluster with peer and client TLS. Additional information detailing etcd's TLS support can be found in the [security guide][security-guide].
#### Self-signed certificates
A cluster using self-signed certificates both encrypts traffic and authenticates its connections. To start a cluster with self-signed certificates, each cluster member should have a unique key pair (`member.crt`, `member.key`) signed by a shared cluster CA certificate (`ca.crt`) for both peer connections and client connections. Certificates may be generated by following the etcd [TLS setup][tls-setup] example.
On each machine, etcd would be started with these flags:
```
$ etcd --name infra0 --initial-advertise-peer-urls https://10.0.1.10:2380 \
--listen-peer-urls https://10.0.1.10:2380 \
--listen-client-urls https://10.0.1.10:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://10.0.1.10:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=https://10.0.1.10:2380,infra1=https://10.0.1.11:2380,infra2=https://10.0.1.12:2380 \
--initial-cluster-state new \
--client-cert-auth --trusted-ca-file=/path/to/ca-client.crt \
--cert-file=/path/to/infra0-client.crt --key-file=/path/to/infra0-client.key \
--peer-client-cert-auth --peer-trusted-ca-file=/path/to/ca-peer.crt \
--peer-cert-file=/path/to/infra0-peer.crt --peer-key-file=/path/to/infra0-peer.key
```
```
$ etcd --name infra1 --initial-advertise-peer-urls https://10.0.1.11:2380 \
--listen-peer-urls https://10.0.1.11:2380 \
--listen-client-urls https://10.0.1.11:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://10.0.1.11:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=https://10.0.1.10:2380,infra1=https://10.0.1.11:2380,infra2=https://10.0.1.12:2380 \
--initial-cluster-state new \
--client-cert-auth --trusted-ca-file=/path/to/ca-client.crt \
--cert-file=/path/to/infra1-client.crt --key-file=/path/to/infra1-client.key \
--peer-client-cert-auth --peer-trusted-ca-file=/path/to/ca-peer.crt \
--peer-cert-file=/path/to/infra1-peer.crt --peer-key-file=/path/to/infra1-peer.key
```
```
$ etcd --name infra2 --initial-advertise-peer-urls https://10.0.1.12:2380 \
--listen-peer-urls https://10.0.1.12:2380 \
--listen-client-urls https://10.0.1.12:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://10.0.1.12:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=https://10.0.1.10:2380,infra1=https://10.0.1.11:2380,infra2=https://10.0.1.12:2380 \
--initial-cluster-state new \
--client-cert-auth --trusted-ca-file=/path/to/ca-client.crt \
--cert-file=/path/to/infra2-client.crt --key-file=/path/to/infra2-client.key \
--peer-client-cert-auth --peer-trusted-ca-file=/path/to/ca-peer.crt \
--peer-cert-file=/path/to/infra2-peer.crt --peer-key-file=/path/to/infra2-peer.key
```
#### Automatic certificates
If the cluster needs encrypted communication but does not require authenticated connections, etcd can be configured to automatically generate its keys. On initialization, each member creates its own set of keys based on its advertised IP addresses and hosts.
On each machine, etcd would be started with these flags:
```
$ etcd --name infra0 --initial-advertise-peer-urls https://10.0.1.10:2380 \
--listen-peer-urls https://10.0.1.10:2380 \
--listen-client-urls https://10.0.1.10:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://10.0.1.10:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=https://10.0.1.10:2380,infra1=https://10.0.1.11:2380,infra2=https://10.0.1.12:2380 \
--initial-cluster-state new \
--auto-tls \
--peer-auto-tls
```
```
$ etcd --name infra1 --initial-advertise-peer-urls https://10.0.1.11:2380 \
--listen-peer-urls https://10.0.1.11:2380 \
--listen-client-urls https://10.0.1.11:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://10.0.1.11:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=https://10.0.1.10:2380,infra1=https://10.0.1.11:2380,infra2=https://10.0.1.12:2380 \
--initial-cluster-state new \
--auto-tls \
--peer-auto-tls
```
```
$ etcd --name infra2 --initial-advertise-peer-urls https://10.0.1.12:2380 \
--listen-peer-urls https://10.0.1.12:2380 \
--listen-client-urls https://10.0.1.12:2379,https://127.0.0.1:2379 \
--advertise-client-urls https://10.0.1.12:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=https://10.0.1.10:2380,infra1=https://10.0.1.11:2380,infra2=https://10.0.1.12:2380 \
--initial-cluster-state new \
--auto-tls \
--peer-auto-tls
```
### Error cases
In the following example, we have not included our new host in the list of enumerated nodes. If this is a new cluster, the node _must_ be added to the list of initial cluster members.
```
$ etcd --name infra1 --initial-advertise-peer-urls http://10.0.1.11:2380 \
--listen-peer-urls http://10.0.1.11:2380 \
--listen-client-urls http://10.0.1.11:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.11:2379 \
--initial-cluster infra0=http://10.0.1.10:2380 \
--initial-cluster-state new
etcd: infra1 not listed in the initial cluster config
exit 1
```
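A start-up script can catch this mistake before launching etcd by checking that the member's own name has an entry in the `initial-cluster` string. This is a minimal sketch; the function name `member_listed` is hypothetical and the check only looks for a `name=` entry, not for matching URLs.

```shell
# Return success if member NAME has an entry in the initial-cluster string.
# member_listed is a hypothetical pre-flight helper, not part of etcd.
member_listed() {
  name="$1"
  cluster="$2"
  # Prefix a comma so that every entry, including the first, starts with ","
  case ",$cluster" in
    *",$name="*) return 0 ;;
    *)           return 1 ;;
  esac
}

# Usage:
#   member_listed infra1 "$ETCD_INITIAL_CLUSTER" || echo "infra1 is missing from initial-cluster" >&2
```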
In this example, we are attempting to map a node (infra0) to a different address (127.0.0.1:2380) than its enumerated address in the cluster list (10.0.1.10:2380). If this node is to listen on multiple addresses, all addresses _must_ be reflected in the `initial-cluster` configuration directive.
```
$ etcd --name infra0 --initial-advertise-peer-urls http://127.0.0.1:2380 \
--listen-peer-urls http://10.0.1.10:2380 \
--listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.10:2379 \
--initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
--initial-cluster-state=new
etcd: error setting up initial cluster: infra0 has different advertised URLs in the cluster and advertised peer URLs list
exit 1
```
If a peer is configured with a different set of configuration arguments and attempts to join this cluster, etcd will report a cluster ID mismatch and will exit.
```
$ etcd --name infra3 --initial-advertise-peer-urls http://10.0.1.13:2380 \
--listen-peer-urls http://10.0.1.13:2380 \
--listen-client-urls http://10.0.1.13:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.13:2379 \
--initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra3=http://10.0.1.13:2380 \
--initial-cluster-state=new
etcd: conflicting cluster ID to the target cluster (c6ab534d07e8fcc4 != bc25ea2a74fb18b0). Exiting.
exit 1
```
## Discovery
In a number of cases, the IPs of the cluster peers may not be known ahead of time. This is common when utilizing cloud providers or when the network uses DHCP. In these cases, rather than specifying a static configuration, use an existing etcd cluster to bootstrap a new one. This process is called "discovery".
There are two methods that can be used for discovery:
* etcd discovery service
* DNS SRV records
### etcd discovery
To better understand the design of the discovery service protocol, we suggest reading the discovery service protocol [documentation][discovery-proto].
#### Lifetime of a discovery URL
A discovery URL identifies a unique etcd cluster. Do not reuse an existing discovery URL; the members of each new cluster share one newly created discovery URL to bootstrap it.
Moreover, discovery URLs should ONLY be used for the initial bootstrapping of a cluster. To change cluster membership after the cluster is already running, see the [runtime reconfiguration][runtime-conf] guide.
#### Custom etcd discovery service
Discovery uses an existing cluster to bootstrap itself. If using a private etcd cluster, create a URL like so:
```
$ curl -X PUT https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/_config/size -d value=3
```
Setting the `size` key under this URL creates a discovery URL with an expected cluster size of 3.
The URL to use in this case will be `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` and the etcd members will use the `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` directory for registration as they start.
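The discovery URL is simply the keys endpoint of the existing cluster followed by `discovery/` and a unique token. The sketch below composes it; the function name `discovery_url` is hypothetical, and the use of `openssl rand` to produce a token is an assumption (any unique token works).

```shell
# Print the discovery URL for a token under a custom discovery service.
# The first argument is the base URL of the existing etcd cluster.
# discovery_url is a hypothetical helper, not an etcd command.
discovery_url() {
  base="$1"
  token="$2"
  printf '%s/v2/keys/discovery/%s\n' "${base%/}" "$token"
}

# Usage (assumes openssl is installed):
#   token=$(openssl rand -hex 20)
#   url=$(discovery_url https://myetcd.local "$token")
#   curl -X PUT "$url/_config/size" -d value=3
```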
**Each member must have a different name flag specified, or else discovery will fail due to duplicated names. `Hostname` or `machine-id` can be a good choice.**
Now we start etcd with those relevant flags for each member:
```
$ etcd --name infra0 --initial-advertise-peer-urls http://10.0.1.10:2380 \
--listen-peer-urls http://10.0.1.10:2380 \
--listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.10:2379 \
--discovery https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83
```
```
$ etcd --name infra1 --initial-advertise-peer-urls http://10.0.1.11:2380 \
--listen-peer-urls http://10.0.1.11:2380 \
--listen-client-urls http://10.0.1.11:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.11:2379 \
--discovery https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83
```
```
$ etcd --name infra2 --initial-advertise-peer-urls http://10.0.1.12:2380 \
--listen-peer-urls http://10.0.1.12:2380 \
--listen-client-urls http://10.0.1.12:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.12:2379 \
--discovery https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83
```
This will cause each member to register itself with the custom etcd discovery service and begin the cluster once all machines have been registered.
#### Public etcd discovery service
If no existing cluster is available, use the public discovery service hosted at `discovery.etcd.io`. To create a private discovery URL using the "new" endpoint, use the command:
```
$ curl https://discovery.etcd.io/new?size=3
https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
This will create the cluster with an initial size of 3 members. If no size is specified, a default of 3 is used.
```
ETCD_DISCOVERY=https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
```
--discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
**Each member must have a different name flag specified or else discovery will fail due to duplicated names. `Hostname` or `machine-id` can be a good choice.**
Now we start etcd with those relevant flags for each member:
```
$ etcd --name infra0 --initial-advertise-peer-urls http://10.0.1.10:2380 \
--listen-peer-urls http://10.0.1.10:2380 \
--listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.10:2379 \
--discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
```
$ etcd --name infra1 --initial-advertise-peer-urls http://10.0.1.11:2380 \
--listen-peer-urls http://10.0.1.11:2380 \
--listen-client-urls http://10.0.1.11:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.11:2379 \
--discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
```
$ etcd --name infra2 --initial-advertise-peer-urls http://10.0.1.12:2380 \
--listen-peer-urls http://10.0.1.12:2380 \
--listen-client-urls http://10.0.1.12:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.12:2379 \
--discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
This will cause each member to register itself with the discovery service and begin the cluster once all members have been registered.
Use the environment variable `ETCD_DISCOVERY_PROXY` to cause etcd to use an HTTP proxy to connect to the discovery service.
#### Error and warning cases
##### Discovery server errors
```
$ etcd --name infra0 --initial-advertise-peer-urls http://10.0.1.10:2380 \
--listen-peer-urls http://10.0.1.10:2380 \
--listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.10:2379 \
--discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
etcd: error: the cluster doesnt have a size configuration value in https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de/_config
exit 1
```
##### Warnings
This is a harmless warning indicating the discovery URL will be ignored on this machine.
```
$ etcd --name infra0 --initial-advertise-peer-urls http://10.0.1.10:2380 \
--listen-peer-urls http://10.0.1.10:2380 \
--listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.0.1.10:2379 \
--discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
etcdserver: discovery token ignored since a cluster has already been initialized. Valid log found at /var/lib/etcd
```
### DNS discovery
DNS [SRV records][rfc-srv] can be used as a discovery mechanism.
The `--discovery-srv` flag can be used to set the DNS domain name where the discovery SRV records can be found.
The following DNS SRV records are looked up in the listed order:
* _etcd-server-ssl._tcp.example.com
* _etcd-server._tcp.example.com
If `_etcd-server-ssl._tcp.example.com` is found then etcd will attempt the bootstrapping process over TLS.
To help clients discover the etcd cluster, the following DNS SRV records are looked up in the listed order:
* _etcd-client._tcp.example.com
* _etcd-client-ssl._tcp.example.com
If `_etcd-client-ssl._tcp.example.com` is found, clients will attempt to communicate with the etcd cluster over SSL/TLS.
If etcd is using TLS without a custom certificate authority, the discovery domain (e.g., example.com) must match the SRV record domain (e.g., infra1.example.com). This is to mitigate attacks that forge SRV records to point to a different domain; the domain would have a valid certificate under PKI but be controlled by an unknown third party.
#### Create DNS SRV records
```
$ dig +noall +answer SRV _etcd-server._tcp.example.com
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra0.example.com.
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra1.example.com.
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra2.example.com.
```
```
$ dig +noall +answer SRV _etcd-client._tcp.example.com
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra0.example.com.
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra1.example.com.
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra2.example.com.
```
```
$ dig +noall +answer infra0.example.com infra1.example.com infra2.example.com
infra0.example.com. 300 IN A 10.0.1.10
infra1.example.com. 300 IN A 10.0.1.11
infra2.example.com. 300 IN A 10.0.1.12
```
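The SRV records shown in the `dig` output follow a regular pattern, so they can be generated from the member host names. The sketch below prints records in that same layout; how they are loaded into a DNS server depends on the server and is not covered here. The function name `srv_records` is hypothetical, and the TTL of 300 is simply copied from the output above.

```shell
# Print SRV records in the same layout as the dig output above.
# Arguments: service label, domain, port, then one or more host names.
# srv_records is a hypothetical helper; the TTL of 300 is an assumption.
srv_records() {
  service="$1"; domain="$2"; port="$3"
  shift 3
  for host in "$@"; do
    printf '_%s._tcp.%s. 300 IN SRV 0 0 %s %s.%s.\n' \
      "$service" "$domain" "$port" "$host" "$domain"
  done
}

# Usage:
#   srv_records etcd-server example.com 2380 infra0 infra1 infra2
#   srv_records etcd-client example.com 2379 infra0 infra1 infra2
```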
#### Bootstrap the etcd cluster using DNS
etcd cluster members can listen on domain names or IP addresses; the bootstrap process will resolve DNS A records.
The resolved address in `--initial-advertise-peer-urls` *must match* one of the resolved addresses in the SRV targets. The etcd member reads the resolved address to find out if it belongs to the cluster defined in the SRV records.
```
$ etcd --name infra0 \
--discovery-srv example.com \
--initial-advertise-peer-urls http://infra0.example.com:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster-state new \
--advertise-client-urls http://infra0.example.com:2379 \
--listen-client-urls http://infra0.example.com:2379 \
--listen-peer-urls http://infra0.example.com:2380
```
```
$ etcd --name infra1 \
--discovery-srv example.com \
--initial-advertise-peer-urls http://infra1.example.com:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster-state new \
--advertise-client-urls http://infra1.example.com:2379 \
--listen-client-urls http://infra1.example.com:2379 \
--listen-peer-urls http://infra1.example.com:2380
```
```
$ etcd --name infra2 \
--discovery-srv example.com \
--initial-advertise-peer-urls http://infra2.example.com:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster-state new \
--advertise-client-urls http://infra2.example.com:2379 \
--listen-client-urls http://infra2.example.com:2379 \
--listen-peer-urls http://infra2.example.com:2380
```
The cluster can also bootstrap using IP addresses instead of domain names:
```
$ etcd --name infra0 \
--discovery-srv example.com \
--initial-advertise-peer-urls http://10.0.1.10:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster-state new \
--advertise-client-urls http://10.0.1.10:2379 \
--listen-client-urls http://10.0.1.10:2379 \
--listen-peer-urls http://10.0.1.10:2380
```
```
$ etcd --name infra1 \
--discovery-srv example.com \
--initial-advertise-peer-urls http://10.0.1.11:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster-state new \
--advertise-client-urls http://10.0.1.11:2379 \
--listen-client-urls http://10.0.1.11:2379 \
--listen-peer-urls http://10.0.1.11:2380
```
```
$ etcd --name infra2 \
--discovery-srv example.com \
--initial-advertise-peer-urls http://10.0.1.12:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster-state new \
--advertise-client-urls http://10.0.1.12:2379 \
--listen-client-urls http://10.0.1.12:2379 \
--listen-peer-urls http://10.0.1.12:2380
```
### Gateway
etcd gateway is a simple TCP proxy that forwards network data to the etcd cluster. Please read [gateway guide][gateway] for more information.
### Proxy
When the `--proxy` flag is set, etcd runs in [proxy mode][proxy]. This proxy mode only supports the etcd v2 API; there are no plans to support the v3 API. Instead, for v3 API support, there will be a new proxy with enhanced features following the etcd 3.0 release.
To set up an etcd cluster with v2 API proxies, please read the [clustering doc in etcd 2.3 release][clustering_etcd2].
[conf-adv-client]: configuration.md#--advertise-client-urls
[conf-listen-client]: configuration.md#--listen-client-urls
[discovery-proto]: ../dev-internal/discovery_protocol.md
[rfc-srv]: http://www.ietf.org/rfc/rfc2052.txt
[runtime-conf]: runtime-configuration.md
[runtime-reconf-design]: runtime-reconf-design.md
[proxy]: https://github.com/coreos/etcd/blob/release-2.3/Documentation/proxy.md
[clustering_etcd2]: https://github.com/coreos/etcd/blob/release-2.3/Documentation/clustering.md
[security-guide]: security.md
[tls-setup]: ../../hack/tls-setup
[gateway]: gateway.md

@ -0,0 +1,313 @@
# Configuration flags
etcd is configurable through command-line flags and environment variables. Options set on the command line take precedence over those from the environment.
The environment variable for flag `--my-flag` is `ETCD_MY_FLAG`. This format applies to all flags.
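The naming rule can be sketched in shell. The helper name `flag_to_env` is hypothetical and not part of etcd; it only illustrates the mapping from a flag name to its environment variable:

```shell
# flag_to_env is a hypothetical helper (not part of etcd): it derives the
# environment variable name for a flag by stripping the leading "--",
# upper-casing the rest, replacing "-" with "_", and prefixing "ETCD_".
flag_to_env() {
  printf 'ETCD_%s\n' "${1#--}" | tr 'a-z-' 'A-Z_'
}

flag_to_env --my-flag             # prints ETCD_MY_FLAG
flag_to_env --listen-client-urls  # prints ETCD_LISTEN_CLIENT_URLS
```

Because command-line options take precedence, a value passed as `--name` overrides whatever `ETCD_NAME` holds in the environment.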
The [official etcd ports][iana-ports] are 2379 for client requests and 2380 for peer communication. The etcd ports can be set to accept TLS traffic, non-TLS traffic, or both TLS and non-TLS traffic.
To start etcd automatically using custom settings at startup in Linux, using a [systemd][systemd-intro] unit is highly recommended.
## Member flags
### --name
+ Human-readable name for this member.
+ default: "default"
+ env variable: ETCD_NAME
+ This value is referenced as this node's own entries listed in the `--initial-cluster` flag (e.g., `default=http://localhost:2380`). This needs to match the key used in the flag if using [static bootstrapping][build-cluster]. When using discovery, each member must have a unique name. `Hostname` or `machine-id` can be a good choice.
### --data-dir
+ Path to the data directory.
+ default: "${name}.etcd"
+ env variable: ETCD_DATA_DIR
### --wal-dir
+ Path to the dedicated WAL directory. If this flag is set, etcd writes the WAL files to this directory rather than the data directory. This allows a dedicated disk to be used and helps avoid I/O contention between logging and other I/O operations.
+ default: ""
+ env variable: ETCD_WAL_DIR
### --snapshot-count
+ Number of committed transactions to trigger a snapshot to disk.
+ default: "100000"
+ env variable: ETCD_SNAPSHOT_COUNT
### --heartbeat-interval
+ Time (in milliseconds) of a heartbeat interval.
+ default: "100"
+ env variable: ETCD_HEARTBEAT_INTERVAL
### --election-timeout
+ Time (in milliseconds) for an election to timeout. See [Documentation/tuning.md][tuning] for details.
+ default: "1000"
+ env variable: ETCD_ELECTION_TIMEOUT
### --listen-peer-urls
+ List of URLs to listen on for peer traffic. This flag tells etcd to accept incoming requests from its peers on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. etcd will respond to requests from any of the listed addresses and ports.
+ default: "http://localhost:2380"
+ env variable: ETCD_LISTEN_PEER_URLS
+ example: "http://10.0.0.1:2380"
+ invalid example: "http://example.com:2380" (domain name is invalid for binding)
### --listen-client-urls
+ List of URLs to listen on for client traffic. This flag tells etcd to accept incoming requests from clients on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. etcd will respond to requests from any of the listed addresses and ports.
+ default: "http://localhost:2379"
+ env variable: ETCD_LISTEN_CLIENT_URLS
+ example: "http://10.0.0.1:2379"
+ invalid example: "http://example.com:2379" (domain name is invalid for binding)
### --max-snapshots
+ Maximum number of snapshot files to retain (0 is unlimited)
+ default: 5
+ env variable: ETCD_MAX_SNAPSHOTS
+ The default for users on Windows is unlimited, and manual purging down to 5 (or some preference for safety) is recommended.
### --max-wals
+ Maximum number of wal files to retain (0 is unlimited)
+ default: 5
+ env variable: ETCD_MAX_WALS
+ The default for users on Windows is unlimited, and manual purging down to 5 (or some preference for safety) is recommended.
### --cors
+ Comma-separated white list of origins for CORS (cross-origin resource sharing).
+ default: none
+ env variable: ETCD_CORS
## Clustering flags
`--initial` prefix flags are used in bootstrapping ([static bootstrap][build-cluster], [discovery-service bootstrap][discovery] or [runtime reconfiguration][reconfig]) a new member, and ignored when restarting an existing member.
`--discovery` prefix flags need to be set when using [discovery service][discovery].
### --initial-advertise-peer-urls
+ List of this member's peer URLs to advertise to the rest of the cluster. These addresses are used for communicating etcd data around the cluster. At least one must be routable to all cluster members. These URLs can contain domain names.
+ default: "http://localhost:2380"
+ env variable: ETCD_INITIAL_ADVERTISE_PEER_URLS
+ example: "http://example.com:2380, http://10.0.0.1:2380"
### --initial-cluster
+ Initial cluster configuration for bootstrapping.
+ default: "default=http://localhost:2380"
+ env variable: ETCD_INITIAL_CLUSTER
+ The key is the value of the `--name` flag for each node provided. The default uses `default` for the key because this is the default for the `--name` flag.
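As an illustration of the `name=URL` format, the following shell sketch assembles an `--initial-cluster` value for three members. The helper name is hypothetical, the member names and addresses are example values, and plain HTTP on the default peer port 2380 is assumed:

```shell
# build_initial_cluster is a hypothetical helper: it joins name=host pairs
# into the comma-separated name=URL list that --initial-cluster expects.
# Plain HTTP and the default peer port 2380 are assumed.
build_initial_cluster() {
  cluster=""
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first "="; must match that member's --name
    host="${pair#*=}"    # text after the first "="
    cluster="${cluster:+${cluster},}${name}=http://${host}:2380"
  done
  printf '%s\n' "$cluster"
}

build_initial_cluster infra0=10.0.1.10 infra1=10.0.1.11 infra2=10.0.1.12
# prints infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380
```

Every member is started with the same list; only `--name` and the member's own URLs differ.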
### --initial-cluster-state
+ Initial cluster state ("new" or "existing"). Set to `new` for all members present during initial static or DNS bootstrapping. If this option is set to `existing`, etcd will attempt to join the existing cluster. If the wrong value is set, etcd will attempt to start but fail safely.
+ default: "new"
+ env variable: ETCD_INITIAL_CLUSTER_STATE
[static bootstrap]: clustering.md#static
### --initial-cluster-token
+ Initial cluster token for the etcd cluster during bootstrap.
+ default: "etcd-cluster"
+ env variable: ETCD_INITIAL_CLUSTER_TOKEN
### --advertise-client-urls
+ List of this member's client URLs to advertise to the rest of the cluster. These URLs can contain domain names.
+ default: "http://localhost:2379"
+ env variable: ETCD_ADVERTISE_CLIENT_URLS
+ example: "http://example.com:2379, http://10.0.0.1:2379"
+ Be careful when advertising URLs such as http://localhost:2379 from a cluster member while using the proxy feature of etcd. This will cause loops, because the proxy will forward requests to itself until its resources (memory, file descriptors) are eventually depleted.
### --discovery
+ Discovery URL used to bootstrap the cluster.
+ default: none
+ env variable: ETCD_DISCOVERY
### --discovery-srv
+ DNS srv domain used to bootstrap the cluster.
+ default: none
+ env variable: ETCD_DISCOVERY_SRV
### --discovery-fallback
+ Expected behavior ("exit" or "proxy") when the discovery service fails. "proxy" supports v2 API only.
+ default: "proxy"
+ env variable: ETCD_DISCOVERY_FALLBACK
### --discovery-proxy
+ HTTP proxy to use for traffic to discovery service.
+ default: none
+ env variable: ETCD_DISCOVERY_PROXY
### --strict-reconfig-check
+ Reject reconfiguration requests that would cause quorum loss.
+ default: false
+ env variable: ETCD_STRICT_RECONFIG_CHECK
### --auto-compaction-retention
+ Auto compaction retention for the mvcc key value store, in hours. 0 disables auto compaction.
+ default: 0
+ env variable: ETCD_AUTO_COMPACTION_RETENTION
### --enable-v2
+ Accept etcd V2 client requests
+ default: true
+ env variable: ETCD_ENABLE_V2
## Proxy flags
`--proxy` prefix flags configure etcd to run in [proxy mode][proxy]. "proxy" supports v2 API only.
### --proxy
+ Proxy mode setting ("off", "readonly" or "on").
+ default: "off"
+ env variable: ETCD_PROXY
### --proxy-failure-wait
+ Time (in milliseconds) an endpoint will be held in a failed state before being reconsidered for proxied requests.
+ default: 5000
+ env variable: ETCD_PROXY_FAILURE_WAIT
### --proxy-refresh-interval
+ Time (in milliseconds) of the endpoints refresh interval.
+ default: 30000
+ env variable: ETCD_PROXY_REFRESH_INTERVAL
### --proxy-dial-timeout
+ Time (in milliseconds) for a dial to timeout or 0 to disable the timeout
+ default: 1000
+ env variable: ETCD_PROXY_DIAL_TIMEOUT
### --proxy-write-timeout
+ Time (in milliseconds) for a write to timeout or 0 to disable the timeout.
+ default: 5000
+ env variable: ETCD_PROXY_WRITE_TIMEOUT
### --proxy-read-timeout
+ Time (in milliseconds) for a read to timeout or 0 to disable the timeout.
+ Don't change this value if using watches, because watches use long polling requests.
+ default: 0
+ env variable: ETCD_PROXY_READ_TIMEOUT
## Security flags
The security flags help to [build a secure etcd cluster][security].
### --ca-file
**DEPRECATED**
+ Path to the client server TLS CA file. `--ca-file ca.crt` could be replaced by `--trusted-ca-file ca.crt --client-cert-auth` and etcd will perform the same.
+ default: none
+ env variable: ETCD_CA_FILE
### --cert-file
+ Path to the client server TLS cert file.
+ default: none
+ env variable: ETCD_CERT_FILE
### --key-file
+ Path to the client server TLS key file.
+ default: none
+ env variable: ETCD_KEY_FILE
### --client-cert-auth
+ Enable client cert authentication.
+ default: false
+ env variable: ETCD_CLIENT_CERT_AUTH
### --trusted-ca-file
+ Path to the client server TLS trusted CA cert file.
+ default: none
+ env variable: ETCD_TRUSTED_CA_FILE
### --auto-tls
+ Client TLS using generated certificates
+ default: false
+ env variable: ETCD_AUTO_TLS
### --peer-ca-file
**DEPRECATED**
+ Path to the peer server TLS CA file. `--peer-ca-file ca.crt` could be replaced by `--peer-trusted-ca-file ca.crt --peer-client-cert-auth` and etcd will perform the same.
+ default: none
+ env variable: ETCD_PEER_CA_FILE
### --peer-cert-file
+ Path to the peer server TLS cert file.
+ default: none
+ env variable: ETCD_PEER_CERT_FILE
### --peer-key-file
+ Path to the peer server TLS key file.
+ default: none
+ env variable: ETCD_PEER_KEY_FILE
### --peer-client-cert-auth
+ Enable peer client cert authentication.
+ default: false
+ env variable: ETCD_PEER_CLIENT_CERT_AUTH
### --peer-trusted-ca-file
+ Path to the peer server TLS trusted CA file.
+ default: none
+ env variable: ETCD_PEER_TRUSTED_CA_FILE
### --peer-auto-tls
+ Peer TLS using generated certificates
+ default: false
+ env variable: ETCD_PEER_AUTO_TLS
## Logging flags
### --debug
+ Drop the default log level to DEBUG for all subpackages.
+ default: false (INFO for all packages)
+ env variable: ETCD_DEBUG
### --log-package-levels
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ default: none (INFO for all packages)
+ env variable: ETCD_LOG_PACKAGE_LEVELS
## Unsafe flags
Please be CAUTIOUS when using unsafe flags because they break the guarantees given by the consensus protocol.
For example, it may panic if other members in the cluster are still alive.
Follow the instructions when using these flags.
### --force-new-cluster
+ Force to create a new one-member cluster. It commits configuration changes forcing to remove all existing members in the cluster and add itself. It needs to be set to [restore a backup][restore].
+ default: false
+ env variable: ETCD_FORCE_NEW_CLUSTER
## Miscellaneous flags
### --version
+ Print the version and exit.
+ default: false
### --config-file
+ Load server configuration from a file.
+ default: none
## Profiling flags
### --enable-pprof
+ Enable runtime profiling data via HTTP server. Address is at client URL + "/debug/pprof/"
+ default: false
### --metrics
+ Set level of detail for exported metrics, specify 'extensive' to include histogram metrics.
+ default: basic
## Auth flags
### --auth-token
+ Specify a token type and token specific options, especially for JWT. Its format is "type,var1=val1,var2=val2,...". Possible type is 'simple' or 'jwt'. Possible variables are 'sign-method' for specifying a sign method of jwt (its possible values are 'ES256', 'ES384', 'ES512', 'HS256', 'HS384', 'HS512', 'RS256', 'RS384', 'RS512', 'PS256', 'PS384', or 'PS512'), 'pub-key' for specifying a path to a public key for verifying jwt, and 'priv-key' for specifying a path to a private key for signing jwt.
+ Example option of JWT: '--auth-token jwt,pub-key=app.rsa.pub,priv-key=app.rsa,sign-method=RS512'
+ default: "simple"
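To make the `type,var1=val1,var2=val2` layout concrete, the shell sketch below splits an `--auth-token` value into its type and options. `split_auth_token` is a hypothetical helper for inspecting the string; it is not how etcd itself parses the flag:

```shell
# split_auth_token is a hypothetical helper for inspecting an --auth-token
# value; it is not how etcd itself parses the flag.
split_auth_token() {
  spec="$1"
  # Everything before the first comma is the token type.
  printf 'type=%s\n' "${spec%%,*}"
  # Any remaining comma-separated fields are printed one per line.
  case "$spec" in
    *,*) printf '%s\n' "${spec#*,}" | tr ',' '\n' ;;
  esac
}

split_auth_token 'jwt,pub-key=app.rsa.pub,priv-key=app.rsa,sign-method=RS512'
```

A bare `simple` value has no options, so only the `type=simple` line is printed for it.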
[build-cluster]: clustering.md#static
[reconfig]: runtime-configuration.md
[discovery]: clustering.md#discovery
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
[proxy]: ../v2/proxy.md
[restore]: ../v2/admin_guide.md#restoring-a-backup
[security]: security.md
[systemd-intro]: http://freedesktop.org/wiki/Software/systemd/
[tuning]: ../tuning.md#time-parameters


@ -0,0 +1,196 @@
# Run etcd clusters inside containers
The following guide shows how to run etcd with rkt and Docker using the [static bootstrap process](clustering.md#static).
## rkt
### Running a single node etcd
The following rkt run command will expose the etcd client API on port 2379 and expose the peer API on port 2380.
Use the host IP address when configuring etcd.
```
export NODE1=192.168.1.21
```
Trust the CoreOS [App Signing Key](https://coreos.com/security/app-signing-key/).
```
sudo rkt trust --prefix coreos.com/etcd
# gpg key fingerprint is: 18AD 5014 C99E F7E3 BA5F 6CE9 50BD D3E0 FC8A 365E
```
Run the `v3.1.2` version of etcd or specify another release version.
```
sudo rkt run --net=default:IP=${NODE1} coreos.com/etcd:v3.1.2 -- -name=node1 -advertise-client-urls=http://${NODE1}:2379 -initial-advertise-peer-urls=http://${NODE1}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE1}:2380 -initial-cluster=node1=http://${NODE1}:2380
```
List the cluster member.
```
etcdctl --endpoints=http://192.168.1.21:2379 member list
```
### Running a 3 node etcd cluster
Set up a 3 node cluster with rkt locally, using the `-initial-cluster` flag.
```sh
export NODE1=172.16.28.21
export NODE2=172.16.28.22
export NODE3=172.16.28.23
```
```
# node 1
sudo rkt run --net=default:IP=${NODE1} coreos.com/etcd:v3.1.2 -- -name=node1 -advertise-client-urls=http://${NODE1}:2379 -initial-advertise-peer-urls=http://${NODE1}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE1}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
# node 2
sudo rkt run --net=default:IP=${NODE2} coreos.com/etcd:v3.1.2 -- -name=node2 -advertise-client-urls=http://${NODE2}:2379 -initial-advertise-peer-urls=http://${NODE2}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE2}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
# node 3
sudo rkt run --net=default:IP=${NODE3} coreos.com/etcd:v3.1.2 -- -name=node3 -advertise-client-urls=http://${NODE3}:2379 -initial-advertise-peer-urls=http://${NODE3}:2380 -listen-client-urls=http://0.0.0.0:2379 -listen-peer-urls=http://${NODE3}:2380 -initial-cluster=node1=http://${NODE1}:2380,node2=http://${NODE2}:2380,node3=http://${NODE3}:2380
```
Verify the cluster is healthy and can be reached.
```
ETCDCTL_API=3 etcdctl --endpoints=http://172.16.28.21:2379,http://172.16.28.22:2379,http://172.16.28.23:2379 endpoint health
```
### DNS
Production clusters which refer to peers by DNS name known to the local resolver must mount the [host's DNS configuration](https://coreos.com/kubernetes/docs/latest/kubelet-wrapper.html#customizing-rkt-options).
## Docker
In order to expose the etcd API to clients outside of Docker host, use the host IP address of the container. Please see [`docker inspect`](https://docs.docker.com/engine/reference/commandline/inspect) for more detail on how to get the IP address. Alternatively, specify `--net=host` flag to `docker run` command to skip placing the container inside of a separate network stack.
### Running a single node etcd
Use the host IP address when configuring etcd:
```
export NODE1=192.168.1.21
export DATA_DIR=/var/lib/etcd  # host directory mounted by --volume below (example path)
```
Run the latest version of etcd:
```
docker run \
-p 2379:2379 \
-p 2380:2380 \
--volume=${DATA_DIR}:/etcd-data \
--name etcd quay.io/coreos/etcd:latest \
/usr/local/bin/etcd \
--data-dir=/etcd-data --name node1 \
--initial-advertise-peer-urls http://${NODE1}:2380 --listen-peer-urls http://${NODE1}:2380 \
--advertise-client-urls http://${NODE1}:2379 --listen-client-urls http://${NODE1}:2379 \
--initial-cluster node1=http://${NODE1}:2380
```
List the cluster member:
```
etcdctl --endpoints=http://${NODE1}:2379 member list
```
### Running a 3 node etcd cluster
```
# For each machine
ETCD_VERSION=latest
TOKEN=my-etcd-token
CLUSTER_STATE=new
NAME_1=etcd-node-0
NAME_2=etcd-node-1
NAME_3=etcd-node-2
HOST_1=10.20.30.1
HOST_2=10.20.30.2
HOST_3=10.20.30.3
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
DATA_DIR=/var/lib/etcd
# For node 1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
docker run \
-p 2379:2379 \
-p 2380:2380 \
--volume=${DATA_DIR}:/etcd-data \
--name etcd quay.io/coreos/etcd:${ETCD_VERSION} \
/usr/local/bin/etcd \
--data-dir=/etcd-data --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
# For node 2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
docker run \
-p 2379:2379 \
-p 2380:2380 \
--volume=${DATA_DIR}:/etcd-data \
--name etcd quay.io/coreos/etcd:${ETCD_VERSION} \
/usr/local/bin/etcd \
--data-dir=/etcd-data --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
# For node 3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
docker run \
-p 2379:2379 \
-p 2380:2380 \
--volume=${DATA_DIR}:/etcd-data \
--name etcd quay.io/coreos/etcd:${ETCD_VERSION} \
/usr/local/bin/etcd \
--data-dir=/etcd-data --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
```
To run `etcdctl` using API version 3:
```
docker exec etcd /bin/sh -c "export ETCDCTL_API=3 && /usr/local/bin/etcdctl put foo bar"
```
## Bare Metal
To provision a 3 node etcd cluster on bare-metal, the examples in the [baremetal repo](https://github.com/coreos/coreos-baremetal/tree/master/examples) may be useful.
## Mounting a certificate volume
The etcd release container does not include default root certificates. To use HTTPS with certificates trusted by a root authority (e.g., for discovery), mount a certificate directory into the etcd container:
```
rkt run \
--volume etcd-ssl-certs-bundle,kind=host,source=/etc/ssl/certs/ca-certificates.crt \
--mount volume=etcd-ssl-certs-bundle,target=/etc/ssl/certs/ca-certificates.crt \
quay.io/coreos/etcd:latest -- --name my-name \
--initial-advertise-peer-urls http://localhost:2380 --listen-peer-urls http://localhost:2380 \
--advertise-client-urls http://localhost:2379 --listen-client-urls http://localhost:2379 \
--discovery https://discovery.etcd.io/c11fbcdc16972e45253491a24fcf45e1
```
```
docker run \
-p 2379:2379 \
-p 2380:2380 \
--volume=/etc/ssl/certs/ca-certificates.crt:/etc/ssl/certs/ca-certificates.crt \
quay.io/coreos/etcd:latest \
/usr/local/bin/etcd --name my-name \
--initial-advertise-peer-urls http://localhost:2380 --listen-peer-urls http://localhost:2380 \
--advertise-client-urls http://localhost:2379 --listen-client-urls http://localhost:2379 \
--discovery https://discovery.etcd.io/86a9ff6c8cb8b4c4544c1a2f88f8b801
```

Binary file not shown.



@ -0,0 +1,206 @@
# general cluster availability
# alert if another failed member will result in an unavailable cluster
ALERT InsufficientMembers
IF count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2 - 1)
FOR 3m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "etcd cluster insufficient members",
description = "If one more etcd member goes down the cluster will be unavailable",
}
# etcd leader alerts
# ==================
# alert if any etcd instance has no leader
ALERT NoLeader
IF etcd_server_has_leader{job="etcd"} == 0
FOR 1m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "etcd member has no leader",
description = "etcd member {{ $labels.instance }} has no leader",
}
# alert if there are lots of leader changes
ALERT HighNumberOfLeaderChanges
IF increase(etcd_server_leader_changes_seen_total{job="etcd"}[1h]) > 3
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "a high number of leader changes within the etcd cluster are happening",
description = "etcd instance {{ $labels.instance }} has seen {{ $value }} leader changes within the last hour",
}
# gRPC request alerts
# ===================
# alert if more than 1% of gRPC method calls have failed within the last 5 minutes
ALERT HighNumberOfFailedGRPCRequests
IF sum by(grpc_method) (rate(etcd_grpc_requests_failed_total{job="etcd"}[5m]))
/ sum by(grpc_method) (rate(etcd_grpc_total{job="etcd"}[5m])) > 0.01
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "a high number of gRPC requests are failing",
description = "{{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}",
}
# alert if more than 5% of gRPC method calls have failed within the last 5 minutes
ALERT HighNumberOfFailedGRPCRequests
IF sum by(grpc_method) (rate(etcd_grpc_requests_failed_total{job="etcd"}[5m]))
/ sum by(grpc_method) (rate(etcd_grpc_total{job="etcd"}[5m])) > 0.05
FOR 5m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "a high number of gRPC requests are failing",
description = "{{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}",
}
# alert if the 99th percentile of gRPC method calls take more than 150ms
ALERT GRPCRequestsSlow
IF histogram_quantile(0.99, rate(etcd_grpc_unary_requests_duration_seconds_bucket[5m])) > 0.15
FOR 10m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "slow gRPC requests",
description = "on etcd instance {{ $labels.instance }} gRPC requests to {{ $labels.grpc_method }} are slow",
}
# HTTP requests alerts
# ====================
# alert if more than 1% of requests to an HTTP endpoint have failed within the last 5 minutes
ALERT HighNumberOfFailedHTTPRequests
IF sum by(method) (rate(etcd_http_failed_total{job="etcd"}[5m]))
/ sum by(method) (rate(etcd_http_received_total{job="etcd"}[5m])) > 0.01
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "a high number of HTTP requests are failing",
description = "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}",
}
# alert if more than 5% of requests to an HTTP endpoint have failed within the last 5 minutes
ALERT HighNumberOfFailedHTTPRequests
IF sum by(method) (rate(etcd_http_failed_total{job="etcd"}[5m]))
/ sum by(method) (rate(etcd_http_received_total{job="etcd"}[5m])) > 0.05
FOR 5m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "a high number of HTTP requests are failing",
description = "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}",
}
# alert if the 99th percentile of HTTP requests take more than 150ms
ALERT HTTPRequestsSlow
IF histogram_quantile(0.99, rate(etcd_http_successful_duration_seconds_bucket[5m])) > 0.15
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "slow HTTP requests",
description = "on etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow",
}
# file descriptor alerts
# ======================
instance:fd_utilization = process_open_fds / process_max_fds
# alert if file descriptors are likely to exhaust within the next 4 hours
ALERT FdExhaustionClose
IF predict_linear(instance:fd_utilization[1h], 3600 * 4) > 1
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "file descriptors soon exhausted",
description = "{{ $labels.job }} instance {{ $labels.instance }} will exhaust its file descriptors soon",
}
# alert if file descriptors are likely to exhaust within the next hour
ALERT FdExhaustionClose
IF predict_linear(instance:fd_utilization[10m], 3600) > 1
FOR 10m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "file descriptors soon exhausted",
description = "{{ $labels.job }} instance {{ $labels.instance }} will exhaust its file descriptors soon",
}
# etcd member communication alerts
# ================================
# alert if the 99th percentile of round trips takes more than 150ms
ALERT EtcdMemberCommunicationSlow
IF histogram_quantile(0.99, rate(etcd_network_member_round_trip_time_seconds_bucket[5m])) > 0.15
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "etcd member communication is slow",
description = "etcd instance {{ $labels.instance }} member communication with {{ $labels.To }} is slow",
}
# etcd proposal alerts
# ====================
# alert if there are several failed proposals within an hour
ALERT HighNumberOfFailedProposals
IF increase(etcd_server_proposals_failed_total{job="etcd"}[1h]) > 5
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "a high number of proposals within the etcd cluster are failing",
description = "etcd instance {{ $labels.instance }} has seen {{ $value }} proposal failures within the last hour",
}
# etcd disk io latency alerts
# ===========================
# alert if 99th percentile of fsync durations is higher than 500ms
ALERT HighFsyncDurations
IF histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) > 0.5
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "high fsync durations",
description = "etcd instance {{ $labels.instance }} fsync durations are high",
}
# alert if 99th percentile of commit durations is higher than 250ms
ALERT HighCommitDurations
IF histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) > 0.25
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "high commit durations",
description = "etcd instance {{ $labels.instance }} commit durations are high",
}


@ -0,0 +1,44 @@
# Understand failures
Failures are common in a large deployment of machines. A machine fails when its hardware or software malfunctions. Multiple machines fail together when there are power failures or network issues. Multiple kinds of failures can also happen at once; it is almost impossible to enumerate all possible failure cases.
In this section, we catalog kinds of failures and discuss how etcd is designed to tolerate these failures. Most users, if not all, can map a particular failure into one kind of failure. To prepare for rare or [unrecoverable failures][unrecoverable], always [back up][backup] the etcd cluster.
## Minor followers failure
When fewer than half of the followers fail, the etcd cluster can still accept requests and make progress without any major disruption. For example, two follower failures will not affect a five-member etcd cluster's operation. However, clients will lose connectivity to the failed members. Client libraries should hide these interruptions from users for read requests by automatically reconnecting to other members. Operators should expect the system load on the other members to increase due to the reconnections.
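The tolerance follows from majority arithmetic: a cluster of N members needs N/2 + 1 members (integer division) available to make progress, so it tolerates the failure of the rest. A small shell sketch, with hypothetical function names, computes both values:

```shell
# quorum N: smallest number of members that forms a majority of N.
quorum() { echo $(( $1 / 2 + 1 )); }

# tolerance N: how many members can fail while a majority remains.
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerance "$n")"
done
# the line for a five-member cluster reads: members=5 quorum=3 tolerated_failures=2
```

An even-sized cluster tolerates no more failures than the next smaller odd size; for example, four members tolerate one failure, the same as three.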
## Leader failure
When a leader fails, the etcd cluster automatically elects a new leader. The election does not happen instantly once the leader fails. It takes about an election timeout to elect a new leader since the failure detection model is timeout based.
During the leader election the cluster cannot process any writes. Write requests sent during the election are queued for processing until a new leader is elected.
Writes already sent to the old leader but not yet committed may be lost. The new leader has the power to rewrite any uncommitted entries from the previous leader. From the user perspective, some write requests might time out after a new leader election. However, no committed writes are ever lost.
The new leader extends timeouts automatically for all leases. This mechanism ensures a lease will not expire before the granted TTL even if it was granted by the old leader.
## Majority failure
When a majority of the cluster's members fail, the etcd cluster fails and cannot accept more writes.
The etcd cluster can only recover from a majority failure once the majority of members become available. If a majority of members cannot come back online, then the operator must start [disaster recovery][unrecoverable] to recover the cluster.
Once a majority of members works, the etcd cluster elects a new leader automatically and returns to a healthy state. The new leader extends timeouts automatically for all leases. This mechanism ensures no lease expires due to server side unavailability.
## Network partition
A network partition is similar to a minor followers failure or a leader failure. A network partition divides the etcd cluster into two parts; one with a member majority and the other with a member minority. The majority side becomes the available cluster and the minority side is unavailable; there is no “split-brain” in etcd.
If the leader is on the majority side, then from the majority point of view the failure is a minority follower failure. If the leader is on the minority side, then it is a leader failure. The leader on the minority side steps down and the majority side elects a new leader.
Once the network partition clears, the minority side automatically recognizes the leader from the majority side and recovers its state.
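Which side of a partition stays available can be checked with the same majority rule. The sketch below is illustrative only; the function name and the five-member 3/2 split are assumptions chosen for the example:

```shell
# side_is_available TOTAL SIDE succeeds when a partition side holding SIDE
# of TOTAL members still has a majority.
side_is_available() {
  [ "$2" -ge $(( $1 / 2 + 1 )) ]
}

# Hypothetical five-member cluster split 3/2 by a partition.
for side in 3 2; do
  if side_is_available 5 "$side"; then
    echo "side with $side of 5 members: available"
  else
    echo "side with $side of 5 members: unavailable"
  fi
done
```

With an even split, such as 2/2 in a four-member cluster, neither side holds a majority, so neither side can accept writes until the partition clears.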
## Failure during bootstrapping
A cluster bootstrap is only successful if all required members successfully start. If any failure happens during bootstrapping, remove the data directories on all members and re-bootstrap the cluster with a new cluster-token or new discovery token.
Of course, it is possible to recover a failed bootstrapped cluster like recovering a running cluster. However, it almost always takes more time and resources to recover that cluster than bootstrapping a new one, since there is no data to recover.
[backup]: maintenance.md#snapshot-backup
[unrecoverable]: recovery.md#disaster-recovery


@ -0,0 +1,105 @@
# etcd gateway
## What is etcd gateway
etcd gateway is a simple TCP proxy that forwards network data to the etcd cluster. The gateway is stateless and transparent; it neither inspects client requests nor interferes with cluster responses.
The gateway supports multiple etcd server endpoints and works on a simple round-robin policy. It only routes to available endpoints and hides failures from its clients. Other retry policies, such as weighted round-robin, may be supported in the future.
## When to use etcd gateway
Every application that accesses etcd must first have the address of an etcd cluster client endpoint. If multiple applications on the same server access the same etcd cluster, every application still needs to know the advertised client endpoints of the etcd cluster. If the etcd cluster is reconfigured to have different endpoints, every application may also need to update its endpoint list. This wide-scale reconfiguration is both tedious and error prone.
etcd gateway solves this problem by serving as a stable local endpoint. A typical etcd gateway configuration has each machine running a gateway listening on a local address and every etcd application connecting to its local gateway. The upshot is only the gateway needs to update its endpoints instead of updating each and every application.
In summary, to automatically propagate cluster endpoint changes, the etcd gateway runs on every machine serving multiple applications accessing the same etcd cluster.
## When not to use etcd gateway
- Improving performance
The gateway is not designed for improving etcd cluster performance. It does not provide caching, watch coalescing or batching. The etcd team is developing a caching proxy designed for improving cluster scalability.
- Running on a cluster management system
Advanced cluster management systems like Kubernetes natively support service discovery. Applications can access an etcd cluster with a DNS name or a virtual IP address managed by the system. For example, kube-proxy is equivalent to etcd gateway.
## Start etcd gateway
Consider an etcd cluster with the following static endpoints:
|Name|Address|Hostname|
|------|---------|------------------|
|infra0|10.0.1.10|infra0.example.com|
|infra1|10.0.1.11|infra1.example.com|
|infra2|10.0.1.12|infra2.example.com|
Start the etcd gateway to use these static endpoints with the command:
```bash
$ etcd gateway start --endpoints=infra0.example.com,infra1.example.com,infra2.example.com
2016-08-16 11:21:18.867350 I | tcpproxy: ready to proxy client requests to [...]
```
Alternatively, if using DNS for service discovery, consider the DNS SRV entries:
```bash
$ dig +noall +answer SRV _etcd-client._tcp.example.com
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra0.example.com.
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra1.example.com.
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra2.example.com.
```
```bash
$ dig +noall +answer infra0.example.com infra1.example.com infra2.example.com
infra0.example.com. 300 IN A 10.0.1.10
infra1.example.com. 300 IN A 10.0.1.11
infra2.example.com. 300 IN A 10.0.1.12
```
Start the etcd gateway to fetch the endpoints from the DNS SRV entries with the command:
```bash
$ etcd gateway start --discovery-srv=example.com
2016-08-16 11:21:18.867350 I | tcpproxy: ready to proxy client requests to [...]
```
## Configuration flags
### etcd cluster
#### --endpoints
* Comma-separated list of etcd server targets for forwarding client connections.
* Default: `127.0.0.1:2379`
* Invalid example: `https://127.0.0.1:2379` (gateway does not terminate TLS)
#### --discovery-srv
* DNS domain used to bootstrap cluster endpoints through SRV records.
* Default: (not set)
### Network
#### --listen-addr
* Interface and port to bind for accepting client requests.
* Default: `127.0.0.1:23790`
#### --retry-delay
* Duration of delay before retrying to connect to failed endpoints.
* Default: 1m0s
* Invalid example: "123" (expects time unit in format)
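The default `1m0s` looks like a Go duration string, so the following sketch assumes `--retry-delay` values are parsed the way `time.ParseDuration` parses them. Under that assumption, a bare number such as `123` is rejected because it lacks a unit. The helper name `validRetryDelay` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// validRetryDelay reports whether s parses as a Go duration string.
// Assumption: the flag accepts the same syntax as time.ParseDuration.
func validRetryDelay(s string) bool {
	_, err := time.ParseDuration(s)
	return err == nil
}

func main() {
	for _, s := range []string{"1m0s", "30s", "123"} {
		fmt.Printf("%q valid: %v\n", s, validRetryDelay(s))
	}
}
```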
### Security
#### --insecure-discovery
* Accept SRV records that are insecure or susceptible to man-in-the-middle attacks.
* Default: `false`
#### --trusted-ca-file
* Path to the client TLS CA file for the etcd cluster. Used to authenticate endpoints.
* Default: (not set)



@ -0,0 +1,193 @@
# gRPC proxy
The gRPC proxy is a stateless etcd reverse proxy operating at the gRPC layer (L7). The proxy is designed to reduce the total processing load on the core etcd cluster. For horizontal scalability, it coalesces watch and lease API requests. To protect the cluster against abusive clients, it caches key range requests.
The gRPC proxy supports multiple etcd server endpoints. When the proxy starts, it randomly picks one etcd server endpoint to use. This endpoint serves all requests until the proxy detects an endpoint failure. If the gRPC proxy detects an endpoint failure, it switches to a different endpoint, if available, to hide failures from its clients. Other retry policies, such as weighted round-robin, may be supported in the future.
## Scalable watch API
The gRPC proxy coalesces multiple client watchers (`c-watchers`) on the same key or range into a single watcher (`s-watcher`) connected to an etcd server. The proxy broadcasts all events from the `s-watcher` to its `c-watchers`.
Assuming N clients watch the same key, one gRPC proxy can reduce the watch load on the etcd server from N to 1. Users can deploy multiple gRPC proxies to further distribute server load.
In the following example, three clients watch on key A. The gRPC proxy coalesces the three watchers, creating a single watcher attached to the etcd server.
```
+-------------+
| etcd server |
+------+------+
^ watch key A (s-watcher)
|
+-------+-----+
| gRPC proxy | <-------+
| | |
++-----+------+ |watch key A (c-watcher)
watch key A ^ ^ watch key A |
(c-watcher) | | (c-watcher) |
+-------+-+ ++--------+ +----+----+
| client | | client | | client |
| | | | | |
+---------+ +---------+ +---------+
```
### Limitations
To effectively coalesce multiple client watchers into a single watcher, the gRPC proxy coalesces new `c-watchers` into an existing `s-watcher` when possible. This coalesced `s-watcher` may be out of sync with the etcd server due to network delays or buffered undelivered events. When the watch revision is unspecified, the gRPC proxy does not guarantee that the `c-watcher` starts watching from the most recent store revision. For example, if a client watches from an etcd server at revision 1000, that watcher begins at revision 1000. If a client watches through the gRPC proxy, it may begin watching from revision 990.
Similar limitations apply to cancellation. When the watcher is cancelled, the etcd server's revision may be greater than the cancellation response revision.
These two limitations should not cause problems for most use cases. In the future, there may be additional options to force the watcher to bypass the gRPC proxy for more accurate revision responses.
## Scalable lease API
To keep its leases alive, a client must establish at least one gRPC stream to an etcd server for sending periodic heartbeats. If an etcd workload involves heavy lease activity spread over many clients, these streams may contribute to excessive CPU utilization. To reduce the total number of streams on the core cluster, the proxy supports lease stream coalescing.
Assuming N clients are updating leases, a single gRPC proxy reduces the stream load on the etcd server from N to 1. Deployments may have additional gRPC proxies to further distribute streams across multiple proxies.
In the following example, three clients update three independent leases (`L1`, `L2`, and `L3`). The gRPC proxy coalesces the three client lease streams (`c-streams`) into a single lease keep alive stream (`s-stream`) attached to an etcd server. The proxy forwards client-side lease heartbeats from the c-streams to the s-stream, then returns the responses to the corresponding c-streams.
```
+-------------+
| etcd server |
+------+------+
^
| heartbeat L1, L2, L3
| (s-stream)
v
+-------+-----+
| gRPC proxy +<-----------+
+---+------+--+ | heartbeat L3
^ ^ | (c-stream)
heartbeat L1 | | heartbeat L2 |
(c-stream) v v (c-stream) v
+------+-+ +-+------+ +-----+--+
| client | | client | | client |
+--------+ +--------+ +--------+
```
## Abusive clients protection
The gRPC proxy caches responses for requests when it does not break consistency requirements. This can protect the etcd server from abusive clients in tight for loops.
## Start etcd gRPC proxy
Consider an etcd cluster with the following static endpoints:
|Name|Address|Hostname|
|------|---------|------------------|
|infra0|10.0.1.10|infra0.example.com|
|infra1|10.0.1.11|infra1.example.com|
|infra2|10.0.1.12|infra2.example.com|
Start the etcd gRPC proxy to use these static endpoints with the command:
```bash
$ etcd grpc-proxy start --endpoints=infra0.example.com,infra1.example.com,infra2.example.com --listen-addr=127.0.0.1:2379
```
The etcd gRPC proxy starts and listens on port 2379. It forwards client requests to one of the three endpoints provided above.
Sending requests through the proxy:
```bash
$ ETCDCTL_API=3 ./etcdctl --endpoints=127.0.0.1:2379 put foo bar
OK
$ ETCDCTL_API=3 ./etcdctl --endpoints=127.0.0.1:2379 get foo
foo
bar
```
## Client endpoint synchronization and name resolution
The proxy supports registering its endpoints for discovery by writing to a user-defined endpoint. This serves two purposes. First, it allows clients to synchronize their endpoints against a set of proxy endpoints for high availability. Second, it is an endpoint provider for etcd [gRPC naming](../dev-guide/grpc_naming.md).
Register proxy(s) by providing a user-defined prefix:
```bash
$ etcd grpc-proxy start --endpoints=localhost:2379 \
--listen-addr=127.0.0.1:23790 \
--advertise-client-url=127.0.0.1:23790 \
--resolver-prefix="___grpc_proxy_endpoint" \
--resolver-ttl=60
$ etcd grpc-proxy start --endpoints=localhost:2379 \
--listen-addr=127.0.0.1:23791 \
--advertise-client-url=127.0.0.1:23791 \
--resolver-prefix="___grpc_proxy_endpoint" \
--resolver-ttl=60
```
The proxy lists all proxies registered under the prefix when asked for the member list:
```bash
ETCDCTL_API=3 ./bin/etcdctl --endpoints=http://localhost:23790 member list --write-out table
+----+---------+--------------------------------+------------+-----------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+----+---------+--------------------------------+------------+-----------------+
| 0 | started | Gyu-Hos-MBP.sfo.coreos.systems | | 127.0.0.1:23791 |
| 0 | started | Gyu-Hos-MBP.sfo.coreos.systems | | 127.0.0.1:23790 |
+----+---------+--------------------------------+------------+-----------------+
```
This lets clients automatically discover proxy endpoints through Sync:
```go
cli, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://localhost:23790"},
})
if err != nil {
log.Fatal(err)
}
defer cli.Close()
// fetch registered grpc-proxy endpoints
if err := cli.Sync(context.Background()); err != nil {
log.Fatal(err)
}
```
Note that if a proxy is configured without a resolver prefix,
```bash
$ etcd grpc-proxy start --endpoints=localhost:2379 \
--listen-addr=127.0.0.1:23792 \
--advertise-client-url=127.0.0.1:23792
```
the member list API to the grpc-proxy returns its own `advertise-client-url`:
```bash
ETCDCTL_API=3 ./bin/etcdctl --endpoints=http://localhost:23792 member list --write-out table
+----+---------+--------------------------------+------------+-----------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+----+---------+--------------------------------+------------+-----------------+
| 0 | started | Gyu-Hos-MBP.sfo.coreos.systems | | 127.0.0.1:23792 |
+----+---------+--------------------------------+------------+-----------------+
```
## Namespacing
Suppose an application expects full control over the entire key space, but the etcd cluster is shared with other applications. To let all applications run without interfering with each other, the proxy can partition the etcd keyspace so clients appear to have access to the complete keyspace. When the proxy is given the flag `--namespace`, all client requests going into the proxy are translated to have a user-defined prefix on the keys. Accesses to the etcd cluster will be under the prefix and responses from the proxy will strip away the prefix; to the client, it appears as if there is no prefix at all.
To namespace a proxy, start it with `--namespace`:
```bash
$ etcd grpc-proxy start --endpoints=localhost:2379 \
--listen-addr=127.0.0.1:23790 \
--namespace=my-prefix/
```
Accesses to the proxy are now transparently prefixed on the etcd cluster:
```bash
$ ETCDCTL_API=3 ./bin/etcdctl --endpoints=localhost:23790 put my-key abc
# OK
$ ETCDCTL_API=3 ./bin/etcdctl --endpoints=localhost:23790 get my-key
# my-key
# abc
$ ETCDCTL_API=3 ./bin/etcdctl --endpoints=localhost:2379 get my-prefix/my-key
# my-prefix/my-key
# abc
```
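The key translation shown above amounts to adding the prefix on the way in and stripping it on the way out. The Go sketch below models only that string handling; the `namespace` type and its method names are hypothetical, and the real proxy also has to translate ranges, watches, and transactions.

```go
package main

import (
	"fmt"
	"strings"
)

// namespace models the proxy's key translation (hypothetical type).
type namespace struct{ prefix string }

// toServer adds the prefix to a key in a client request.
func (n namespace) toServer(key string) string { return n.prefix + key }

// toClient strips the prefix from a key in a server response.
func (n namespace) toClient(key string) string { return strings.TrimPrefix(key, n.prefix) }

func main() {
	ns := namespace{prefix: "my-prefix/"}
	stored := ns.toServer("my-key")
	fmt.Println("stored on cluster as:", stored)
	fmt.Println("returned to client as:", ns.toClient(stored))
}
```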


@ -0,0 +1,93 @@
# Hardware recommendations
etcd usually runs well with limited resources for development or testing purposes; it's common to develop with etcd on a laptop or a cheap cloud machine. However, when running etcd clusters in production, some hardware guidelines are useful for proper administration. These suggestions are not hard rules; they serve as a good starting point for a robust production deployment. As always, deployments should be tested with simulated workloads before running in production.
## CPUs
Few etcd deployments require a lot of CPU capacity. Typical clusters need two to four cores to run smoothly.
Heavily loaded etcd deployments, serving thousands of clients or tens of thousands of requests per second, tend to be CPU bound since etcd can serve requests from memory. Such heavy deployments usually need eight to sixteen dedicated cores.
## Memory
etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 8GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly.
## Disks
Fast disks are the most critical factor for etcd deployment performance and stability.
A slow disk will increase etcd request latency and potentially hurt cluster stability. Since etcd's consensus protocol depends on persistently storing metadata to a log, a majority of etcd cluster members must write every request down to disk. Additionally, etcd will also incrementally checkpoint its state to disk so it can truncate this log. If these writes take too long, heartbeats may time out and trigger an election, undermining the stability of the cluster.
etcd is very sensitive to disk write latency. Typically 50 sequential IOPS (e.g., a 7200 RPM disk) is required. For heavily loaded clusters, 500 sequential IOPS (e.g., a typical local SSD or a high performance virtualized block device) is recommended. Note that most cloud providers publish concurrent IOPS rather than sequential IOPS; the published concurrent IOPS can be 10x greater than the sequential IOPS. To measure actual sequential IOPS, we suggest using a disk benchmarking tool such as [diskbench][diskbench] or [fio][fio].
etcd requires only modest disk bandwidth but more disk bandwidth buys faster recovery times when a failed member has to catch up with the cluster. Typically 10MB/s will recover 100MB data within 15 seconds. For large clusters, 100MB/s or higher is suggested for recovering 1GB data within 15 seconds.
When possible, back etcd's storage with an SSD. An SSD usually provides lower write latencies with less variance than a spinning disk, thus improving the stability and reliability of etcd. If using spinning disks, get the fastest disks possible (15,000 RPM). Using RAID 0 is also an effective way to increase disk speed, for both spinning disks and SSDs. With at least three cluster members, mirroring and/or parity variants of RAID are unnecessary; etcd's consistent replication already provides high availability.
## Network
Multi-member etcd deployments benefit from a fast and reliable network. Because etcd is both consistent and partition tolerant, an unreliable network with partitioning outages leads to poor availability. Low latency ensures etcd members can communicate quickly. High bandwidth can reduce the time to recover a failed etcd member. 1GbE is sufficient for common etcd deployments. For large etcd clusters, a 10GbE network will reduce mean time to recovery.
Deploy etcd members within a single data center when possible to avoid latency overheads and lessen the possibility of partitioning events. If a failure domain in another data center is required, choose a data center closer to the existing one. Please also read the [tuning][tuning] documentation for more information on cross data center deployment.
## Example hardware configurations
Here are a few example hardware setups on AWS and GCE environments. As mentioned before, but must be stressed regardless, administrators should test an etcd deployment with a simulated workload before putting it into production.
Note that these configurations assume these machines are totally dedicated to etcd. Running other applications along with etcd on these machines may cause resource contentions and lead to cluster instability.
### Small cluster
A small cluster serves fewer than 100 clients, fewer than 200 requests per second, and stores no more than 100MB of data.
Example application workload: A 50-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|--------|------|----------------|
| AWS | m4.large | 2 | 8 | 3600 | 56.25 |
| GCE | n1-standard-2 + 50GB PD SSD | 2 | 7.5 | 1500 | 25 |
### Medium cluster
A medium cluster serves fewer than 500 clients, fewer than 1,000 requests per second, and stores no more than 500MB of data.
Example application workload: A 250-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|--------|------|----------------|
| AWS | m4.xlarge | 4 | 16 | 6000 | 93.75 |
| GCE | n1-standard-4 + 150GB PD SSD | 4 | 15 | 4500 | 75 |
### Large cluster
A large cluster serves fewer than 1,500 clients, fewer than 10,000 requests per second, and stores no more than 1GB of data.
Example application workload: A 1,000-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|--------|------|----------------|
| AWS | m4.2xlarge | 8 | 32 | 8000 | 125 |
| GCE | n1-standard-8 + 250GB PD SSD | 8 | 30 | 7500 | 125 |
### xLarge cluster
An xLarge cluster serves more than 1,500 clients, more than 10,000 requests per second, and stores more than 1GB of data.
Example application workload: A 3,000-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|--------|------|----------------|
| AWS | m4.4xlarge | 16 | 64 | 16,000 | 250 |
| GCE | n1-standard-16 + 500GB PD SSD | 16 | 60 | 15,000 | 250 |
[diskbench]: https://github.com/ongardie/diskbenchmark
[fio]: https://github.com/axboe/fio
[tuning]: ../tuning.md


@ -0,0 +1,114 @@
# Maintenance
## Overview
An etcd cluster needs periodic maintenance to remain reliable. Depending on an etcd application's needs, this maintenance can usually be automated and performed without downtime or significantly degraded performance.
All etcd maintenance manages storage resources consumed by the etcd keyspace. Failure to adequately control the keyspace size is guarded by storage space quotas; if an etcd member runs low on space, a quota will trigger cluster-wide alarms which will put the system into a limited-operation maintenance mode. To avoid running out of space for writes to the keyspace, the etcd keyspace history must be compacted. Storage space itself may be reclaimed by defragmenting etcd members. Finally, periodic snapshot backups of etcd member state make it possible to recover from any unintended logical data loss or corruption caused by operational error.
## History compaction
Since etcd keeps an exact history of its keyspace, this history should be periodically compacted to avoid performance degradation and eventual storage space exhaustion. Compacting the keyspace history drops all information about keys superseded prior to a given keyspace revision. The space used by these keys then becomes available for additional writes to the keyspace.
The keyspace can be compacted automatically with `etcd`'s time windowed history retention policy, or manually with `etcdctl`. The `etcdctl` method provides fine-grained control over the compacting process whereas automatic compacting fits applications that only need key history for some length of time.
`etcd` can be set to automatically compact the keyspace with the `--auto-compaction-retention` option, which takes a retention period in hours:
```sh
# keep one hour of history
$ etcd --auto-compaction-retention=1
```
An `etcdctl` initiated compaction works as follows:
```sh
# compact up to revision 3
$ etcdctl compact 3
```
Revisions prior to the compaction revision become inaccessible:
```sh
$ etcdctl get --rev=2 somekey
Error: rpc error: code = 11 desc = etcdserver: mvcc: required revision has been compacted
```
## Defragmentation
After compacting the keyspace, the backend database may exhibit internal fragmentation. Any internal fragmentation is space that is free to use by the backend but still consumes storage space. The process of defragmentation releases this storage space back to the file system. Defragmentation is issued on a per-member basis so that cluster-wide latency spikes may be avoided.
Compacting old revisions internally fragments `etcd` by leaving gaps in backend database. Fragmented space is available for use by `etcd` but unavailable to the host filesystem.
To defragment an etcd member, use the `etcdctl defrag` command:
```sh
$ etcdctl defrag
Finished defragmenting etcd member[127.0.0.1:2379]
```
## Space quota
The space quota in `etcd` ensures the cluster operates in a reliable fashion. Without a space quota, `etcd` may suffer from poor performance if the keyspace grows excessively large, or it may simply run out of storage space, leading to unpredictable cluster behavior. If the keyspace's backend database for any member exceeds the space quota, `etcd` raises a cluster-wide alarm that puts the cluster into a maintenance mode which only accepts key reads and deletes. The cluster can resume normal operation only after enough space is freed in the keyspace, the backend database is defragmented, and the space quota alarm is cleared.
By default, `etcd` sets a conservative space quota suitable for most applications, but it may be configured on the command line, in bytes:
```sh
# set a very small 16MB quota
$ etcd --quota-backend-bytes=$((16*1024*1024))
```
The space quota can be triggered with a loop:
```sh
# fill keyspace
$ while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024 | ETCDCTL_API=3 etcdctl put key || break; done
...
Error: rpc error: code = 8 desc = etcdserver: mvcc: database space exceeded
# confirm quota space is exceeded
$ ETCDCTL_API=3 etcdctl --write-out=table endpoint status
+----------------+------------------+-----------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------+------------------+-----------+---------+-----------+-----------+------------+
| 127.0.0.1:2379 | bf9071f4639c75cc | 2.3.0+git | 18 MB | true | 2 | 3332 |
+----------------+------------------+-----------+---------+-----------+-----------+------------+
# confirm alarm is raised
$ ETCDCTL_API=3 etcdctl alarm list
memberID:13803658152347727308 alarm:NOSPACE
```
Removing excessive keyspace data and defragmenting the backend database will put the cluster back within the quota limits:
```sh
# get current revision
$ rev=$(ETCDCTL_API=3 etcdctl --endpoints=:2379 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*')
# compact away all old revisions
$ ETCDCTL_API=3 etcdctl compact $rev
compacted revision 1516
# defragment away excessive space
$ ETCDCTL_API=3 etcdctl defrag
Finished defragmenting etcd member[127.0.0.1:2379]
# disarm alarm
$ ETCDCTL_API=3 etcdctl alarm disarm
memberID:13803658152347727308 alarm:NOSPACE
# test puts are allowed again
$ ETCDCTL_API=3 etcdctl put newkey 123
OK
```
## Snapshot backup
Snapshotting the `etcd` cluster on a regular basis serves as a durable backup for an etcd keyspace. By taking periodic snapshots of an etcd member's backend database, an `etcd` cluster can be recovered to a point in time with a known good state.
A snapshot is taken with `etcdctl`:
```sh
$ etcdctl snapshot save backup.db
$ etcdctl --write-out=table snapshot status backup.db
+----------+----------+------------+------------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| fe01cf57 | 10 | 7 | 2.1 MB |
+----------+----------+------------+------------+
```


@ -0,0 +1,88 @@
# Monitoring etcd
Each etcd server exports metrics under the `/metrics` path on its client port.
The metrics can be fetched with `curl`:
```sh
$ curl -L http://localhost:2379/metrics
# HELP etcd_debugging_mvcc_keys_total Total number of keys.
# TYPE etcd_debugging_mvcc_keys_total gauge
etcd_debugging_mvcc_keys_total 0
# HELP etcd_debugging_mvcc_pending_events_total Total number of pending events to be sent.
# TYPE etcd_debugging_mvcc_pending_events_total gauge
etcd_debugging_mvcc_pending_events_total 0
...
```
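The output above is in the Prometheus text format, which is line oriented and easy to scan by hand. As a minimal sketch, the hypothetical helper `metricValue` below extracts the value of an unlabeled metric from such output; it ignores labels, timestamps, and other details of the format, so a real client library is preferable for anything beyond a quick check.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// metricValue scans Prometheus text-format output for an unlabeled metric
// and returns its value. Comment lines (# HELP, # TYPE) never match a name.
func metricValue(text, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == name {
			v, err := strconv.ParseFloat(fields[1], 64)
			return v, err == nil
		}
	}
	return 0, false
}

func main() {
	// Sample copied from the curl output above.
	sample := `# HELP etcd_debugging_mvcc_keys_total Total number of keys.
# TYPE etcd_debugging_mvcc_keys_total gauge
etcd_debugging_mvcc_keys_total 0
`
	v, ok := metricValue(sample, "etcd_debugging_mvcc_keys_total")
	fmt.Println(v, ok)
}
```

The same function can be applied to the body of an HTTP GET against the `/metrics` path of a running member.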
## Prometheus
Running a [Prometheus][prometheus] monitoring service is the easiest way to ingest and record etcd's metrics.
First, install Prometheus:
```sh
PROMETHEUS_VERSION="1.3.1"
wget https://github.com/prometheus/prometheus/releases/download/v$PROMETHEUS_VERSION/prometheus-$PROMETHEUS_VERSION.linux-amd64.tar.gz -O /tmp/prometheus-$PROMETHEUS_VERSION.linux-amd64.tar.gz
tar -xvzf /tmp/prometheus-$PROMETHEUS_VERSION.linux-amd64.tar.gz --directory /tmp/ --strip-components=1
/tmp/prometheus -version
```
Set Prometheus's scraper to target the etcd cluster endpoints:
```sh
cat > /tmp/test-etcd.yaml <<EOF
global:
scrape_interval: 10s
scrape_configs:
- job_name: test-etcd
static_configs:
- targets: ['10.240.0.32:2379','10.240.0.33:2379','10.240.0.34:2379']
EOF
cat /tmp/test-etcd.yaml
```
Set up the Prometheus handler:
```sh
nohup /tmp/prometheus \
-config.file /tmp/test-etcd.yaml \
-web.listen-address ":9090" \
-storage.local.path "test-etcd.data" >> /tmp/test-etcd.log 2>&1 &
```
Now Prometheus will scrape etcd metrics every 10 seconds.
## Alerting
There is a [set of default alerts for etcd v3 clusters](./etcd3_alert.rules).
> Note: `job` labels may need to be adjusted to fit a particular need. The rules were written to apply to a single cluster so it is recommended to choose labels unique to a cluster.
## Grafana
[Grafana][grafana] has built-in Prometheus support; just add a Prometheus data source:
```
Name: test-etcd
Type: Prometheus
Url: http://localhost:9090
Access: proxy
```
Then import the default [etcd dashboard template][template] and customize. For instance, if Prometheus data source name is `my-etcd`, the `datasource` field values in JSON also need to be `my-etcd`.
See the [demo][demo].
Sample dashboard:
![](./etcd-sample-grafana.png)
[prometheus]: https://prometheus.io/
[grafana]: http://grafana.org/
[template]: ./grafana.json
[demo]: http://dash.etcd.io/dashboard/db/test-etcd


@ -0,0 +1,70 @@
# Performance
## Understanding performance
etcd provides stable, sustained high performance. Two factors define performance: latency and throughput. Latency is the time taken to complete an operation. Throughput is the total operations completed within some time period. Usually average latency increases as the overall throughput increases when etcd accepts concurrent client requests. In common cloud environments, like a standard `n-4` on Google Compute Engine (GCE) or a comparable machine type on AWS, a three member etcd cluster finishes a request in less than one millisecond under light load, and can complete more than 30,000 requests per second under heavy load.
etcd uses the Raft consensus algorithm to replicate requests among members and reach agreement. Consensus performance, especially commit latency, is limited by two physical constraints: network IO latency and disk IO latency. The minimum time to finish an etcd request is the network Round Trip Time (RTT) between members, plus the time `fdatasync` requires to commit the data to permanent storage. The RTT within a datacenter may be as long as several hundred microseconds. A typical RTT within the United States is around 50ms, and can be as slow as 400ms between continents. The typical fdatasync latency for a spinning disk is about 10ms. For SSDs, the latency is often lower than 1ms. To increase throughput, etcd batches multiple requests together and submits them to Raft. This batching policy lets etcd attain high throughput despite heavy load.
There are other sub-systems which impact the overall performance of etcd. Each serialized etcd request must run through etcd's boltdb-backed MVCC storage engine, which usually takes tens of microseconds to finish. Periodically etcd incrementally snapshots its recently applied requests, merging them back with the previous on-disk snapshot. This process may lead to a latency spike. Although this is usually not a problem on SSDs, it may double the observed latency on HDD. Likewise, inflight compactions can impact etcd's performance. Fortunately, the impact is often insignificant since the compaction is staggered so it does not compete for resources with regular requests. The RPC system, gRPC, gives etcd a well-defined, extensible API, but it also introduces additional latency, especially for local reads.
## Benchmarks
Benchmarking etcd performance can be done with the [benchmark](https://github.com/coreos/etcd/tree/master/tools/benchmark) CLI tool included with etcd.
For some baseline performance numbers, we consider a three member etcd cluster with the following hardware configuration:
- Google Cloud Compute Engine
- 3 machines of 8 vCPUs + 16GB Memory + 50GB SSD
- 1 machine(client) of 16 vCPUs + 30GB Memory + 50GB SSD
- Ubuntu 17.04
- etcd 3.2.0, go 1.8.3
With this configuration, etcd can approximately write:
| Number of keys | Key size in bytes | Value size in bytes | Number of connections | Number of clients | Target etcd server | Average write QPS | Average latency per request | Average server RSS |
|---------------:|------------------:|--------------------:|----------------------:|------------------:|--------------------|------------------:|----------------------------:|-------------------:|
| 10,000 | 8 | 256 | 1 | 1 | leader only | 583 | 1.6ms | 48 MB |
| 100,000 | 8 | 256 | 100 | 1000 | leader only | 44,341 | 22ms | 124 MB |
| 100,000 | 8 | 256 | 100 | 1000 | all members | 50,104 | 20ms | 126 MB |
Sample commands are:
```sh
# write to leader
benchmark --endpoints=${HOST_1} --target-leader --conns=1 --clients=1 \
put --key-size=8 --sequential-keys --total=10000 --val-size=256
benchmark --endpoints=${HOST_1} --target-leader --conns=100 --clients=1000 \
put --key-size=8 --sequential-keys --total=100000 --val-size=256
# write to all members
benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=100 --clients=1000 \
put --key-size=8 --sequential-keys --total=100000 --val-size=256
```
Linearizable read requests go through a quorum of cluster members for consensus to fetch the most recent data. Serializable read requests are cheaper than linearizable reads since they are served by any single etcd member, instead of a quorum of members, in exchange for possibly serving stale data. etcd can read:
| Number of requests | Key size in bytes | Value size in bytes | Number of connections | Number of clients | Consistency | Average read QPS | Average latency per request |
|-------------------:|------------------:|--------------------:|----------------------:|------------------:|-------------|-----------------:|----------------------------:|
| 10,000 | 8 | 256 | 1 | 1 | Linearizable | 1,353 | 0.7ms |
| 10,000 | 8 | 256 | 1 | 1 | Serializable | 2,909 | 0.3ms |
| 100,000 | 8 | 256 | 100 | 1000 | Linearizable | 141,578 | 5.5ms |
| 100,000 | 8 | 256 | 100 | 1000 | Serializable | 185,758 | 2.2ms |
Sample commands are:
```sh
# Single connection read requests
benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=1 --clients=1 \
range YOUR_KEY --consistency=l --total=10000
benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=1 --clients=1 \
range YOUR_KEY --consistency=s --total=10000
# Many concurrent read requests
benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=100 --clients=1000 \
range YOUR_KEY --consistency=l --total=100000
benchmark --endpoints=${HOST_1},${HOST_2},${HOST_3} --conns=100 --clients=1000 \
range YOUR_KEY --consistency=s --total=100000
```
We encourage running the benchmark test when setting up an etcd cluster for the first time in a new environment to ensure the cluster achieves adequate performance; cluster latency and throughput can be sensitive to minor environment differences.


@ -0,0 +1,63 @@
## Disaster recovery
etcd is designed to withstand machine failures. An etcd cluster automatically recovers from temporary failures (e.g., machine reboots) and tolerates up to *(N-1)/2* permanent failures for a cluster of N members. When a member permanently fails, whether due to hardware failure or disk corruption, it loses access to the cluster. If the cluster permanently loses more than *(N-1)/2* members then it disastrously fails, irrevocably losing quorum. Once quorum is lost, the cluster cannot reach consensus and therefore cannot continue accepting updates.
To recover from disastrous failure, etcd v3 provides snapshot and restore facilities to recreate the cluster without v3 key data loss. To recover v2 keys, refer to the [v2 admin guide][v2_recover].
[v2_recover]: ../v2/admin_guide.md#disaster-recovery
### Snapshotting the keyspace
Recovering a cluster first needs a snapshot of the keyspace from an etcd member. A snapshot may either be taken from a live member with the `etcdctl snapshot save` command or by copying the `member/snap/db` file from an etcd data directory. For example, the following command snapshots the keyspace served by `$ENDPOINT` to the file `snapshot.db`:
```sh
$ ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save snapshot.db
```
### Restoring a cluster
To restore a cluster, all that is needed is a single snapshot "db" file. A cluster restore with `etcdctl snapshot restore` creates new etcd data directories; all members should restore using the same snapshot. Restoring overwrites some snapshot metadata (specifically, the member ID and cluster ID); the member loses its former identity. This metadata overwrite prevents the new member from inadvertently joining an existing cluster. Therefore, in order to start a cluster from a snapshot, the restore must start a new logical cluster.
Snapshot integrity may be optionally verified at restore time. If the snapshot is taken with `etcdctl snapshot save`, it will have an integrity hash that is checked by `etcdctl snapshot restore`. If the snapshot is copied from the data directory, there is no integrity hash and it will only restore by using `--skip-hash-check`.
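As a rough sketch of that rule, the helper below prints (but does not run) the restore command for a snapshot depending on how it was obtained. The function name `restore_cmd` and the labels `saved` and `copied` are made up for this example.

```sh
# restore_cmd SNAPSHOT ORIGIN: print the restore command for a snapshot file.
# ORIGIN is "saved" (taken with `etcdctl snapshot save`) or "copied"
# (copied from member/snap/db); both labels are hypothetical.
restore_cmd() {
  snapshot=$1
  origin=$2
  cmd="ETCDCTL_API=3 etcdctl snapshot restore $snapshot"
  if [ "$origin" = "copied" ]; then
    # A copied db file carries no integrity hash, so the check must be skipped.
    cmd="$cmd --skip-hash-check"
  fi
  echo "$cmd"
}

restore_cmd snapshot.db saved
restore_cmd db copied
```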
A restore initializes a new member of a new cluster, with a fresh cluster configuration using `etcd`'s cluster configuration flags, but preserves the contents of the etcd keyspace. Continuing from the previous example, the following creates new etcd data directories (`m1.etcd`, `m2.etcd`, `m3.etcd`) for a three member cluster:
```sh
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
--name m1 \
--initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-advertise-peer-urls http://host1:2380
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
--name m2 \
--initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-advertise-peer-urls http://host2:2380
$ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
--name m3 \
--initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-advertise-peer-urls http://host3:2380
```
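The three restore commands above differ only in the member name and the advertised peer URL. A minimal sketch that generates them in a loop is shown here; it only prints the commands, and the function name `print_restore_cmds` is hypothetical.

```sh
# Member names and hosts mirror the example above.
INITIAL_CLUSTER="m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380"

# print_restore_cmds: print one restore command per member; nothing is executed.
print_restore_cmds() {
  i=1
  while [ "$i" -le 3 ]; do
    echo "ETCDCTL_API=3 etcdctl snapshot restore snapshot.db" \
      "--name m$i" \
      "--initial-cluster $INITIAL_CLUSTER" \
      "--initial-cluster-token etcd-cluster-1" \
      "--initial-advertise-peer-urls http://host$i:2380"
    i=$((i + 1))
  done
}

print_restore_cmds
```

Every member receives the same `--initial-cluster` and `--initial-cluster-token` values, which is what makes the restored members form one new logical cluster.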
Next, start `etcd` with the new data directories:
```sh
$ etcd \
--name m1 \
--listen-client-urls http://host1:2379 \
--advertise-client-urls http://host1:2379 \
--listen-peer-urls http://host1:2380 &
$ etcd \
--name m2 \
--listen-client-urls http://host2:2379 \
--advertise-client-urls http://host2:2379 \
--listen-peer-urls http://host2:2380 &
$ etcd \
--name m3 \
--listen-client-urls http://host3:2379 \
--advertise-client-urls http://host3:2379 \
--listen-peer-urls http://host3:2380 &
```
Now the restored etcd cluster should be available and serving the keyspace given by the snapshot.


@ -0,0 +1,173 @@
# Runtime reconfiguration
etcd comes with support for incremental runtime reconfiguration, which allows users to update the membership of the cluster at run time.
Reconfiguration requests can only be processed when a majority of cluster members are functioning. It is **highly recommended** to always have a cluster size greater than two in production. It is unsafe to remove a member from a two member cluster. The majority of a two member cluster is also two. If there is a failure during the removal process, the cluster might not be able to make progress and may need to [restart from majority failure][majority failure].
To better understand the design behind runtime reconfiguration, please read [the runtime reconfiguration document][runtime-reconf].
## Reconfiguration use cases
This section will walk through some common reasons for reconfiguring a cluster. Most of these reasons just involve combinations of adding or removing a member, which are explained below under [Cluster Reconfiguration Operations][cluster-reconf].
### Cycle or upgrade multiple machines
If multiple cluster members need to move due to planned maintenance (hardware upgrades, network downtime, etc.), it is recommended to modify members one at a time.
It is safe to remove the leader; however, there is a brief period of downtime while the election process takes place. If the cluster holds more than 50MB of v2 data, it is recommended to [migrate the member's data directory][member migration].
### Change the cluster size
Increasing the cluster size can enhance [failure tolerance][fault tolerance table] and provide better read performance. Since clients can read from any member, increasing the number of members increases the overall serialized read throughput.
Decreasing the cluster size can improve the write performance of a cluster, with a trade-off of decreased resilience. Writes into the cluster are replicated to a majority of members of the cluster before being considered committed. Decreasing the cluster size lowers the majority, so each write is committed more quickly.
### Replace a failed machine
If a machine fails due to hardware failure, data directory corruption, or some other fatal situation, it should be replaced as soon as possible. Machines that have failed but haven't been removed adversely affect the quorum and reduce the tolerance for an additional failure.
To replace the machine, follow the instructions for [removing the member][remove member] from the cluster, and then [add a new member][add member] in its place. If the cluster holds more than 50MB, it is recommended to [migrate the failed member's data directory][member migration] if it is still accessible.
### Restart cluster from majority failure
If the majority of the cluster is lost or all of the nodes have changed IP addresses, then manual action is necessary to recover safely. The basic steps in the recovery process include [creating a new cluster using the old data][disaster recovery], forcing a single member to act as the leader, and finally using runtime configuration to [add new members][add member] to this new cluster one at a time.
## Cluster reconfiguration operations
With these use cases in mind, the involved operations can be described for each.
Before making any change, a simple majority (quorum) of etcd members must be available. This is essentially the same requirement for any kind of write to etcd.
All changes to the cluster must be done sequentially:
* To update a single member's peerURLs, issue an update operation
* To replace a healthy single member, add a new member then remove the old member
* To increase from 3 to 5 members, issue two add operations
* To decrease from 5 to 3, issue two remove operations
All of these examples use the `etcdctl` command line tool that ships with etcd. To change membership without `etcdctl`, use the [v2 HTTP members API][member-api] or the [v3 gRPC members API][member-api-grpc].
### Update a member
#### Update advertise client URLs
To update the advertise client URLs of a member, simply restart that member with the updated client URLs flag (`--advertise-client-urls`) or environment variable (`ETCD_ADVERTISE_CLIENT_URLS`). The restarted member will publish the updated URLs itself. A wrongly updated client URL will not affect the health of the etcd cluster.
#### Update advertise peer URLs
To update the advertise peer URLs of a member, first update them explicitly via the member command and then restart the member. The additional action is required because updating peer URLs changes the cluster-wide configuration and can affect the health of the etcd cluster.
To update the peer URLs, first find the target member's ID. To list all members with `etcdctl`:
```sh
$ etcdctl member list
6e3bd23ae5f1eae0: name=node2 peerURLs=http://localhost:23802 clientURLs=http://127.0.0.1:23792
924e2e83e93f2560: name=node3 peerURLs=http://localhost:23803 clientURLs=http://127.0.0.1:23793
a8266ecf031671f3: name=node1 peerURLs=http://localhost:23801 clientURLs=http://127.0.0.1:23791
```
This example uses `update` on the member with ID a8266ecf031671f3 to change its peerURLs value to `http://10.0.1.10:2380`:
```sh
$ etcdctl member update a8266ecf031671f3 http://10.0.1.10:2380
Updated member with ID a8266ecf031671f3 in cluster
```
### Remove a member
Suppose the member ID to remove is a8266ecf031671f3. Use the `remove` command to perform the removal:
```sh
$ etcdctl member remove a8266ecf031671f3
Removed member a8266ecf031671f3 from cluster
```
The target member will stop itself at this point and print out the removal in the log:
```
etcd: this member has been permanently removed from the cluster. Exiting.
```
It is safe to remove the leader; however, the cluster will be inactive while a new leader is elected. This duration is normally the period of election timeout plus the voting process.
### Add a new member
Adding a member is a two step process:
* Add the new member to the cluster via the [HTTP members API][member-api], the [gRPC members API][member-api-grpc], or the `etcdctl member add` command.
* Start the new member with the new cluster configuration, including a list of the updated members (existing members + the new member).
`etcdctl` adds a new member to the cluster by specifying the member's [name][conf-name] and [advertised peer URLs][conf-adv-peer]:
```sh
$ etcdctl member add infra3 http://10.0.1.13:2380
added member 9bf1b35fc7761a23 to cluster
ETCD_NAME="infra3"
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE=existing
```
`etcdctl` has informed the cluster about the new member and printed out the environment variables needed to successfully start it. Now start the new etcd process with the relevant flags for the new member:
```sh
$ export ETCD_NAME="infra3"
$ export ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
$ export ETCD_INITIAL_CLUSTER_STATE=existing
$ etcd --listen-client-urls http://10.0.1.13:2379 --advertise-client-urls http://10.0.1.13:2379 --listen-peer-urls http://10.0.1.13:2380 --initial-advertise-peer-urls http://10.0.1.13:2380 --data-dir %data_dir%
```
The new member will run as a part of the cluster and immediately begin catching up with the rest of the cluster.
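Rather than retyping the variables, the `ETCD_*` lines printed by `etcdctl member add` can be captured and exported. The sketch below assumes the output was saved to a file; the file names and the temporary directory are hypothetical, and the sample output is copied from the example above so the snippet runs without a cluster.

```sh
workdir=$(mktemp -d)

# Sample output copied from the `etcdctl member add` example above. In practice
# it could be captured by redirecting the command's output to this file.
cat > "$workdir/member-add.out" <<'EOF'
added member 9bf1b35fc7761a23 to cluster
ETCD_NAME="infra3"
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE=existing
EOF

# Keep only the variable assignments and turn each into an export statement.
grep '^ETCD_' "$workdir/member-add.out" | sed 's/^/export /' > "$workdir/member-add.env"

# Load the variables into the current shell.
. "$workdir/member-add.env"

echo "$ETCD_NAME $ETCD_INITIAL_CLUSTER_STATE"
```

Only lines beginning with `ETCD_` are sourced, so the human-readable "added member" line is ignored.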
If adding multiple members, the best practice is to configure a single member at a time and verify it starts correctly before adding more new members. If adding a new member to a 1-node cluster, the cluster cannot make progress before the new member starts because it needs two members as a majority to reach consensus. This behavior only happens between the time `etcdctl member add` informs the cluster about the new member and the time the new member successfully establishes a connection to the existing one.
#### Error cases when adding members
In the following case a new host is not included in the list of enumerated nodes. If this is a new cluster, the node must be added to the list of initial cluster members.
```sh
$ etcd --name infra3 \
--initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
--initial-cluster-state existing
etcdserver: assign ids error: the member count is unequal
exit 1
```
In this case, the member is started with a different address (10.0.1.14:2380) from the one used to join the cluster (10.0.1.13:2380):
```sh
$ etcd --name infra4 \
--initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra4=http://10.0.1.14:2380 \
--initial-cluster-state existing
etcdserver: assign ids error: unmatched member while checking PeerURLs
exit 1
```
If etcd starts using the data directory of a removed member, etcd automatically exits if it connects to any active member in the cluster:
```sh
$ etcd
etcd: this member has been permanently removed from the cluster. Exiting.
exit 1
```
### Strict reconfiguration check mode (`--strict-reconfig-check`)
As described above, the best practice for adding new members is to configure a single member at a time and verify it starts correctly before adding more new members. This step-by-step approach is very important because if newly added members are not configured correctly (for example, the peer URLs are incorrect), the cluster can lose quorum. The quorum loss happens because a newly added member is counted in the quorum even if that member is not reachable from the other existing members. Quorum loss might also happen if there is a connectivity issue or there are operational issues.
To avoid this problem, etcd provides the option `--strict-reconfig-check`. If this option is passed to etcd, etcd rejects reconfiguration requests if the number of started members would be less than a quorum of the reconfigured cluster.
It is enabled by default.
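The rule can be illustrated with a small calculation. This is only a sketch of the check as stated in the text, not etcd's actual implementation, which may treat some cases specially; `reconfig_allowed` is a hypothetical name.

```sh
# reconfig_allowed STARTED NEW_SIZE: succeed if STARTED members would still
# form a quorum of a cluster with NEW_SIZE members.
reconfig_allowed() {
  started=$1
  new_size=$2
  quorum=$(( new_size / 2 + 1 ))
  [ "$started" -ge "$quorum" ]
}

# Adding a member to a 3-member cluster in which only 2 members are started:
# the new size is 4 and its quorum is 3, so the request would be rejected.
if reconfig_allowed 2 4; then echo "accept"; else echo "reject"; fi

# With all 3 members started, the same request would be accepted.
if reconfig_allowed 3 4; then echo "accept"; else echo "reject"; fi
```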
[add member]: #add-a-new-member
[cluster-reconf]: #cluster-reconfiguration-operations
[conf-adv-peer]: configuration.md#-initial-advertise-peer-urls
[conf-name]: configuration.md#-name
[disaster recovery]: recovery.md
[fault tolerance table]: ../v2/admin_guide.md#fault-tolerance-table
[majority failure]: #restart-cluster-from-majority-failure
[member-api]: ../v2/members_api.md
[member-api-grpc]: ../dev-guide/api_reference_v3.md#service-cluster-etcdserveretcdserverpbrpcproto
[member migration]: ../v2/admin_guide.md#member-migration
[remove member]: #remove-a-member
[runtime-reconf]: runtime-reconf-design.md


@ -0,0 +1,50 @@
# Design of runtime reconfiguration
Runtime reconfiguration is one of the hardest and most error prone features in a distributed system, especially in a consensus based system like etcd.
Read on to learn about the design of etcd's runtime reconfiguration commands and how we tackled these problems.
## Two phase config changes keep the cluster safe
In etcd, every runtime reconfiguration has to go through [two phases][add-member] for safety reasons. For example, to add a member, first inform the cluster of the new configuration and then start the new member.
Phase 1 - Inform cluster of new configuration
To add a member to an etcd cluster, make an API call to request that a new member be added to the cluster. This is the only way to add a new member to an existing cluster. The API call returns when the cluster agrees on the configuration change.
Phase 2 - Start new member
To join the etcd member into the existing cluster, specify the correct `initial-cluster` and set `initial-cluster-state` to `existing`. When the member starts, it will contact the existing cluster first and verify the current cluster configuration matches the expected one specified in `initial-cluster`. When the new member successfully starts, the cluster has reached the expected configuration.
By splitting the process into two discrete phases, users are forced to be explicit regarding cluster membership changes. This actually gives users more flexibility and makes things easier to reason about. For example, if there is an attempt to add a new member with the same ID as an existing member in an etcd cluster, the action will fail immediately during phase one without impacting the running cluster. Similar protection is provided to prevent adding new members by mistake. If a new etcd member attempts to join the cluster before the cluster has accepted the configuration change, it will not be accepted by the cluster.
Without the explicit workflow around cluster membership etcd would be vulnerable to unexpected cluster membership changes. For example, if etcd is running under an init system such as systemd, etcd would be restarted after being removed via the membership API, and attempt to rejoin the cluster on startup. This cycle would continue every time a member is removed via the API and systemd is set to restart etcd after failing, which is unexpected.
We expect runtime reconfiguration to be an infrequent operation. We decided to keep it explicit and user-driven to ensure configuration safety and keep the cluster always running smoothly under explicit control.
## Permanent loss of quorum requires new cluster
If a cluster permanently loses a majority of its members, a new cluster will need to be started from an old data directory to recover the previous state.
It is entirely possible to force removing the failed members from the existing cluster to recover. However, we decided not to support this method since it bypasses the normal consensus committing phase, which is unsafe. If the member to remove is not actually dead, or is force removed through different members in the same cluster, etcd will end up with a diverged cluster with the same clusterID. This is very dangerous and hard to debug or fix afterwards.
With a correct deployment, the possibility of permanent majority loss is very low. But it is a severe enough problem that it is worth special care. We strongly suggest reading the [disaster recovery documentation][disaster-recovery] and preparing for permanent majority loss before putting etcd into production.
## Do not use public discovery service for runtime reconfiguration
The public discovery service should only be used for bootstrapping a cluster. To join a member to an existing cluster, use the runtime reconfiguration API.
The discovery service is designed for bootstrapping an etcd cluster in a cloud environment, when the IP addresses of all the members are not known beforehand. After successfully bootstrapping a cluster, the IP addresses of all the members are known. Technically, the discovery service should no longer be needed.
Using the public discovery service may seem like a convenient way to do runtime reconfiguration; after all, the discovery service already has all the cluster configuration information. However, relying on the public discovery service brings problems:
1. It introduces external dependencies for the entire life-cycle of the cluster, not just bootstrap time. If there is a network issue between the cluster and the public discovery service, the cluster will suffer from it.
2. The public discovery service must reflect the correct runtime configuration of the cluster during its life-cycle. It has to provide security mechanisms to avoid bad actions, which is hard.
3. The public discovery service has to keep tens of thousands of cluster configurations. Our public discovery service backend is not ready for that workload.
To have a discovery service that supports runtime reconfiguration, the best choice is to build a private one.
[add-member]: runtime-configuration.md#add-a-new-member
[disaster-recovery]: recovery.md


@ -0,0 +1,225 @@
# Security model
etcd supports automatic TLS as well as authentication through client certificates for both client-to-server and peer (server-to-server / cluster) communication.
To get up and running, first have a CA certificate and a signed key pair for one member. It is recommended to create and sign a new key pair for every member in a cluster.
For convenience, the [cfssl] tool provides an easy interface to certificate generation, and we provide an example using the tool [here][tls-setup]. Alternatively, try this [guide to generating self-signed key pairs][tls-guide].
## Basic setup
etcd takes several certificate related configuration options, either through command-line flags or environment variables:
**Client-to-server communication:**
`--cert-file=<path>`: Certificate used for SSL/TLS connections **to** etcd. When this option is set, advertise-client-urls can use the HTTPS scheme.
`--key-file=<path>`: Key for the certificate. Must be unencrypted.
`--client-cert-auth`: When this is set, etcd will check all incoming HTTPS requests for a client certificate signed by the trusted CA; requests that don't supply a valid client certificate will fail. If [authentication][auth] is enabled, the certificate provides credentials for the user name given by the Common Name field.
`--trusted-ca-file=<path>`: Trusted certificate authority.
`--auto-tls`: Use automatically generated self-signed certificates for TLS connections with clients.
**Peer (server-to-server / cluster) communication:**
The peer options work the same way as the client-to-server options:
`--peer-cert-file=<path>`: Certificate used for SSL/TLS connections between peers. This will be used both for listening on the peer address as well as sending requests to other peers.
`--peer-key-file=<path>`: Key for the certificate. Must be unencrypted.
`--peer-client-cert-auth`: When set, etcd will check all incoming peer requests from the cluster for valid client certificates signed by the supplied CA.
`--peer-trusted-ca-file=<path>`: Trusted certificate authority.
`--peer-auto-tls`: Use automatically generated self-signed certificates for TLS connections between peers.
If either a client-to-server or peer certificate is supplied, the key must also be set. All of these configuration options are also available through environment variables such as `ETCD_CA_FILE` and `ETCD_PEER_CA_FILE`.
## Example 1: Client-to-server transport security with HTTPS
For this, have a CA certificate (`ca.crt`) and signed key pair (`server.crt`, `server.key`) ready.
Let us configure etcd to provide simple HTTPS transport security step by step:
```sh
$ etcd --name infra0 --data-dir infra0 \
--cert-file=/path/to/server.crt --key-file=/path/to/server.key \
--advertise-client-urls=https://127.0.0.1:2379 --listen-client-urls=https://127.0.0.1:2379
```
This should start up fine and it will be possible to test the configuration by speaking HTTPS to etcd:
```sh
$ curl --cacert /path/to/ca.crt https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
```
The command should show that the handshake succeeded. Since we use self-signed certificates with our own certificate authority, the CA must be passed to curl using the `--cacert` option. Another possibility would be to add the CA certificate to the system's trusted certificates directory (usually in `/etc/pki/tls/certs` or `/etc/ssl/certs`).
**OSX 10.9+ Users**: curl 7.30.0 on OSX 10.9+ doesn't understand certificates passed in on the command line.
Instead, import the dummy ca.crt directly into the keychain or add the `-k` flag to curl to ignore errors.
To test without the `-k` flag, run `open ./fixtures/ca/ca.crt` and follow the prompts.
Please remove this certificate after testing!
If there is a workaround, let us know.
## Example 2: Client-to-server authentication with HTTPS client certificates
So far we've given the etcd client the ability to verify the server identity and provided transport security. We can, however, also use client certificates to prevent unauthorized access to etcd.
The clients will provide their certificates to the server and the server will check whether the cert is signed by the supplied CA and decide whether to serve the request.
The same files mentioned in the first example are needed for this, as well as a key pair for the client (`client.crt`, `client.key`) signed by the same certificate authority.
```sh
$ etcd --name infra0 --data-dir infra0 \
--client-cert-auth --trusted-ca-file=/path/to/ca.crt --cert-file=/path/to/server.crt --key-file=/path/to/server.key \
--advertise-client-urls https://127.0.0.1:2379 --listen-client-urls https://127.0.0.1:2379
```
Now try the same request as above to this server:
```sh
$ curl --cacert /path/to/ca.crt https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
```
The request should be rejected by the server:
```
...
routines:SSL3_READ_BYTES:sslv3 alert bad certificate
...
```
To make it succeed, we need to give the CA-signed client certificate to the server:
```sh
$ curl --cacert /path/to/ca.crt --cert /path/to/client.crt --key /path/to/client.key \
-L https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
```
The output should include:
```
...
SSLv3, TLS handshake, CERT verify (15):
...
TLS handshake, Finished (20)
```
And also the response from the server:
```json
{
"action": "set",
"node": {
"createdIndex": 12,
"key": "/foo",
"modifiedIndex": 12,
"value": "bar"
}
}
```
## Example 3: Transport security & client certificates in a cluster
etcd supports the same model as above for **peer communication**, that is, the communication between etcd members in a cluster.
Assuming we have our `ca.crt` and two members with their own keypairs (`member1.crt` & `member1.key`, `member2.crt` & `member2.key`) signed by this CA, we launch etcd as follows:
```sh
DISCOVERY_URL=... # from https://discovery.etcd.io/new
# member1
$ etcd --name infra1 --data-dir infra1 \
--peer-client-cert-auth --peer-trusted-ca-file=/path/to/ca.crt --peer-cert-file=/path/to/member1.crt --peer-key-file=/path/to/member1.key \
--initial-advertise-peer-urls=https://10.0.1.10:2380 --listen-peer-urls=https://10.0.1.10:2380 \
--discovery ${DISCOVERY_URL}
# member2
$ etcd --name infra2 --data-dir infra2 \
--peer-client-cert-auth --peer-trusted-ca-file=/path/to/ca.crt --peer-cert-file=/path/to/member2.crt --peer-key-file=/path/to/member2.key \
--initial-advertise-peer-urls=https://10.0.1.11:2380 --listen-peer-urls=https://10.0.1.11:2380 \
--discovery ${DISCOVERY_URL}
```
The etcd members will form a cluster and all communication between members in the cluster will be encrypted and authenticated using the client certificates. The output of etcd will show that the addresses it connects to use HTTPS.
## Example 4: Automatic self-signed transport security
For cases where communication encryption, but not authentication, is needed, etcd supports encrypting its messages with automatically generated self-signed certificates. This simplifies deployment because there is no need for managing certificates and keys outside of etcd.
Configure etcd to use self-signed certificates for client and peer connections with the flags `--auto-tls` and `--peer-auto-tls`:
```sh
DISCOVERY_URL=... # from https://discovery.etcd.io/new
# member1
$ etcd --name infra1 --data-dir infra1 \
--auto-tls --peer-auto-tls \
--initial-advertise-peer-urls=https://10.0.1.10:2380 --listen-peer-urls=https://10.0.1.10:2380 \
--discovery ${DISCOVERY_URL}
# member2
$ etcd --name infra2 --data-dir infra2 \
--auto-tls --peer-auto-tls \
--initial-advertise-peer-urls=https://10.0.1.11:2380 --listen-peer-urls=https://10.0.1.11:2380 \
--discovery ${DISCOVERY_URL}
```
Self-signed certificates do not authenticate identity so curl will return an error:
```sh
curl: (60) SSL certificate problem: Invalid certificate chain
```
To disable certificate chain checking, invoke curl with the `-k` flag:
```sh
$ curl -k https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
```
## Notes for etcd proxy
etcd proxy terminates the TLS from its client if the connection is secure, and uses proxy's own key/cert specified in `--peer-key-file` and `--peer-cert-file` to communicate with etcd members.
The proxy communicates with etcd members through both the `--advertise-client-urls` and `--advertise-peer-urls` of a given member. It forwards client requests to the etcd members' advertised client URLs, and it syncs the initial cluster configuration through the etcd members' advertised peer URLs.
When client authentication is enabled for an etcd member, the administrator must ensure that the peer certificate specified in the proxy's `--peer-cert-file` option is valid for that authentication. The proxy's peer certificate must also be valid for peer authentication if peer authentication is enabled.
## Frequently asked questions
### I'm seeing a SSLv3 alert handshake failure when using TLS client authentication?
The `crypto/tls` package of `golang` checks the key usage of the certificate public key before using it.
To use the certificate public key to do client auth, we need to add `clientAuth` to `Extended Key Usage` when creating the certificate public key.
Here is how to do it:
Add the following section to openssl.cnf:
```
[ ssl_client ]
...
extendedKeyUsage = clientAuth
...
```
When creating the cert be sure to reference it in the `-extensions` flag:
```
$ openssl ca -config openssl.cnf -policy policy_anything -extensions ssl_client -out certs/machine.crt -infiles machine.csr
```
### With peer certificate authentication I receive "certificate is valid for 127.0.0.1, not $MY_IP"
Make sure to sign the certificates with a Subject Name that contains the member's public IP address. The `etcd-ca` tool, for example, provides an `--ip=` option for its `new-cert` command.
The certificate needs to be signed for the member's FQDN in its Subject Name; use Subject Alternative Names (IP SANs for short) to add the IP address. The `etcd-ca` tool provides a `--domain=` option for its `new-cert` command, and openssl can do [it][alt-name] too.
[cfssl]: https://github.com/cloudflare/cfssl
[tls-setup]: ../../hack/tls-setup
[tls-guide]: https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md
[alt-name]: http://wiki.cacert.org/FAQ/subjectAltName
[auth]: authentication.md


@ -0,0 +1,40 @@
## Supported platforms
### Current support
The following table lists etcd support status for common architectures and operating systems:
| Architecture | Operating System | Status | Maintainers |
| ------------ | ---------------- | ------------ | --------------------------- |
| amd64 | Darwin | Experimental | etcd maintainers |
| amd64 | Linux | Stable | etcd maintainers |
| amd64 | Windows | Experimental | |
| arm64 | Linux | Experimental | @glevand |
| arm | Linux | Unstable | |
| 386 | Linux | Unstable | |
| ppc64le | Linux | Stable | etcd maintainers, @mkumatag |
* etcd-maintainers are listed in https://github.com/coreos/etcd/blob/master/MAINTAINERS.
Experimental platforms appear to work in practice and have some platform specific code in etcd, but do not fully conform to the stable support policy. Unstable platforms have been lightly tested, but less than experimental. Unlisted architecture and operating system pairs are currently unsupported; caveat emptor.
### Supporting a new platform
For etcd to officially support a new platform as stable, a few requirements are necessary to ensure acceptable quality:
1. An "official" maintainer for the platform with clear motivation; someone must be responsible for taking care of the platform.
2. Set up CI for build; etcd must compile.
3. Set up CI for running unit tests; etcd must pass simple tests.
4. Set up CI (TravisCI, SemaphoreCI or Jenkins) for running integration tests; etcd must pass intensive tests.
5. (Optional) Set up a functional testing cluster; an etcd cluster should survive stress testing.
### 32-bit and other unsupported systems
etcd has known issues on 32-bit systems due to a bug in the Go runtime. See the [Go issue][go-issue] and [atomic package][go-atomic] for more information.
To avoid inadvertently running a possibly unstable etcd server, `etcd` on unstable or unsupported architectures will print a warning message and immediately exit if the environment variable `ETCD_UNSUPPORTED_ARCH` is not set to the target architecture.
Currently amd64 and ppc64le architectures are officially supported by `etcd`.
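As a sketch of how the variable might be chosen, the snippet below maps the local `uname -m` value to the architecture names used in the support table and prints the corresponding invocation without running it. The function name `goarch_for` and the mapping itself are assumptions of this example and cover only a few common machine strings.

```sh
# goarch_for MACHINE: map a `uname -m` string to the architecture name used in
# the support table. The mapping is an assumption and is not exhaustive.
goarch_for() {
  case "$1" in
    x86_64|amd64)   echo amd64 ;;
    aarch64|arm64)  echo arm64 ;;
    armv6l|armv7l)  echo arm ;;
    i386|i686)      echo 386 ;;
    ppc64le)        echo ppc64le ;;
    *)              echo unknown ;;
  esac
}

arch=$(goarch_for "$(uname -m)")
case "$arch" in
  amd64|ppc64le) echo "officially supported: etcd" ;;
  unknown)       echo "unrecognized machine type: $(uname -m)" ;;
  *)             echo "not officially supported: ETCD_UNSUPPORTED_ARCH=$arch etcd" ;;
esac
```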
[go-issue]: https://github.com/golang/go/issues/599
[go-atomic]: https://golang.org/pkg/sync/atomic/#pkg-note-BUG


@ -0,0 +1,53 @@
# Migrate applications from using API v2 to API v3
The data store v2 is still accessible from the API v2 after upgrading to etcd3. Thus, it will work as before and require no application changes. With etcd 3, applications use the new gRPC API v3 to access the mvcc store, which provides more features and improved performance. The mvcc store and the old store v2 are separate and isolated; writes to the store v2 will not affect the mvcc store and, similarly, writes to the mvcc store will not affect the store v2.
Migrating an application from the API v2 to the API v3 involves two steps: 1) migrate the client library and, 2) migrate the data. If the application can rebuild the data, then migrating the data is unnecessary.
## Migrate client library
API v3 is different from API v2, thus application developers need to use a new client library to send requests to etcd API v3. The documentation of the client v3 is available at https://godoc.org/github.com/coreos/etcd/clientv3.
There are some notable differences between API v2 and API v3:
- Transaction: In v3, etcd provides multi-key conditional transactions. Applications should use transactions in place of `Compare-And-Swap` operations.
- Flat key space: There are no directories in API v3, only keys. For example, "/a/b/c/" is a key. Range queries support getting all keys matching a given prefix.
- Compacted responses: Operations like `Delete` no longer return previous values. To get the deleted value, a transaction can be used to atomically get the key and then delete its value.
- Leases: A replacement for v2 TTLs; the TTL is bound to a lease and keys attach to the lease. When the TTL expires, the lease is revoked and all attached keys are removed.
## Migrate data
Application data can be migrated either offline or online. Offline migration is much simpler than online migration and is recommended.
An etcd cluster may already hold v3 data that should not be overwritten. In that case, the migration process should confirm that no v3 data is committed before proceeding. One way to check that the cluster has no v3 keys is to issue the following `etcdctl` command, which scans the entire v3 keyspace for any key, expecting `0` as output:
```sh
ETCDCTL_API=3 etcdctl get "" --from-key --keys-only --limit 1 | wc -l
```
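The check above can be wrapped in a small guard for use in a migration script. This is a minimal sketch: the function name `v3_keyspace_is_empty` is hypothetical, and it assumes the output of the `etcdctl get` command above is piped in on standard input.

```sh
# Hypothetical helper: reads the output of the `etcdctl get "" --from-key
# --keys-only --limit 1` command on stdin and succeeds only if no key appears.
v3_keyspace_is_empty() {
  # Any non-empty line means at least one v3 key was returned.
  if grep -q .; then
    return 1
  fi
  return 0
}
```

A script might then run `ETCDCTL_API=3 etcdctl get "" --from-key --keys-only --limit 1 | v3_keyspace_is_empty || exit 1` before migrating. The sketch does not distinguish an empty keyspace from a failed `etcdctl` call, so a real script should also check the exit status of `etcdctl` itself.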
### Offline migration
Offline migration is very simple but requires etcd downtime. If an etcd downtime window spanning from seconds to minutes is acceptable, offline migration is a good choice and is easy to automate.
First, all members in the etcd cluster must converge to the same state. This can be achieved by stopping all applications that write keys to etcd. Alternatively, if the applications must remain running, configure etcd to listen on a different client URL and restart all etcd members. To check if the states converged, within a few seconds, use the `ETCDCTL_API=3 etcdctl endpoint status` command to confirm that the `raft index` of all members match (or differ by at most 1 due to an internal sync raft command).
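The convergence rule described above (all raft indexes equal, or differing by at most 1) can be sketched as a shell check. The function name `raft_indexes_converged` is hypothetical, and the sketch assumes the raft index of each member has already been extracted from the `endpoint status` output and is passed as an argument; how to extract it is deliberately left out.

```sh
# Hypothetical helper: succeeds if the given raft indexes (one per member)
# are all within 1 of each other.
raft_indexes_converged() {
  min=""
  max=""
  for idx in "$@"; do
    # Track the smallest and largest index seen so far.
    if [ -z "$min" ] || [ "$idx" -lt "$min" ]; then min="$idx"; fi
    if [ -z "$max" ] || [ "$idx" -gt "$max" ]; then max="$idx"; fi
  done
  # Fail when no indexes were given; otherwise allow a spread of at most 1.
  [ -n "$min" ] && [ $((max - min)) -le 1 ]
}
```

For instance, `raft_indexes_converged 10 10 11` succeeds, while `raft_indexes_converged 10 12 10` fails and indicates that writes have not yet settled.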
Second, migrate the v2 keys into v3 with the [migrate][migrate_command] (`ETCDCTL_API=3 etcdctl migrate`) command. The migrate command writes keys in the v2 store to a user-provided transformer program and reads back transformed keys. It then writes transformed keys into the mvcc store. This usually takes at most tens of seconds.
Restart the etcd members and everything should just work.
### Online migration
If the application cannot tolerate any downtime, then it must migrate online. The implementation of online migration will vary from application to application but the overall idea is the same.
First, write application code using the v3 API. The application must support two modes: a migration mode and a normal mode. The application starts in migration mode. When running in migration mode, the application reads keys using the v3 API first, and, if it cannot find the key, it retries with the API v2. In normal mode, the application only reads keys using the v3 API. The application writes keys over the API v3 in both modes. To learn when to switch from migration mode to normal mode, the application watches a switch mode key. When the switch mode key's value turns to `true`, the application switches over from migration mode to normal mode.
Second, start a background job to migrate data from the store v2 to the mvcc store by reading keys from the API v2 and writing keys to the API v3.
After finishing data migration, the background job writes `true` into the switch mode key to notify the application that it may switch modes.
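The read path of the two modes described above can be sketched in shell. Everything named here is an assumption made for illustration: `v3_get`, `v2_get`, `read_key`, `update_mode`, and the `MODE` variable are hypothetical, and the sketch treats an empty value as "key not found", which a real application should handle more carefully.

```sh
# Hypothetical fetchers; a real application would use its client library.
v3_get() { ETCDCTL_API=3 etcdctl get --print-value-only "$1"; }
v2_get() { ETCDCTL_API=2 etcdctl get "$1"; }

# Current mode: "migration" (the starting mode) or "normal".
MODE="${MODE:-migration}"

# Read a key via the v3 API first; in migration mode only, fall back to v2.
read_key() {
  key="$1"
  value="$(v3_get "$key")"
  if [ -z "$value" ] && [ "$MODE" = "migration" ]; then
    value="$(v2_get "$key")"
  fi
  printf '%s\n' "$value"
}

# Called with the observed value of the switch mode key; switches to normal
# mode once the background job has written "true".
update_mode() {
  if [ "$1" = "true" ]; then
    MODE="normal"
  fi
}
```

Writes are not shown because they go through the API v3 in both modes. Keeping the fallback inside a single read function means the rest of the application does not need to know which mode is active.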
Online migration can be difficult when the application logic depends on store v2 indexes. Applications will need additional logic to convert mvcc store revisions to store v2 indexes.
[migrate_command]: ../../etcdctl/README.md#migrate-options


@ -0,0 +1,17 @@
## Versioning
### Service versioning
etcd uses [semantic versioning](http://semver.org).
New minor versions may add additional features to the API.
Get the running etcd cluster version with `etcdctl`:
```sh
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 endpoint status
```
### API versioning
The `v3` API responses should not change after the 3.0.0 release but new features will be added over time.


@ -0,0 +1,77 @@
## Introduction
This guide assumes operational knowledge of Amazon Web Services (AWS), specifically Amazon Elastic Compute Cloud (EC2). This guide provides an introduction to design considerations when designing an etcd deployment on AWS EC2 and how AWS specific features may be utilized in that context.
## Capacity planning
Because etcd is a critical building block for distributed systems, it is crucial to perform adequate capacity planning in order to support the intended cluster workload. Because etcd is a highly available and strongly consistent data store, increasing the number of nodes in an etcd cluster will generally affect performance adversely. This makes sense intuitively, as more nodes means more members for the leader to coordinate state across. The most direct way to increase throughput and decrease latency of an etcd cluster is to allocate more disk I/O, network I/O, CPU, and memory to cluster members. In the event it is impossible to temporarily divert incoming requests to the cluster, scaling the EC2 instances which comprise the etcd cluster members one at a time may improve performance. It is, however, best to avoid bottlenecks through capacity planning.
The etcd team has produced a [hardware recommendation guide](../op-guide/hardware.md) which is very useful for “ballparking” how many nodes and what instance type are necessary for a cluster.
AWS provides a service for creating groups of EC2 instances which are dynamically sized to match load on the instances. Using an Auto Scaling Group ([ASG](http://docs.aws.amazon.com/autoscaling/latest/userguide/AutoScalingGroup.html)) to dynamically scale an etcd cluster is not recommended for several reasons including:
* etcd performance is generally inversely proportional to the number of members in a cluster due to the synchronous replication which provides strong consistency of data stored in etcd
* the operational complexity of adding [lifecycle hooks](http://docs.aws.amazon.com/autoscaling/latest/userguide/lifecycle-hooks.html) to properly add and remove members from an etcd cluster by modifying the [runtime configuration](../op-guide/runtime-configuration.md)
Auto Scaling Groups do provide a number of benefits besides cluster scaling which include:
* distribution of EC2 instances across Availability Zones (AZs)
* EC2 instance fail over across AZs
* consolidated monitoring and life cycle control of instances within an ASG
The use of an ASG to create a [self healing etcd cluster](#self-healing) is one of the design considerations when deploying an etcd cluster to AWS.
## Cluster design
The purpose of this section is to provide foundational guidance for deploying etcd on AWS. The discussion will be framed by the following three critical design criteria about the etcd cluster itself:
* block device provider: limited to the tradeoffs between EBS or instance storage (InstanceStore)
* cluster topology: how many nodes should make up an etcd cluster; should these nodes be distributed over multiple AZs
* managing etcd members: creating a static cluster of EC2 instances or using an ASG.
The intended cluster workload should dictate the cluster design. A configuration store for microservices may require different design considerations than a distributed lock service, a secrets store, or a Kubernetes control plane. Cluster design tradeoffs include considerations such as:
* availability
* data durability after member failure
* performance/throughput
* self healing
### Availability
Instance availability on AWS is ultimately determined by the Amazon EC2 Region Service Level Agreement ([SLA](https://aws.amazon.com/ec2/sla/)) which is the policy by which Amazon describes their precise definition of a regional outage.
In the context of an etcd cluster this means a cluster must contain a minimum of three members where EC2 instances are spread across at least two AZs in order for an etcd cluster to be considered highly available at a Regional level.
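The three-member minimum follows from etcd's majority-based quorum: a cluster of N members needs floor(N/2) + 1 members available to make progress. The arithmetic can be sketched as follows; the function names `quorum_size` and `failure_tolerance` are hypothetical and exist only for this illustration.

```sh
# Majority (quorum) needed for a cluster of N members: floor(N/2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can be lost while the cluster still has quorum.
failure_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}
```

For instance, `quorum_size 3` prints `2` and `failure_tolerance 3` prints `1`, so a three-member cluster tolerates the loss of one member. How members are spread across AZs determines whether losing a whole zone stays within that budget.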
For most use cases the additional latency associated with a cluster spanning across Availability Zones will introduce a negligible performance impact.
Availability considerations apply to all components of an application; if the application which accesses the etcd cluster will only be deployed to a single Availability Zone it may not make sense to make the etcd cluster highly available across zones.
### Data durability after member failure
A highly available etcd cluster is resilient to member loss; however, it is important to consider data durability in the event of disaster when designing an etcd deployment. Deploying etcd on AWS supports multiple mechanisms for data durability.
* replication: etcd replicates all data to all members of the etcd cluster. Therefore, the more members in the cluster and the more independent failure domains they span, the less likely it is that data stored in an etcd cluster will be permanently lost in the event of disaster.
* point in time etcd snapshotting: the etcd v3 API introduced support for snapshotting clusters. The operation is cheap enough (completing on the order of minutes) to run quite frequently and the resulting archives can be stored in a storage service like Amazon Simple Storage Service (S3).
* Amazon Elastic Block Store (EBS): an EBS volume is a replicated network attached block device which has stronger storage safety guarantees than InstanceStore, whose life cycle is tied to the life cycle of the attached EC2 instance. The life cycle of an EBS volume is not necessarily tied to an EC2 instance; a volume can be detached and snapshotted independently, which means that a single node etcd cluster backed by an EBS volume can provide a fairly reasonable level of data durability.
### Performance/Throughput
The performance of an etcd cluster is roughly quantifiable through latency and throughput metrics which are primarily affected by disk and network performance. Detailed performance planning information is provided in the [performance section](../op-guide/performance.md) of the etcd operations guide.
#### Network
AWS offers EC2 Placement Groups which allow the collocation of EC2 instances within a single Availability Zone which can be utilized in order to minimize network latency between etcd members in the cluster. It is important to remember that collocation of etcd nodes within a single AZ will provide weaker fault tolerance than distributing members across multiple AZs. [Enhanced networking for EC2 instances](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html) may also improve network performance of individual EC2 instances.
#### Disk
AWS provides two basic types of block storage: [EBS volumes](https://aws.amazon.com/ebs/) and [EC2 Instance Store](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html). As mentioned, an EBS volume is a network attached block device while instance storage is directly attached to the hypervisor of the EC2 host. EBS volumes will generally have higher latency, lower throughput, and greater performance variance than Instance Store volumes. If performance, rather than data safety, is the primary concern it is highly recommended that instance storage on the EC2 instances be utilized. Remember that the amount of available instance storage varies by EC2 [instance types](https://aws.amazon.com/ec2/instance-types/) which may impose additional performance considerations.
Inconsistent EBS volume performance can introduce etcd cluster instability. [Provisioned IOPS](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html#EBSVolumeTypes_piops) can provide more consistent performance than general purpose SSD EBS volumes. More information about EBS volume performance is available [from AWS](https://aws.amazon.com/ebs/details/) and Datadog has shared their experience with [getting optimal performance with AWS EBS Provisioned IOPS](https://www.datadoghq.com/blog/aws-ebs-provisioned-iops-getting-optimal-performance/) in their engineering blog.
### Self healing
While using an ASG to scale the size of an etcd cluster is not recommended, an ASG can be used effectively to maintain the desired number of nodes in the event of node failure. The maintenance of a stable number of etcd nodes will provide the etcd cluster with a measure of self healing.
### Next steps
The operational life cycle of an etcd cluster can be greatly simplified through the use of the etcd-operator. The open source etcd operator is a Kubernetes control plane operator which deploys and manages etcd clusters atop Kubernetes. While still in its early stages the etcd-operator already offers periodic backups to S3, detection and replacement of failed nodes, and automated disaster recovery from backups in the event of permanent quorum loss.


@ -0,0 +1,203 @@
# Run etcd on Container Linux with systemd
The following guide shows how to run etcd with [systemd][systemd-docs] under [Container Linux][container-linux-docs].
## Provisioning an etcd cluster
Cluster bootstrapping in Container Linux is simplest with [Ignition][container-linux-ignition]; `coreos-metadata.service` dynamically fetches the machine's IP for discovery. Note that etcd's discovery service protocol is only meant for bootstrapping, and cannot be used with runtime reconfiguration or cluster monitoring.
The [Container Linux Config Transpiler][container-linux-ct] compiles etcd configuration files into Ignition configuration files:
```yaml container-linux-config:norender
etcd:
  version: 3.2.0
  name: s1
  data_dir: /var/lib/etcd
  advertise_client_urls: http://{PUBLIC_IPV4}:2379
  initial_advertise_peer_urls: http://{PRIVATE_IPV4}:2380
  listen_client_urls: http://0.0.0.0:2379
  listen_peer_urls: http://{PRIVATE_IPV4}:2380
  discovery: https://discovery.etcd.io/<token>
```
`ct` would produce the following Ignition Config:
```
$ ct --platform=gce --in-file /tmp/ct-etcd.cnf
{"ignition":{"version":"2.0.0","config"...
```
```json ignition-config
{
"ignition":{"version":"2.0.0","config":{}},
"storage":{},
"systemd":{
"units":[{
"name":"etcd-member.service",
"enable":true,
"dropins":[{
"name":"20-clct-etcd-member.conf",
"contents":"[Unit]\nRequires=coreos-metadata.service\nAfter=coreos-metadata.service\n\n[Service]\nEnvironmentFile=/run/metadata/coreos\nEnvironment=\"ETCD_IMAGE_TAG=v3.1.8\"\nExecStart=\nExecStart=/usr/lib/coreos/etcd-wrapper $ETCD_OPTS \\\n --name=\"s1\" \\\n --data-dir=\"/var/lib/etcd\" \\\n --listen-peer-urls=\"http://${COREOS_GCE_IP_LOCAL_0}:2380\" \\\n --listen-client-urls=\"http://0.0.0.0:2379\" \\\n --initial-advertise-peer-urls=\"http://${COREOS_GCE_IP_LOCAL_0}:2380\" \\\n --advertise-client-urls=\"http://${COREOS_GCE_IP_EXTERNAL_0}:2379\" \\\n --discovery=\"https://discovery.etcd.io/\u003ctoken\u003e\""}]}]},
"networkd":{},
"passwd":{}}
```
To avoid accidental misconfiguration, the transpiler helpfully verifies etcd configurations when generating Ignition files:
```yaml container-linux-config:norender
etcd:
  version: 3.2.0
  name: s1
  data_dir_x: /var/lib/etcd
  advertise_client_urls: http://{PUBLIC_IPV4}:2379
  initial_advertise_peer_urls: http://{PRIVATE_IPV4}:2380
  listen_client_urls: http://0.0.0.0:2379
  listen_peer_urls: http://{PRIVATE_IPV4}:2380
  discovery: https://discovery.etcd.io/<token>
```
```
$ ct --platform=gce --in-file /tmp/ct-etcd.cnf
warning at line 3, column 2
Config has unrecognized key: data_dir_x
```
See [Container Linux Provisioning][container-linux-provision] for more details.
## etcd 3.x service
[Container Linux][container-linux-docs] does not include etcd 3.x binaries by default. Different versions of etcd 3.x can be fetched via `etcd-member.service`.
Confirm unit file exists:
```
systemctl cat etcd-member.service
```
Check if the etcd service is running:
```
systemctl status etcd-member.service
```
Example systemd drop-in unit to override the default service settings:
```bash
cat > /tmp/20-cl-etcd-member.conf <<EOF
[Service]
Environment="ETCD_IMAGE_TAG=v3.2.0"
Environment="ETCD_DATA_DIR=/var/lib/etcd"
Environment="ETCD_SSL_DIR=/etc/ssl/certs"
Environment="ETCD_OPTS=--name s1 \
--listen-client-urls https://10.240.0.1:2379 \
--advertise-client-urls https://10.240.0.1:2379 \
--listen-peer-urls https://10.240.0.1:2380 \
--initial-advertise-peer-urls https://10.240.0.1:2380 \
--initial-cluster s1=https://10.240.0.1:2380,s2=https://10.240.0.2:2380,s3=https://10.240.0.3:2380 \
--initial-cluster-token mytoken \
--initial-cluster-state new \
--client-cert-auth \
--trusted-ca-file /etc/ssl/certs/etcd-root-ca.pem \
--cert-file /etc/ssl/certs/s1.pem \
--key-file /etc/ssl/certs/s1-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file /etc/ssl/certs/etcd-root-ca.pem \
--peer-cert-file /etc/ssl/certs/s1.pem \
--peer-key-file /etc/ssl/certs/s1-key.pem \
--auto-compaction-retention 1"
EOF
mv /tmp/20-cl-etcd-member.conf /etc/systemd/system/etcd-member.service.d/20-cl-etcd-member.conf
```
Or use a Container Linux Config:
```yaml container-linux-config:norender
systemd:
  units:
    - name: etcd-member.service
      dropins:
        - name: conf1.conf
          contents: |
            [Service]
            Environment="ETCD_SSL_DIR=/etc/ssl/certs"

etcd:
  version: 3.2.0
  name: s1
  data_dir: /var/lib/etcd
  listen_client_urls: https://0.0.0.0:2379
  advertise_client_urls: https://{PUBLIC_IPV4}:2379
  listen_peer_urls: https://{PRIVATE_IPV4}:2380
  initial_advertise_peer_urls: https://{PRIVATE_IPV4}:2380
  initial_cluster: s1=https://{PRIVATE_IPV4}:2380,s2=https://10.240.0.2:2380,s3=https://10.240.0.3:2380
  initial_cluster_token: mytoken
  initial_cluster_state: new
  client_cert_auth: true
  trusted_ca_file: /etc/ssl/certs/etcd-root-ca.pem
  cert_file: /etc/ssl/certs/s1.pem
  key_file: /etc/ssl/certs/s1-key.pem
  peer_client_cert_auth: true
  peer_trusted_ca_file: /etc/ssl/certs/etcd-root-ca.pem
  peer_cert_file: /etc/ssl/certs/s1.pem
  peer_key_file: /etc/ssl/certs/s1-key.pem
  auto_compaction_retention: 1
```
```
$ ct --platform=gce --in-file /tmp/ct-etcd.cnf
{"ignition":{"version":"2.0.0","config"...
```
To see all runtime drop-in changes for system units:
```
systemd-delta --type=extended
```
To enable and start:
```
systemctl daemon-reload
systemctl enable --now etcd-member.service
```
To see the logs:
```
journalctl --unit etcd-member.service --lines 10
```
To stop and disable the service:
```
systemctl disable --now etcd-member.service
```
## etcd 2.x service
[Container Linux][container-linux-docs] includes a unit file `etcd2.service` for etcd 2.x, which will be removed in the near future. See [Container Linux FAQ][container-linux-faq] for more details.
Confirm unit file is installed:
```
systemctl cat etcd2.service
```
Check if the etcd service is running:
```
systemctl status etcd2.service
```
To stop and disable:
```
systemctl disable --now etcd2.service
```
[systemd-docs]: https://github.com/systemd/systemd
[container-linux-docs]: https://coreos.com/os/docs/latest
[container-linux-faq]: https://github.com/coreos/docs/blob/master/etcd/os-faq.md
[container-linux-provision]: https://github.com/coreos/docs/blob/master/os/provisioning.md
[container-linux-ignition]: https://github.com/coreos/docs/blob/master/ignition/what-is-ignition.md
[container-linux-ct]: https://github.com/coreos/container-linux-config-transpiler


@ -1,23 +1,18 @@
# FreeBSD
Starting with version 0.1.2 both etcd and etcdctl have been ported to FreeBSD and can
be installed either via packages or ports system. Their versions have been recently
updated to 0.2.0 so now you can enjoy using etcd and etcdctl on FreeBSD 10.0 (RC4 as
of now) and 9.x where they have been tested. They might also work when installed from
ports on earlier versions of FreeBSD, but your mileage may vary.
Starting with version 0.1.2 both etcd and etcdctl have been ported to FreeBSD and can be installed either via packages or ports system. Their versions have been recently updated to 0.2.0 so now etcd and etcdctl can be enjoyed on FreeBSD 10.0 (RC4 as of now) and 9.x, where they have been tested. They might also work when installed from ports on earlier versions of FreeBSD, but it is untested; caveat emptor.
## Installation
### Using pkgng package system
1. If you do not have pkg­ng installed, install it with command `pkg` and answering 'Y'
when asked
1. If pkgng is not installed, install it by running `pkg` and answering 'Y' when asked.
2. Update your repository data with `pkg update`
2. Update the repository data with `pkg update`.
3. Install etcd with `pkg install coreos-etcd coreos-etcdctl`
3. Install etcd with `pkg install coreos-etcd coreos-etcdctl`.
4. Verify successful installation with `pkg info | grep etcd` and you should get:
4. Verify successful installation by confirming `pkg info | grep etcd` matches:
```
r@fbsd­10:/ # pkg info | grep etcd
@ -26,21 +21,17 @@ coreos­etcdctl­0.2.0           Simple commandline client for et
r@fbsd­10:/ #
```
5. Youre ready to use etcd and etcdctl! For more information about using pkgng, please
see: http://www.freebsd.org/doc/handbook/pkgng­intro.html
5. etcd and etcdctl are ready to use! For more information about using pkgng, please see: http://www.freebsd.org/doc/handbook/pkgng-intro.html
 
### Using ports system
1. If you do not have ports installed, install with with `portsnap fetch extract` (it
may take some time depending on your hardware and network connection)
1. If ports is not installed, install with `portsnap fetch extract` (it may take some time depending on hardware and network connection).
2. Build etcd with `cd /usr/ports/devel/etcd && make install clean`, you
will get an option to build and install documentation and etcdctl with it.
2. Build etcd with `cd /usr/ports/devel/etcd && make install clean`. There will be an option to build and install documentation and etcdctl with it.
3. If you haven't installed it with etcdctl, and you would like to install it later, you can build it
with `cd /usr/ports/devel/etcdctl && make install clean`
3. If etcd wasn't installed with etcdctl, it can be built later with `cd /usr/ports/devel/etcdctl && make install clean`.
4. Verify successful installation with `pkg info | grep etcd` and you should get:
4. Verify successful installation by confirming `pkg info | grep etcd` matches:
 
```
@ -50,13 +41,8 @@ coreos­etcdctl­0.2.0           Simple commandline client for et
r@fbsd­10:/ #
```
5. Youre ready to use etcd and etcdctl! For more information about using ports system,
please see: https://www.freebsd.org/doc/handbook/ports­using.html
5. etcd and etcdctl are ready to use! For more information about using the ports system, please see: https://www.freebsd.org/doc/handbook/ports-using.html
## Issues
If you find any issues with the build/install procedure or you've found a problem that
you've verified is local to FreeBSD version only (for example, by not being able to
reproduce it on any other platform, like OSX or Linux), please sent a
problem report using this page for more
information: http://www.freebsd.org/send­pr.html
If there are any issues with the build/install procedure, or there is a problem that is local to FreeBSD only (for example, it cannot be reproduced on any other platform, like OSX or Linux), please send a problem report; see this page for more information: http://www.freebsd.org/send-pr.html


@ -1,6 +1,15 @@
# Production Users
# Production users
This document tracks people and use cases for etcd in production. By creating a list of production use cases we hope to build a community of advisors that we can reach out to with experience using various etcd applications, operation environments, and cluster sizes. The etcd development team may reach out periodically to check-in on your experience and update this list.
This document tracks people and use cases for etcd in production. By creating a list of production use cases we hope to build a community of advisors that we can reach out to with experience using various etcd applications, operation environments, and cluster sizes. The etcd development team may reach out periodically to check-in on how etcd is working in the field and update this list.
## All Kubernetes Users
- *Application*: https://kubernetes.io/
- *Environments*: AWS, OpenStack, Azure, Google Cloud, Huawei Cloud, Bare Metal, etc
**This is a meta user; please feel free to document specific Kubernetes clusters!**
All Kubernetes clusters use etcd as their primary data store. This means etcd's users include such companies as [Niantic, Inc Pokemon Go](https://cloudplatform.googleblog.com/2016/09/bringing-Pokemon-GO-to-life-on-Google-Cloud.html), [Box](https://blog.box.com/blog/kubernetes-box-microservices-maximum-velocity/), [CoreOS](https://coreos.com/tectonic), [Ticketmaster](https://www.youtube.com/watch?v=wqXVKneP0Hg), [Salesforce](https://www.salesforce.com) and many many more.
## discovery.etcd.io
@ -48,4 +57,183 @@ CyCore Systems provides architecture and engineering for computing systems. Thi
Radius Intelligence uses Kubernetes running CoreOS to containerize and scale internal toolsets. Examples include running [JetBrains TeamCity][teamcity] and internal AWS security and cost reporting tools. etcd clusters back these clusters as well as provide some basic environment bootstrapping configuration keys.
## Vonage
- *Application*: kubernetes, vault backend, system configuration for microservices, scheduling, locks (future - service discovery)
- *Launched*: August 2015
- *Cluster Size*: 2 clusters of 5 members in 2 DCs, n local proxies 1-to-1 with microservice, (ssl and SRV look up)
- *Order of Data Size*: kilobytes
- *Operator*: Vonage [devAdmin][raoofm]
- *Environment*: VMWare, AWS
- *Backups*: Daily snapshots on VMs. Backups done for upgrades.
## PD
- *Application*: embed etcd
- *Launched*: Mar 2016
- *Cluster Size*: 3 or 5 members
- *Order of Data Size*: megabytes
- *Operator*: PingCAP, Inc.
- *Environment*: Bare Metal, AWS, etc.
- *Backups*: None.
PD (Placement Driver) is the central controller in the TiDB cluster. It saves the cluster meta information, schedules the data, allocates globally unique timestamps for distributed transactions, etc. It embeds etcd to supply high availability and auto failover.
## Canal
- *Application*: system configuration for overlay network
- *Launched*: June 2016
- *Cluster Size*: 3 members for each cluster
- *Order of Data Size*: kilobytes
- *Operator*: Huawei Euler Department
- *Environment*: [Huawei Cloud](http://www.hwclouds.com/product/cce.html)
- *Backups*: None, all data can be recreated if necessary.
[teamcity]: https://www.jetbrains.com/teamcity/
[raoofm]:https://github.com/raoofm
## Qiniu Cloud
- *Application*: system configuration for microservices, distributed locks
- *Launched*: Jan. 2016
- *Cluster Size*: 3 members each with several clusters
- *Order of Data Size*: kilobytes
- *Operator*: Pandora, chenchao@qiniu.com
- *Environment*: Baremetal
- *Backups*: None, all data can be recreated if necessary
## QingCloud
- *Application*: [QingCloud][qingcloud] appcenter cluster for service discovery as [metad][metad] backend.
- *Launched*: December 2016
- *Cluster Size*: 1 cluster of 3 members per user.
- *Order of Data Size*: kilobytes
- *Operator*: [yunify][yunify]
- *Environment*: QingCloud IaaS
- *Backups*: None, all data can be recreated if necessary.
[metad]:https://github.com/yunify/metad
[yunify]:https://github.com/yunify
[qingcloud]:https://qingcloud.com/
## Yandex
- *Application*: system configuration for services, service discovery
- *Launched*: March 2016
- *Cluster Size*: 3 clusters of 5 members
- *Order of Data Size*: several gigabytes
- *Operator*: Yandex; [nekto0n][nekto0n]
- *Environment*: Bare Metal
- *Backups*: None
[nekto0n]:https://github.com/nekto0n
## Tencent Games
- *Application*: Meta data and configuration data for service discovery, Kubernetes, etc.
- *Launched*: Jan. 2015
- *Cluster Size*: 3 members each with 10s of clusters
- *Order of Data Size*: 10s of Megabytes
- *Operator*: Tencent Game Operations Department
- *Environment*: Baremetal
- *Backups*: Periodic sync to backup server
In Tencent games, we use Docker and Kubernetes to deploy and run our applications, and use etcd to save meta data for service discovery, Kubernetes, etc.
## Hyper.sh
- *Application*: Kubernetes, distributed locks, etc.
- *Launched*: April 2016
- *Cluster Size*: 1 cluster of 3 members
- *Order of Data Size*: 10s of MB
- *Operator*: Hyper.sh
- *Environment*: Baremetal
- *Backups*: None, all data can be recreated if necessary.
In [hyper.sh][hyper.sh], the container service is backed by [hypernetes][hypernetes], a multi-tenant kubernetes distro. Moreover, we use etcd to coordinate the multiple manage services and store global meta data.
[hypernetes]:https://github.com/hyperhq/hypernetes
[Hyper.sh]:https://www.hyper.sh
## Meitu
- *Application*: system configuration for services, service discovery, kubernetes in test environment
- *Launched*: October 2015
- *Cluster Size*: 1 cluster of 3 members
- *Order of Data Size*: megabytes
- *Operator*: Meitu, hxj@meitu.com, [shafreeck][shafreeck]
- *Environment*: Bare Metal
- *Backups*: None, all data can be recreated if necessary.
[shafreeck]:https://github.com/shafreeck
## Grab
- *Application*: system configuration for services, service discovery
- *Launched*: June 2016
- *Cluster Size*: 1 cluster of 7 members
- *Order of Data Size*: megabytes
- *Operator*: Grab, [taxitan][taxitan], [reterVision][reterVision]
- *Environment*: AWS
- *Backups*: None, all data can be recreated if necessary.
[taxitan]:https://github.com/taxitan
[reterVision]:https://github.com/reterVision
## DaoCloud.io
- *Application*: container management
- *Launched*: Sep. 2015
- *Cluster Size*: 1000+ deployments, each deployment contains a 3 node cluster.
- *Order of Data Size*: 100s of Megabytes
- *Operator*: daocloud.io
- *Environment*: Baremetal and virtual machines
- *Backups*: None, all data can be recreated if necessary.
In [DaoCloud][DaoCloud], we use Docker and Swarm to deploy and run our applications, and we use etcd to save metadata for service discovery.
[DaoCloud]:https://www.daocloud.io
## Branch.io
- *Application*: Kubernetes
- *Launched*: April 2016
- *Cluster Size*: Multiple clusters, multiple sizes
- *Order of Data Size*: 100s of Megabytes
- *Operator*: branch.io
- *Environment*: AWS, Kubernetes
- *Backups*: EBS volume backups
At [Branch][branch], we use kubernetes heavily as our core microservice platform for staging and production.
[branch]: https://branch.io
## Baidu Waimai
- *Application*: SkyDNS, Kubernetes, UDC, CMDB and other distributed systems
- *Launched*: April 2016
- *Cluster Size*: 3 clusters of 5 members
- *Order of Data Size*: several gigabytes
- *Operator*: Baidu Waimai Operations Department
- *Environment*: CentOS 6.5
- *Backups*: backup scripts
## Salesforce.com
- *Application*: Kubernetes
- *Launched*: Jan 2017
- *Cluster Size*: Multiple clusters of 3 members
- *Order of Data Size*: 100s of Megabytes
- *Operator*: Salesforce.com (krmayankk@github)
- *Environment*: BareMetal
- *Backups*: None, all data can be recreated
## Hosted Graphite
- *Application*: Service discovery, locking, ephemeral application data
- *Launched*: January 2017
- *Cluster Size*: 2 clusters of 7 members
- *Order of Data Size*: Megabytes
- *Operator*: Hosted Graphite (sre@hostedgraphite.com)
- *Environment*: Bare Metal
- *Backups*: None, all data is considered ephemeral.


@ -1,24 +1,24 @@
# Reporting Bugs
# Reporting bugs
If you find bugs or documentation mistakes in the etcd project, please let us know by [opening an issue][issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
If any part of the etcd project has bugs or documentation mistakes, please let us know by [opening an issue][etcd-issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
To make your bug report accurate and easy to understand, please try to create bug reports that are:
To make the bug report accurate and easy to understand, please try to create bug reports that are:
- Specific. Include as much detail as possible: which version, what environment, what configuration, etc. You can also attach the etcd log (the starting log with etcd configuration is especially important).
- Specific. Include as much detail as possible: which version, what environment, what configuration, etc. If the bug is related to running the etcd server, please attach the etcd log (the starting log with etcd configuration is especially important).
- Reproducible. Include the steps to reproduce the problem. We understand some issues might be hard to reproduce; please include the steps that might lead to the problem. You can also attach the affected etcd data dir and stack trace to the bug report.
- Reproducible. Include the steps to reproduce the problem. We understand some issues might be hard to reproduce; please include the steps that might lead to the problem. If possible, please attach the affected etcd data dir and stack trace to the bug report.
- Isolated. Please try to isolate and reproduce the bug with minimum dependencies. It would significantly slow down the speed to fix a bug if too many dependencies are involved in a bug report. Debugging external systems that rely on etcd is out of scope, but we are happy to point you in the right direction or help you interact with etcd in the correct manner.
- Isolated. Please try to isolate and reproduce the bug with minimum dependencies. It would significantly slow down the speed to fix a bug if too many dependencies are involved in a bug report. Debugging external systems that rely on etcd is out of scope, but we are happy to provide guidance in the right direction or help with using etcd itself.
- Unique. Do not duplicate an existing bug report.
- Scoped. One bug per report. Do not follow up with another bug inside one report.
You might also want to read [Elika Etemad's article on filing good bug reports][filing-good-bugs] before creating a bug report.
It may be worthwhile to read [Elika Etemad's article on filing good bug reports][filing-good-bugs] before creating a bug report.
We might ask you for further information to locate a bug. A duplicated bug report will be closed.
We might ask for further information to locate a bug. A duplicated bug report will be closed.
## Frequently Asked Questions
## Frequently asked questions
### How to get a stack trace
@ -39,7 +39,7 @@ $ sudo systemctl cat etcd2
$ sudo journalctl -u etcd2
```
Due to an upstream systemd bug, journald may miss the last few log lines when its process exits. If journalctl tells you that etcd stopped without a fatal or panic message, you could try `sudo journalctl -f -t etcd2` to get the full log.
Due to an upstream systemd bug, journald may miss the last few log lines when its processes exit. If journalctl says etcd stopped without a fatal or panic message, try `sudo journalctl -f -t etcd2` to get the full log.
[etcd-issue]: https://github.com/coreos/etcd/issues/new
[filing-good-bugs]: http://fantasai.inkedblade.net/style/talks/filing-good-bugs/

@ -208,4 +208,4 @@ WatchResponse {
```
[api-protobuf]: https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto
[kv-protobuf]: https://github.com/coreos/etcd/blob/master/storage/storagepb/kv.proto
[kv-protobuf]: https://github.com/coreos/etcd/blob/master/mvcc/mvccpb/kv.proto

@ -1,47 +1,29 @@
# Tuning
The default settings in etcd should work well for installations on a local network where the average network latency is low.
However, when using etcd across multiple data centers or over networks with high latency you may need to tweak the heartbeat interval and election timeout settings.
The default settings in etcd should work well for installations on a local network where the average network latency is low. However, when using etcd across multiple data centers or over networks with high latency, the heartbeat interval and election timeout settings may need tuning.
The network isn't the only source of latency. Each request and response may be impacted by slow disks on both the leader and follower. Each of these timeouts represents the total time from request to successful response from the other machine.
## Time Parameters
## Time parameters
The underlying distributed consensus protocol relies on two separate time parameters to ensure that nodes can handoff leadership if one stalls or goes offline.
The first parameter is called the *Heartbeat Interval*.
This is the frequency with which the leader will notify followers that it is still the leader.
For best practices, the parameter should be set around round-trip time between members.
By default, etcd uses a `100ms` heartbeat interval.
The underlying distributed consensus protocol relies on two separate time parameters to ensure that nodes can hand off leadership if one stalls or goes offline. The first parameter is called the *Heartbeat Interval*. This is the frequency with which the leader will notify followers that it is still the leader.
As a best practice, this parameter should be set to around the round-trip time between members. By default, etcd uses a `100ms` heartbeat interval.
The second parameter is the *Election Timeout*.
This timeout is how long a follower node will go without hearing a heartbeat before attempting to become leader itself.
By default, etcd uses a `1000ms` election timeout.
The second parameter is the *Election Timeout*. This timeout is how long a follower node will go without hearing a heartbeat before attempting to become leader itself. By default, etcd uses a `1000ms` election timeout.
Adjusting these values is a trade off.
The value of heartbeat interval is recommended to be around the maximum of average round-trip time (RTT) between members, normally around 0.5-1.5x the round-trip time.
If heartbeat interval is too low, etcd will send unnecessary messages that increase the usage of CPU and network resources.
On the other side, a too high heartbeat interval leads to high election timeout. Higher election timeout takes longer time to detect a leader failure.
The easiest way to measure round-trip time (RTT) is to use [PING utility][ping].
Adjusting these values is a trade-off. The recommended heartbeat interval is around the maximum of the average round-trip times (RTT) between members, normally 0.5-1.5x the round-trip time. If the heartbeat interval is too low, etcd will send unnecessary messages that increase the usage of CPU and network resources. On the other hand, a heartbeat interval that is too high forces a high election timeout, and a higher election timeout means it takes longer to detect a leader failure. The easiest way to measure round-trip time (RTT) is to use the [PING utility][ping].
The election timeout should be set based on the heartbeat interval and average round-trip time between members.
Election timeouts must be at least 10 times the round-trip time so it can account for variance in your network.
For example, if the round-trip time between your members is 10ms then you should have at least a 100ms election timeout.
The election timeout should be set based on the heartbeat interval and average round-trip time between members. Election timeouts must be at least 10 times the round-trip time so it can account for variance in the network. For example, if the round-trip time between members is 10ms then the election timeout should be at least 100ms.
You should also set your election timeout to at least 5 to 10 times your heartbeat interval to account for variance in leader replication.
For a heartbeat interval of 50ms you should set your election timeout to at least 250ms - 500ms.
The upper limit of election timeout is 50000ms (50s), which should only be used when deploying a globally-distributed etcd cluster.
A reasonable round-trip time for the continental United States is 130ms, and the time between US and Japan is around 350-400ms.
If your network has uneven performance or regular packet delays/loss then it is possible that a couple of retries may be necessary to successfully send a packet. So 5s is a safe upper limit of global round-trip time.
As the election timeout should be an order of magnitude bigger than broadcast time, in the case of ~5s for a globally distributed cluster, then 50 seconds becomes a reasonable maximum.
The upper limit of election timeout is 50000ms (50s), which should only be used when deploying a globally-distributed etcd cluster. A reasonable round-trip time for the continental United States is 130ms, and the time between US and Japan is around 350-400ms. If the network has uneven performance or regular packet delays/loss then it is possible that a couple of retries may be necessary to successfully send a packet. So 5s is a safe upper limit of global round-trip time. As the election timeout should be an order of magnitude bigger than broadcast time, in the case of ~5s for a globally distributed cluster, then 50 seconds becomes a reasonable maximum.
The heartbeat interval and election timeout value should be the same for all members in one cluster. Setting different values for etcd members may disrupt cluster stability.
You can override the default values on the command line:
The default values can be overridden on the command line:
```sh
# Command line arguments:
$ etcd -heartbeat-interval=100 -election-timeout=500
$ etcd --heartbeat-interval=100 --election-timeout=500
# Environment variables:
$ ETCD_HEARTBEAT_INTERVAL=100 ETCD_ELECTION_TIMEOUT=500 etcd
@ -51,25 +33,50 @@ The values are specified in milliseconds.
## Snapshots
etcd appends all key changes to a log file.
This log grows forever and is a complete linear history of every change made to the keys.
A complete history works well for lightly used clusters but clusters that are heavily used would carry around a large log.
etcd appends all key changes to a log file. This log grows forever and is a complete linear history of every change made to the keys. A complete history works well for lightly used clusters but clusters that are heavily used would carry around a large log.
To avoid having a huge log etcd makes periodic snapshots.
These snapshots provide a way for etcd to compact the log by saving the current state of the system and removing old logs.
To avoid having a huge log etcd makes periodic snapshots. These snapshots provide a way for etcd to compact the log by saving the current state of the system and removing old logs.
### Snapshot Tuning
### Snapshot tuning
Creating snapshots can be expensive so they're only created after a given number of changes to etcd.
By default, snapshots will be made after every 10,000 changes.
If etcd's memory usage and disk usage are too high, you can lower the snapshot threshold by setting the following on the command line:
Creating snapshots with the V2 backend can be expensive, so snapshots are only created after a given number of changes to etcd. By default, snapshots will be made after every 10,000 changes. If etcd's memory usage and disk usage are too high, try lowering the snapshot threshold by setting the following on the command line:
```sh
# Command line arguments:
$ etcd -snapshot-count=5000
$ etcd --snapshot-count=5000
# Environment variables:
$ ETCD_SNAPSHOT_COUNT=5000 etcd
```
## Disk
An etcd cluster is very sensitive to disk latencies. Since etcd must persist proposals to its log, disk activity from other processes may cause long `fsync` latencies. The upshot is etcd may miss heartbeats, causing request timeouts and temporary leader loss. An etcd server can sometimes stably run alongside these processes when given a high disk priority.
On Linux, etcd's disk priority can be configured with `ionice`:
```sh
# best effort, highest priority
$ sudo ionice -c2 -n0 -p `pgrep etcd`
```
## Network
If the etcd leader serves a large number of concurrent client requests, it may delay processing follower peer requests due to network congestion. This manifests as send buffer error messages on the follower nodes:
```
dropped MsgProp to 247ae21ff9436b2d since streamMsg's sending buffer is full
dropped MsgAppResp to 247ae21ff9436b2d since streamMsg's sending buffer is full
```
These errors may be resolved by prioritizing etcd's peer traffic over its client traffic. On Linux, peer traffic can be prioritized by using the traffic control mechanism:
```
tc qdisc add dev eth0 root handle 1: prio bands 3
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip sport 2380 0xffff flowid 1:1
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dport 2380 0xffff flowid 1:1
tc filter add dev eth0 parent 1: protocol ip prio 2 u32 match ip sport 2379 0xffff flowid 1:1
tc filter add dev eth0 parent 1: protocol ip prio 2 u32 match ip dport 2379 0xffff flowid 1:1
```
[ping]: https://en.wikipedia.org/wiki/Ping_(networking_utility)

@ -0,0 +1,129 @@
## Upgrade etcd from 2.3 to 3.0
In the general case, upgrading from etcd 2.3 to 3.0 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v2.3 processes and replace them with etcd v3.0 processes
- after running all v3.0 processes, new features in v3.0 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.0, the running cluster must be 2.3 or greater. If it's before 2.3, please upgrade to [2.3](https://github.com/coreos/etcd/releases/tag/v2.3.0) before upgrading to 3.0.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl cluster-health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version.
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.0. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
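As a rough illustration of the "lowest common version" rule, the Go sketch below derives a cluster version from a list of member versions by taking the smallest major.minor pair. It is a simplified model and not etcd's actual negotiation code; the function names `majorMinor` and `clusterVersion` are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorMinor parses a version such as "2.3.7" into (2, 3).
func majorMinor(v string) (int, int, error) {
	parts := strings.Split(v, ".")
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unexpected version %q", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, err
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

// clusterVersion returns the lowest major.minor among the members,
// which models the version a mixed cluster operates at.
func clusterVersion(members []string) (string, error) {
	if len(members) == 0 {
		return "", fmt.Errorf("no members")
	}
	lowMajor, lowMinor := 0, 0
	for i, m := range members {
		major, minor, err := majorMinor(m)
		if err != nil {
			return "", err
		}
		if i == 0 || major < lowMajor || (major == lowMajor && minor < lowMinor) {
			lowMajor, lowMinor = major, minor
		}
	}
	return fmt.Sprintf("%d.%d", lowMajor, lowMinor), nil
}

func main() {
	// One member already upgraded, two still on v2.3.
	v, err := clusterVersion([]string{"3.0.0", "2.3.7", "2.3.7"})
	fmt.Println(v, err)
}
```

With one member on 3.0.0 and two on 2.3.7, the sketch reports 2.3, and it reports 3.0 only once every member runs a 3.0 binary.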
#### Limitations
It might take up to 2 minutes for the newly upgraded member to catch up with the existing cluster when the total data size is larger than 50MB. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more, this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.0, the cluster will be upgraded to v3.0, and downgrade from this completed state is **not possible**. If any single member is still v2.3, however, the cluster and its operations remain "v2.3", and it is possible from this mixed cluster state to return to using a v2.3 etcd binary on all members.
Please [backup the data directory](../v2/admin_guide.md#backing-up-the-datastore) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example details the upgrade of a three-member v2.3 etcd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v2.3.x?
```
$ etcdctl cluster-health
member 6e3bd23ae5f1eae0 is healthy: got healthy result from http://localhost:22379
member 924e2e83e93f2560 is healthy: got healthy result from http://localhost:32379
member 8211f1d0f64f3269 is healthy: got healthy result from http://localhost:12379
cluster is healthy
$ curl http://localhost:2379/version
{"etcdserver":"2.3.x","etcdcluster":"2.3.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
2016-06-27 15:21:48.624124 E | rafthttp: failed to dial 8211f1d0f64f3269 on stream Message (dial tcp 127.0.0.1:12380: getsockopt: connection refused)
2016-06-27 15:21:48.624175 I | rafthttp: the connection with 8211f1d0f64f3269 became inactive
```
It's a good idea at this point to [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore) to provide a downgrade path should any problems occur:
```
$ etcdctl backup \
--data-dir /var/lib/etcd \
--backup-dir /tmp/etcd_backup
```
#### 3. Drop-in etcd v3.0 binary and start the new etcd process
The new v3.0 etcd will publish its information to the cluster:
```
09:58:25.938673 I | etcdserver: published {Name:infra1 ClientURLs:[http://localhost:12379]} to cluster 524400597fb1d5f6
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.0 etcd binary:
```
$ etcdctl cluster-health
member 6e3bd23ae5f1eae0 is healthy: got healthy result from http://localhost:22379
member 924e2e83e93f2560 is healthy: got healthy result from http://localhost:32379
member 8211f1d0f64f3269 is healthy: got healthy result from http://localhost:12379
cluster is healthy
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.0:
```
2016-06-27 15:22:05.679644 W | etcdserver: the local etcd version 2.3.7 is not up-to-date
2016-06-27 15:22:05.679660 W | etcdserver: member 8211f1d0f64f3269 has a higher version 3.0.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.0 successfully:
```
2016-06-27 15:22:19.873751 N | membership: updated the cluster version from 2.3 to 3.0
2016-06-27 15:22:19.914574 I | api: enabled capabilities for version 3.0.0
```
```
$ ETCDCTL_API=3 etcdctl endpoint health
127.0.0.1:12379 is healthy: successfully committed proposal: took = 18.440155ms
127.0.0.1:32379 is healthy: successfully committed proposal: took = 13.651368ms
127.0.0.1:22379 is healthy: successfully committed proposal: took = 18.513301ms
```
## Further considerations
- etcdctl environment variables have been updated. If `ETCDCTL_API=2 etcdctl cluster-health` works properly but `ETCDCTL_API=3 etcdctl endpoint health` responds with `Error: grpc: timed out when dialing`, be sure to use the [new variable names](https://github.com/coreos/etcd/tree/master/etcdctl#etcdctl).
## Known issues
- etcd < v3.1 does not work properly if built with Go > v1.7. See [Issue 6951](https://github.com/coreos/etcd/issues/6951) for additional information.
- If an error such as `transport: http2Client.notifyError got notified that the client transport was broken unexpected EOF.` shows up in the etcd server logs, be sure etcd is a pre-built release or built with (etcd v3.1+ & go v1.7+) or (etcd <v3.1 & go v1.6.x).
- Adding a v3 node to a v2.3 cluster during upgrades is not supported and could trigger panics. See [Issue 7429](https://github.com/coreos/etcd/issues/7429) for additional information. Mixed versions of etcd members are only allowed during v3 migration. Finish upgrades before making any membership changes.
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

@ -0,0 +1,123 @@
## Upgrade etcd from 3.0 to 3.1
In the general case, upgrading from etcd 3.0 to 3.1 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.0 processes and replace them with etcd v3.1 processes
- after running all v3.1 processes, new features in v3.1 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.1, the running cluster must be 3.0 or greater. If it's before 3.0, please upgrade to [3.0](https://github.com/coreos/etcd/releases/tag/v3.0.16) before upgrading to 3.1.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.1. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more, this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.1, the cluster will be upgraded to v3.1, and downgrade from this completed state is **not possible**. If any single member is still v3.0, however, the cluster and its operations remain "v3.0", and it is possible from this mixed cluster state to return to using a v3.0 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.0 etcd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.0.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.0.16","etcdcluster":"3.0.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
2017-01-17 09:34:18.352662 I | raft: raft.node: 1640829d9eea5cfb elected leader 1640829d9eea5cfb at term 5
2017-01-17 09:34:18.359630 W | etcdserver: failed to reach the peerURL(http://localhost:2380) of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
2017-01-17 09:34:18.359679 W | etcdserver: cannot get the version of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
2017-01-17 09:34:18.548116 W | rafthttp: lost the TCP streaming connection with peer fd32987dcd0511e0 (stream Message writer)
2017-01-17 09:34:19.147816 W | rafthttp: lost the TCP streaming connection with peer fd32987dcd0511e0 (stream MsgApp v2 writer)
2017-01-17 09:34:34.364907 W | etcdserver: failed to reach the peerURL(http://localhost:2380) of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.1 binary and start the new etcd process
The new v3.1 etcd will publish its information to the cluster:
```
2017-01-17 09:36:00.996590 I | etcdserver: published {Name:my-etcd-1 ClientURLs:[http://localhost:2379]} to cluster 46bc3ce73049e678
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.1 etcd binary:
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321671ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.1:
```
2017-01-17 09:36:38.406268 W | etcdserver: the local etcd version 3.0.16 is not up-to-date
2017-01-17 09:36:38.406295 W | etcdserver: member fd32987dcd0511e0 has a higher version 3.1.0
2017-01-17 09:36:42.407695 W | etcdserver: the local etcd version 3.0.16 is not up-to-date
2017-01-17 09:36:42.407730 W | etcdserver: member fd32987dcd0511e0 has a higher version 3.1.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.1 successfully:
```
2017-01-17 09:37:03.100015 I | etcdserver: updating the cluster version from 3.0 to 3.1
2017-01-17 09:37:03.104263 N | etcdserver/membership: updated the cluster version from 3.0 to 3.1
2017-01-17 09:37:03.104374 I | etcdserver/api: enabled capabilities for version 3.1
```
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.516902ms
```
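When scripting a rolling upgrade, the health output above can be checked mechanically before moving on to the next member. The Go sketch below is an assumption-laden illustration: it relies only on the `is healthy` phrase shown in the output above, the function name `unhealthyEndpoints` is hypothetical, and the unhealthy line in the example is made up for demonstration.

```go
package main

import (
	"fmt"
	"strings"
)

// unhealthyEndpoints scans `etcdctl endpoint health` output and returns
// the first field (the endpoint) of every non-empty line that does not
// contain the "is healthy" phrase.
func unhealthyEndpoints(output string) []string {
	var bad []string
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		if !strings.Contains(line, " is healthy") {
			bad = append(bad, strings.Fields(line)[0])
		}
	}
	return bad
}

func main() {
	// The second line is a made-up example of a failing endpoint.
	out := "localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms\n" +
		"localhost:22379 is unhealthy: hypothetical failure message\n"
	fmt.Println(unhealthyEndpoints(out))
}
```

A script would typically proceed to the next member only when the returned list is empty.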
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

@ -0,0 +1,172 @@
## Upgrade etcd from 3.1 to 3.2
In the general case, upgrading from etcd 3.1 to 3.2 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.1 processes and replace them with etcd v3.2 processes
- after running all v3.2 processes, new features in v3.2 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Client upgrade checklists
3.2 introduces two breaking changes.
Previously, `clientv3.Lease.TimeToLive` API returned `lease.ErrLeaseNotFound` on non-existent lease ID. 3.2 instead returns TTL=-1 in its response and no error (see [#7305](https://github.com/coreos/etcd/pull/7305)).
Before
```go
// when leaseID does not exist
resp, err := TimeToLive(ctx, leaseID)
resp == nil
err == lease.ErrLeaseNotFound
```
After
```go
// when leaseID does not exist
resp, err := TimeToLive(ctx, leaseID)
resp.TTL == -1
err == nil
```
`clientv3.NewFromConfigFile` is moved to `yaml.NewConfig`.
Before
```go
import "github.com/coreos/etcd/clientv3"
clientv3.NewFromConfigFile
```
After
```go
import clientv3yaml "github.com/coreos/etcd/clientv3/yaml"
clientv3yaml.NewConfig
```
### Server upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.2, the running cluster must be 3.1 or greater. If it's before 3.1, please upgrade to [3.1](https://github.com/coreos/etcd/releases/tag/v3.1.7) before upgrading to 3.2.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.2. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more, this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.2, the cluster will be upgraded to v3.2, and downgrade from this completed state is **not possible**. If any single member is still v3.1, however, the cluster and its operations remain "v3.1", and it is possible from this mixed cluster state to return to using a v3.1 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.1 etcd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.1.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.1.7","etcdcluster":"3.1.0"}
```
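The version requirement can also be checked programmatically from the `/version` response shown above. The Go sketch below parses that JSON body and compares the `etcdcluster` field against a minimum major.minor. The type and function names (`versionInfo`, `clusterAtLeast`) are hypothetical, and fetching the body over HTTP is left out to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// versionInfo mirrors the two fields of the /version response.
type versionInfo struct {
	Server  string `json:"etcdserver"`
	Cluster string `json:"etcdcluster"`
}

// clusterAtLeast reports whether the cluster version in body is at
// least major.minor.
func clusterAtLeast(body []byte, major, minor int) (bool, error) {
	var v versionInfo
	if err := json.Unmarshal(body, &v); err != nil {
		return false, err
	}
	parts := strings.Split(v.Cluster, ".")
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected cluster version %q", v.Cluster)
	}
	gotMajor, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, err
	}
	gotMinor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, err
	}
	return gotMajor > major || (gotMajor == major && gotMinor >= minor), nil
}

func main() {
	body := []byte(`{"etcdserver":"3.1.7","etcdcluster":"3.1.0"}`)
	ok, err := clusterAtLeast(body, 3, 1)
	fmt.Println(ok, err)
}
```

A cluster reporting 3.0.0 would fail this check and, per the requirements above, should be upgraded to 3.1 first.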
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
2017-04-27 14:13:31.491746 I | raft: c89feb932daef420 [term 3] received MsgTimeoutNow from 6d4f535bae3ab960 and starts an election to get leadership.
2017-04-27 14:13:31.491769 I | raft: c89feb932daef420 became candidate at term 4
2017-04-27 14:13:31.491788 I | raft: c89feb932daef420 received MsgVoteResp from c89feb932daef420 at term 4
2017-04-27 14:13:31.491797 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.491805 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 9eda174c7df8a033 at term 4
2017-04-27 14:13:31.491815 I | raft: raft.node: c89feb932daef420 lost leader 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.524084 I | raft: c89feb932daef420 received MsgVoteResp from 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.524108 I | raft: c89feb932daef420 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
2017-04-27 14:13:31.524123 I | raft: c89feb932daef420 became leader at term 4
2017-04-27 14:13:31.524136 I | raft: raft.node: c89feb932daef420 elected leader c89feb932daef420 at term 4
2017-04-27 14:13:31.592650 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream MsgApp v2 reader)
2017-04-27 14:13:31.592825 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message reader)
2017-04-27 14:13:31.693275 E | rafthttp: failed to dial 6d4f535bae3ab960 on stream Message (dial tcp [::1]:2380: getsockopt: connection refused)
2017-04-27 14:13:31.693289 I | rafthttp: peer 6d4f535bae3ab960 became inactive
2017-04-27 14:13:31.936678 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message writer)
```
It's a good idea at this point to [back up the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ ETCDCTL_API=3 etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.2 binary and start the new etcd process
The new v3.2 etcd will publish its information to the cluster:
```
2017-04-27 14:14:25.363225 I | etcdserver: published {Name:s1 ClientURLs:[http://localhost:2379]} to cluster a9ededbffcb1b1f1
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.2 etcd binary:
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321771ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Members that have not yet been upgraded will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.2:
```
2017-04-27 14:15:17.071804 W | etcdserver: member c89feb932daef420 has a higher version 3.2.0
2017-04-27 14:15:21.073110 W | etcdserver: the local etcd version 3.1.7 is not up-to-date
2017-04-27 14:15:21.073142 W | etcdserver: member 6d4f535bae3ab960 has a higher version 3.2.0
2017-04-27 14:15:21.073157 W | etcdserver: the local etcd version 3.1.7 is not up-to-date
2017-04-27 14:15:21.073164 W | etcdserver: member c89feb932daef420 has a higher version 3.2.0
```
#### 4. Repeat step 2 to step 3 for all other members
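When repeating the steps for each member, it can help to wait until the restarted member reports healthy before moving on to the next one. The sketch below is one possible way to do that; the function name `wait_healthy` and its retry defaults are hypothetical, and it assumes `etcdctl endpoint health` prints an `is healthy` line in the format shown in step 3.

```sh
# Hypothetical helper: poll one endpoint until `etcdctl endpoint health`
# reports it healthy, giving up after a fixed number of attempts.
# Usage: wait_healthy ENDPOINT [ATTEMPTS] [PAUSE_SECONDS]
# Assumption: a healthy endpoint produces a line containing "is healthy",
# as in the sample output in this guide.
wait_healthy() {
  endpoint=$1
  attempts=${2:-30}
  pause=${3:-2}
  while [ "$attempts" -gt 0 ]; do
    if ETCDCTL_API=3 etcdctl endpoint health --endpoints="$endpoint" 2>&1 \
        | grep -q "is healthy"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$pause"
  done
  return 1
}
```

For example, `wait_healthy localhost:22379` could be run after restarting the second member and before stopping the third.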
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.2 successfully:
```
2017-04-27 14:15:54.536901 N | etcdserver/membership: updated the cluster version from 3.1 to 3.2
2017-04-27 14:15:54.537035 I | etcdserver/api: enabled capabilities for version 3.2
```
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.517902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

Documentation/v2/README.md Normal file

@ -0,0 +1,165 @@
# etcd2
[![Go Report Card](https://goreportcard.com/badge/github.com/coreos/etcd)](https://goreportcard.com/report/github.com/coreos/etcd)
[![Build Status](https://travis-ci.org/coreos/etcd.svg?branch=master)](https://travis-ci.org/coreos/etcd)
[![Build Status](https://semaphoreci.com/api/v1/coreos/etcd/branches/master/shields_badge.svg)](https://semaphoreci.com/coreos/etcd)
[![Docker Repository on Quay.io](https://quay.io/repository/coreos/etcd-git/status "Docker Repository on Quay.io")](https://quay.io/repository/coreos/etcd-git)
**Note**: The `master` branch may be in an *unstable or even broken state* during development. Please use [releases][github-release] instead of the `master` branch in order to get stable binaries.
![etcd Logo](../../logos/etcd-horizontal-color.png)
etcd is a distributed, consistent key-value store for shared configuration and service discovery, with a focus on being:
* *Simple*: curl'able user-facing API (HTTP+JSON)
* *Secure*: optional SSL client cert authentication
* *Fast*: benchmarked 1000s of writes/s per instance
* *Reliable*: properly distributed using Raft
etcd is written in Go and uses the [Raft][raft] consensus algorithm to manage a highly-available replicated log.
etcd is used [in production by many companies](./production-users.md), and the development team stands behind it in critical deployment scenarios, where etcd is frequently teamed with applications such as [Kubernetes][k8s], [fleet][fleet], [locksmith][locksmith], [vulcand][vulcand], and many others.
See [etcdctl][etcdctl] for a simple command line client.
Or feel free to just use `curl`, as in the examples below.
[raft]: https://raft.github.io/
[k8s]: http://kubernetes.io/
[fleet]: https://github.com/coreos/fleet
[locksmith]: https://github.com/coreos/locksmith
[vulcand]: https://github.com/vulcand/vulcand
[etcdctl]: https://github.com/coreos/etcd/tree/master/etcdctl
## Getting Started
### Getting etcd
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, AppC (ACI), and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
To try the very latest version, build etcd from the `master` branch.
This requires [*Go*](https://golang.org/) (version 1.5 or later) installed on your machine.
All development occurs on `master`, including new features and bug fixes.
Bug fixes are first targeted at `master` and subsequently ported to release branches, as described in the [branch management][branch-management] guide.
[github-release]: https://github.com/coreos/etcd/releases/
[branch-management]: branch_management.md
### Running etcd
First start a single-member cluster of etcd:
```sh
./bin/etcd
```
This will bring up etcd listening on port 2379 for client communication and on port 2380 for server-to-server communication.
Next, let's set a single key, and then retrieve it:
```
curl -L http://127.0.0.1:2379/v2/keys/mykey -XPUT -d value="this is awesome"
curl -L http://127.0.0.1:2379/v2/keys/mykey
```
You have successfully started an etcd and written a key to the store.
### etcd TCP ports
The [official etcd ports][iana-ports] are 2379 for client requests, and 2380 for peer communication. To maintain compatibility, some etcd configuration and documentation continues to refer to the legacy ports 4001 and 7001, but all new etcd use and discussion should adopt the IANA-assigned ports. The legacy ports 4001 and 7001 will be fully deprecated, and support for their use removed, in future etcd releases.
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
### Running local etcd cluster
First install [goreman](https://github.com/mattn/goreman), which manages Procfile-based applications.
Our [Procfile script](../../V2Procfile) will set up a local example cluster. You can start it with:
```sh
goreman start
```
This will bring up 3 etcd members `infra1`, `infra2` and `infra3` and etcd proxy `proxy`, which run locally and compose a cluster.
You can write a key to the cluster and retrieve the value back from any member or proxy.
### Next Steps
Now it's time to dig into the full etcd API and other guides.
- Explore the full [API][api].
- Set up a [multi-machine cluster][clustering].
- Learn the [config format, env variables and flags][configuration].
- Find [language bindings and tools][libraries-and-tools].
- Use TLS to [secure an etcd cluster][security].
- [Tune etcd][tuning].
- [Upgrade from 0.4.9+ to 2.2.0][upgrade].
[api]: ./api.md
[clustering]: ./clustering.md
[configuration]: ./configuration.md
[libraries-and-tools]: ./libraries-and-tools.md
[security]: ./security.md
[tuning]: ./tuning.md
[upgrade]: ./04_to_2_snapshot_migration.md
## Contact
- Mailing list: [etcd-dev](https://groups.google.com/forum/?hl=en#!forum/etcd-dev)
- IRC: #[etcd](irc://irc.freenode.org:6667/#etcd) on freenode.org
- Planning/Roadmap: [milestones](https://github.com/coreos/etcd/milestones), [roadmap](../../ROADMAP.md)
- Bugs: [issues](https://github.com/coreos/etcd/issues)
## Contributing
See [CONTRIBUTING](../../CONTRIBUTING.md) for details on submitting patches and the contribution workflow.
## Reporting bugs
See [reporting bugs](reporting_bugs.md) for details about reporting any issue you may encounter.
## Known bugs
[GH518](https://github.com/coreos/etcd/issues/518) is a known bug. The issue arises with a sequence such as:
```
curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar
curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d dir=true -d prevExist=true
```
If the previous node is a key and a client tries to overwrite it with `dir=true`, etcd does not return an error such as `Not a directory`. Instead, the key is set to an empty value.
## Project Details
### Versioning
#### Service Versioning
etcd uses [semantic versioning](http://semver.org).
New minor versions may add additional features to the API.
You can get the version of etcd by issuing a request to /version:
```sh
curl -L http://127.0.0.1:2379/version
```
#### API Versioning
The `v2` API responses should not change after the 2.0.0 release but new features will be added over time.
#### 32-bit and other unsupported systems
etcd has known issues on 32-bit systems due to a bug in the Go runtime. See #[358][358] for more information.
To avoid inadvertently running a possibly unstable etcd server, `etcd` on unsupported architectures will print
a warning message and immediately exit if the environment variable `ETCD_UNSUPPORTED_ARCH` is not set to
the target architecture.
Currently only the amd64 architecture is officially supported by `etcd`.
[358]: https://github.com/coreos/etcd/issues/358
### License
etcd is under the Apache 2.0 license. See the [LICENSE](../../LICENSE) file for details.


@ -113,7 +113,8 @@ It is recommended to have an odd number of members in a cluster. Having an odd c
| Cluster Size | Majority | Failure Tolerance |
|--------------|------------|-------------------|
| 1 | 1 | 0 |
| 3 | 2 | 1 |
| 2 | 2 | 0 |
| 3 | 2 | **1** |
| 4 | 3 | 1 |
| 5 | 3 | **2** |
| 6 | 4 | 2 |
@ -135,7 +136,7 @@ The data directory contains all the data to recover a member to its point-in-tim
* Stop the member process.
* Copy the data directory of the now-idle member to the new machine.
* Update the peer URLs for the replaced member to reflect the new machine according to the [runtime reconfiguration instructions][update-member].
* Update the peer URLs for the replaced member to reflect the new machine according to the [runtime reconfiguration instructions][update-a-member].
* Start etcd on the new machine, using the same configuration and the copy of the data directory.
This example will walk you through the process of migrating the infra1 member to a new machine:
@ -215,14 +216,16 @@ To recover from such scenarios, etcd provides functionality to backup and restor
#### Backing up the datastore
**NB:** Windows users must stop etcd before running the backup command.
**Note:** Windows users must stop etcd before running the backup command.
The first step of the recovery is to backup the data directory on a functioning etcd node. To do this, use the `etcdctl backup` command, passing in the original data directory used by etcd. For example:
The first step of the recovery is to backup the data directory and wal directory, if stored separately, on a functioning etcd node. To do this, use the `etcdctl backup` command, passing in the original data (and wal) directory used by etcd. For example:
```sh
etcdctl backup \
--data-dir %data_dir% \
[--wal-dir %wal_dir%] \
--backup-dir %backup_data_dir% \
[--backup-wal-dir %backup_wal_dir%]
```
This command will rewrite some of the metadata contained in the backup (specifically, the node ID and cluster ID), which means that the node will lose its former identity. In order to recreate a cluster from the backup, you will need to start a new, single-node cluster. The metadata is rewritten to prevent the new node from inadvertently being joined onto an existing cluster.
@ -234,28 +237,34 @@ To restore a backup using the procedure created above, start etcd with the `-for
```sh
etcd \
-data-dir=%backup_data_dir% \
[-wal-dir=%backup_wal_dir%] \
-force-new-cluster \
...
```
Now etcd should be available on this node and serving the original datastore.
Once you have verified that etcd has started successfully, shut it down and move the data back to the previous location (you may wish to make another copy as well to be safe):
Once you have verified that etcd has started successfully, shut it down and move the data and wal, if stored separately, back to the previous location (you may wish to make another copy as well to be safe):
```sh
pkill etcd
rm -fr %data_dir%
rm -fr %wal_dir%
mv %backup_data_dir% %data_dir%
mv %backup_wal_dir% %wal_dir%
etcd \
-data-dir=%data_dir% \
[-wal-dir=%wal_dir%] \
...
```
#### Restoring the cluster
Now that the node is running successfully, [change its advertised peer URLs][update-member], as the `--force-new-cluster` option has set the peer URL to the default listening on localhost.
Now that the node is running successfully, [change its advertised peer URLs][update-a-member], as the `--force-new-cluster` option has set the peer URL to the default listening on localhost.
You can then add more nodes to the cluster and restore resiliency. See the [add a new member][add-a-member] guide for more details. **NB:** If you are trying to restore your cluster using old failed etcd nodes, please make sure you have stopped old etcd instances and removed their old data directories specified by the data-dir configuration parameter.
You can then add more nodes to the cluster and restore resiliency. See the [add a new member][add-a-member] guide for more details.
**Note:** If you are trying to restore your cluster using old failed etcd nodes, please make sure you have stopped old etcd instances and removed their old data directories specified by the data-dir configuration parameter.
### Client Request Timeout


@ -233,10 +233,11 @@ curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -d ttl= -d prevExist=t
### Refreshing key TTL
Keys in etcd can be refreshed without notifying watchers
this can be achieved by setting the refresh to true when updating a TTL
Keys in etcd can be refreshed without notifying current watchers.
You cannot update the value of a key when refreshing it
This can be achieved by setting the refresh to true when updating a TTL.
You cannot update the value of a key when refreshing it.
```sh
curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -d ttl=5
@ -558,6 +559,25 @@ Let's create a key-value pair first: `foo=one`.
curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value=one
```
```json
{
"action":"set",
"node":{
"key":"/foo",
"value":"one",
"modifiedIndex":4,
"createdIndex":4
}
}
```
Specifying the `noValueOnSuccess` option omits the node from the response.
```sh
curl http://127.0.0.1:2379/v2/keys/foo?noValueOnSuccess=true -XPUT -d value=one
# {"action":"set"}
```
Now let's try some invalid `CompareAndSwap` commands.
Trying to set this existing key with `prevExist=false` fails as expected:


@ -18,7 +18,7 @@ A keys lifetime spans a generation. Each key may have one or multiple generat
### Physical View
etcd stores the physical data as key-value pairs in a persistent [b+tree][b+tree]. Each revision of the store's state only contains the delta from its previous revision to be efficient. A single revision may correspond to multiple keys in the tree.
etcd stores the physical data as key-value pairs in a persistent [b+tree][b+tree]. Each revision of the store's state only contains the delta from its previous revision to be efficient. A single revision may correspond to multiple keys in the tree.
The key of a key-value pair is a 3-tuple (major, sub, type). Major is the store revision holding the key. Sub differentiates among keys within the same revision. Type is an optional suffix for special values (e.g., `t` if the value contains a tombstone). The value of the key-value pair contains the modification from the previous revision, that is, one delta from the previous revision. The b+tree is ordered by key in lexical byte-order. Ranged lookups over revision deltas are fast; this enables quickly finding modifications from one specific revision to another. Compaction removes out-of-date key-value pairs.
@ -47,7 +47,7 @@ An etcd operation is considered complete when it is committed through consensus,
#### revision
An etcd operation that modifies the key value store is assigned with a single increasing revision. A transaction operation might modifies the key value store multiple times, but only one revision is assigned. The revision attribute of a key value pair that modified by the operation has the same value as the revision of the operation. The revision can be used as a logical clock for key value store. A key value pair that has a larger revision is modified after a key value pair with a smaller revision. Two key value pairs that have the same revision are modified by an operation "concurrently".
An etcd operation that modifies the key value store is assigned a single increasing revision. A transaction operation might modify the key value store multiple times, but only one revision is assigned. The revision attribute of a key value pair that is modified by the operation has the same value as the revision of the operation. The revision can be used as a logical clock for the key value store. A key value pair that has a larger revision is modified after a key value pair with a smaller revision. Two key value pairs that have the same revision are modified by an operation "concurrently".
### Guarantees Provided
@ -73,7 +73,7 @@ Any completed operations are durable. All accessible data is also durable data.
#### Linearizability
Linearizability (also known as Atomic Consistency or External Consistency) is a consistency level between strict consistency and sequential consistency.
Linearizability (also known as Atomic Consistency or External Consistency) is a consistency level between strict consistency and sequential consistency.
For linearizability, suppose each operation receives a timestamp from a loosely synchronized global clock. Operations are linearized if and only if they always complete as though they were executed in a sequential order and each operation appears to complete in the order specified by the program. Likewise, if an operation's timestamp precedes another, that operation must also precede the other operation in the sequence.
@ -83,10 +83,10 @@ etcd does not ensure linearizability for watch operations. Users are expected to
etcd ensures linearizability for all other operations by default. Linearizability comes with a cost, however, because linearized requests must go through the Raft consensus process. To obtain lower latencies and higher throughput for read requests, clients can configure a request's consistency mode to `serializable`, which may access stale data with respect to quorum, but removes the performance penalty of linearized accesses' reliance on live consensus.
[persistent-ds]: [https://en.wikipedia.org/wiki/Persistent_data_structure]
[btree]: [https://en.wikipedia.org/wiki/B-tree]
[b+tree]: [https://en.wikipedia.org/wiki/B%2B_tree]
[seq_consistency]: [https://en.wikipedia.org/wiki/Consistency_model#Sequential_consistency]
[strict_consistency]: [https://en.wikipedia.org/wiki/Consistency_model#Strict_consistency]
[serializable_isolation]: [https://en.wikipedia.org/wiki/Isolation_(database_systems)#Serializable]
[Linearizability]: [#Linearizability]
[persistent-ds]: https://en.wikipedia.org/wiki/Persistent_data_structure
[btree]: https://en.wikipedia.org/wiki/B-tree
[b+tree]: https://en.wikipedia.org/wiki/B%2B_tree
[seq_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Sequential_consistency
[strict_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Strict_consistency
[serializable_isolation]: https://en.wikipedia.org/wiki/Isolation_(database_systems)#Serializable
[Linearizability]: #linearizability


@ -145,8 +145,8 @@ GET/HEAD /v2/auth/users
"role": "root",
"permissions": {
"kv": {
"read": ["*"],
"write": ["*"]
"read": ["/*"],
"write": ["/*"]
}
}
}
@ -159,8 +159,8 @@ GET/HEAD /v2/auth/users
"role": "guest",
"permissions": {
"kv": {
"read": ["*"],
"write": ["*"]
"read": ["/*"],
"write": ["/*"]
}
}
}
@ -198,8 +198,8 @@ GET/HEAD /v2/auth/users/alice
"role": "etcd",
"permissions" : {
"kv" : {
"read": [ "*" ],
"write": [ "*" ]
"read": [ "/*" ],
"write": [ "/*" ]
}
}
}
@ -311,8 +311,8 @@ GET/HEAD /v2/auth/roles
"role": "etcd",
"permissions": {
"kv": {
"read": ["*"],
"write": ["*"]
"read": ["/*"],
"write": ["/*"]
}
}
},
@ -320,8 +320,8 @@ GET/HEAD /v2/auth/roles
"role": "quay",
"permissions": {
"kv": {
"read": ["*"],
"write": ["*"]
"read": ["/*"],
"write": ["/*"]
}
}
}
@ -393,7 +393,7 @@ PUT /v2/auth/roles/guest
"revoke" : {
"kv" : {
"write": [
"*"
"/*"
]
}
}


@ -18,7 +18,7 @@ The major flag changes are to mostly related to bootstrapping. The `initial-*` f
- `-peer-election-timeout` is replaced by `-election-timeout`.
The documentation of new command line flags can be found at
https://github.com/coreos/etcd/blob/master/Documentation/configuration.md.
https://github.com/coreos/etcd/blob/master/Documentation/v2/configuration.md.
## Data Directory Naming
@ -32,7 +32,7 @@ The consistent flag for read operations is removed in etcd 2.0.0. The normal rea
The read consistency guarantees are:
The consistent read guarantees the sequential consistency within one client that talks to one etcd server. Read/Write from one client to one etcd member should be observed in order. If one client writes a value to an etcd server successfully, it should be able to get the value out of the server immediately.
The consistent read guarantees the sequential consistency within one client that talks to one etcd server. Read/Write from one client to one etcd member should be observed in order. If one client writes a value to an etcd server successfully, it should be able to get the value out of the server immediately.
Each etcd member will proxy the request to the leader and only return the result to the user after the result is applied on the local member. Thus after the write succeeds, the user is guaranteed to see the value on the member it sent the request to.
@ -56,6 +56,7 @@ Proxy mode in 2.0 will provide similar functionality, and with improved control
## Discovery Service
A size key needs to be provided inside a [discovery token][discoverytoken].
[discoverytoken]: clustering.md#custom-etcd-discovery-service
## HTTP Admin API


@ -0,0 +1,18 @@
# Benchmarks
etcd benchmarks will be published regularly and tracked for each release below:
- [etcd v2.1.0-alpha][2.1]
- [etcd v2.2.0-rc][2.2]
- [etcd v3 demo][3.0]
# Memory Usage Benchmarks
These benchmarks record the expected memory usage in different scenarios.
- [etcd v2.2.0-rc][2.2-mem]
[2.1]: etcd-2-1-0-alpha-benchmarks.md
[2.2]: etcd-2-2-0-rc-benchmarks.md
[2.2-mem]: etcd-2-2-0-rc-memory-benchmarks.md
[3.0]: etcd-3-demo-benchmarks.md


@ -0,0 +1,52 @@
## Physical machines
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
- etcd version 2.1.0 alpha
## etcd Cluster
3 etcd members, each running on its own machine.
## Testing
Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
## Performance
### reading one single key
| key size in bytes | number of clients | target etcd server | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|----------|---------------|
| 64 | 1 | leader only | 1534 | 0.7 |
| 64 | 64 | leader only | 10125 | 9.1 |
| 64 | 256 | leader only | 13892 | 27.1 |
| 256 | 1 | leader only | 1530 | 0.8 |
| 256 | 64 | leader only | 10106 | 10.1 |
| 256 | 256 | leader only | 14667 | 27.0 |
| 64 | 64 | all servers | 24200 | 3.9 |
| 64 | 256 | all servers | 33300 | 11.8 |
| 256 | 64 | all servers | 24800 | 3.9 |
| 256 | 256 | all servers | 33000 | 11.5 |
### writing one single key
| key size in bytes | number of clients | target etcd server | write QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|-----------|---------------|
| 64 | 1 | leader only | 60 | 21.4 |
| 64 | 64 | leader only | 1742 | 46.8 |
| 64 | 256 | leader only | 3982 | 90.5 |
| 256 | 1 | leader only | 58 | 20.3 |
| 256 | 64 | leader only | 1770 | 47.8 |
| 256 | 256 | leader only | 4157 | 105.3 |
| 64 | 64 | all servers | 1028 | 123.4 |
| 64 | 256 | all servers | 3260 | 123.8 |
| 256 | 64 | all servers | 1033 | 121.5 |
| 256 | 256 | all servers | 3061 | 119.3 |
[boom]: https://github.com/rakyll/boom
[hack-benchmark]: ../../../hack/benchmark/


@ -0,0 +1,72 @@
# Benchmarking etcd v2.2.0
## Physical Machines
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted as etcd data directory
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
## etcd Cluster
3 etcd 2.2.0 members, each running on its own machine.
Detailed versions:
```
etcd Version: 2.2.0
Git SHA: e4561dd
Go Version: go1.5
Go OS/Arch: linux/amd64
```
## Testing
Bootstrap another machine, outside of the etcd cluster, and run the [`boom` HTTP benchmark tool][boom] with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions][hack] for the patch and the steps to reproduce our procedures.
The performance is calculated from the results of 100 benchmark rounds.
## Performance
### Single Key Read Performance
| key size in bytes | number of clients | target etcd server | average read QPS | read QPS stddev | average 90th Percentile Latency (ms) | latency stddev |
|-------------------|-------------------|--------------------|------------------|-----------------|--------------------------------------|----------------|
| 64 | 1 | leader only | 2303 | 200 | 0.49 | 0.06 |
| 64 | 64 | leader only | 15048 | 685 | 7.60 | 0.46 |
| 64 | 256 | leader only | 14508 | 434 | 29.76 | 1.05 |
| 256 | 1 | leader only | 2162 | 214 | 0.52 | 0.06 |
| 256 | 64 | leader only | 14789 | 792 | 7.69 | 0.48 |
| 256 | 256 | leader only | 14424 | 512 | 29.92 | 1.42 |
| 64 | 64 | all servers | 45752 | 2048 | 2.47 | 0.14 |
| 64 | 256 | all servers | 46592 | 1273 | 10.14 | 0.59 |
| 256 | 64 | all servers | 45332 | 1847 | 2.48 | 0.12 |
| 256 | 256 | all servers | 46485 | 1340 | 10.18 | 0.74 |
### Single Key Write Performance
| key size in bytes | number of clients | target etcd server | average write QPS | write QPS stddev | average 90th Percentile Latency (ms) | latency stddev |
|-------------------|-------------------|--------------------|------------------|-----------------|--------------------------------------|----------------|
| 64 | 1 | leader only | 55 | 4 | 24.51 | 13.26 |
| 64 | 64 | leader only | 2139 | 125 | 35.23 | 3.40 |
| 64 | 256 | leader only | 4581 | 581 | 70.53 | 10.22 |
| 256 | 1 | leader only | 56 | 4 | 22.37 | 4.33 |
| 256 | 64 | leader only | 2052 | 151 | 36.83 | 4.20 |
| 256 | 256 | leader only | 4442 | 560 | 71.59 | 10.03 |
| 64 | 64 | all servers | 1625 | 85 | 58.51 | 5.14 |
| 64 | 256 | all servers | 4461 | 298 | 89.47 | 36.48 |
| 256 | 64 | all servers | 1599 | 94 | 60.11 | 6.43 |
| 256 | 256 | all servers | 4315 | 193 | 88.98 | 7.01 |
## Performance Changes
- Because etcd now records metrics for each API call, read QPS performance seems to see a minor decrease in most scenarios. This minimal performance impact was judged a reasonable investment for the breadth of monitoring and debugging information returned.
- Write QPS to cluster leaders seems to be increased by a small margin. This is because the main loop and entry apply loops were decoupled in the etcd raft logic, eliminating several blocks between them.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.
[boom]: https://github.com/rakyll/boom
[hack]: ../../../hack/benchmark/


@ -0,0 +1,72 @@
## Physical machines
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
## etcd Cluster
3 etcd 2.2.0-rc members, each running on its own machine.
Detailed versions:
```
etcd Version: 2.2.0-alpha.1+git
Git SHA: 59a5a7e
Go Version: go1.4.2
Go OS/Arch: linux/amd64
```
For a performance baseline, we also form a cluster from 3 etcd 2.1.0 alpha-stage members. That build's commit head is at [c7146bd5][c7146bd5], the same commit used in the [etcd 2.1 benchmark][etcd-2.1-benchmark].
## Testing
Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
## Performance
### reading one single key
| key size in bytes | number of clients | target etcd server | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|----------|---------------|
| 64 | 1 | leader only | 2804 (-5%) | 0.4 (+0%) |
| 64 | 64 | leader only | 17816 (+0%) | 5.7 (-6%) |
| 64 | 256 | leader only | 18667 (-6%) | 20.4 (+2%) |
| 256 | 1 | leader only | 2181 (-15%) | 0.5 (+25%) |
| 256 | 64 | leader only | 17435 (-7%) | 6.0 (+9%) |
| 256 | 256 | leader only | 18180 (-8%) | 21.3 (+3%) |
| 64 | 64 | all servers | 46965 (-4%) | 2.1 (+0%) |
| 64 | 256 | all servers | 55286 (-6%) | 7.4 (+6%) |
| 256 | 64 | all servers | 46603 (-6%) | 2.1 (+5%) |
| 256 | 256 | all servers | 55291 (-6%) | 7.3 (+4%) |
### writing one single key
| key size in bytes | number of clients | target etcd server | write QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|-----------|---------------|
| 64 | 1 | leader only | 76 (+22%) | 19.4 (-15%) |
| 64 | 64 | leader only | 2461 (+45%) | 31.8 (-32%) |
| 64 | 256 | leader only | 4275 (+1%) | 69.6 (-10%) |
| 256 | 1 | leader only | 64 (+20%) | 16.7 (-30%) |
| 256 | 64 | leader only | 2385 (+30%) | 31.5 (-19%) |
| 256 | 256 | leader only | 4353 (-3%) | 74.0 (+9%) |
| 64 | 64 | all servers | 2005 (+81%) | 49.8 (-55%) |
| 64 | 256 | all servers | 4868 (+35%) | 81.5 (-40%) |
| 256 | 64 | all servers | 1925 (+72%) | 47.7 (-59%) |
| 256 | 256 | all servers | 4975 (+36%) | 70.3 (-36%) |
### performance changes explanation
- read QPS in most scenarios is decreased by 5~8%. The reason is that etcd records store metrics for each store operation. These metrics are important for monitoring and debugging, so this is acceptable.
- write QPS to leader is increased by 20~30%. This is because we decoupled the raft main loop and the entry apply loop, which keeps them from blocking each other.
- write QPS to all servers is increased by 30~80% because followers receive the latest commit index earlier and commit proposals faster.
[boom]: https://github.com/rakyll/boom
[c7146bd5]: https://github.com/coreos/etcd/commits/c7146bd5f2c73716091262edc638401bb8229144
[etcd-2.1-benchmark]: etcd-2-1-0-alpha-benchmarks.md
[hack-benchmark]: ../../../hack/benchmark/


@ -0,0 +1,47 @@
## Physical machine
GCE n1-standard-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 7.5 GB memory
- 2x CPUs
## etcd
```
etcd Version: 2.2.0-rc.0+git
Git SHA: 103cb5c
Go Version: go1.5
Go OS/Arch: linux/amd64
```
## Testing
Start a 3-member etcd cluster in which each member uses 2 cores.
Key names are always 64 bytes long, which is a reasonable approximation of the average key length.
## Memory Maximal Usage
- etcd may reach its maximal memory usage when one follower is dead and the leader keeps sending snapshots.
- `max RSS` is the maximal memory usage recorded in 3 runs.
| value bytes | key number | data size(MB) | max RSS(MB) | max RSS / data size ratio on leader |
|-------------|-------------|---------------|-------------|-----------------------------|
| 128 | 50000 | 6 | 433 | 72x |
| 128 | 100000 | 12 | 659 | 54x |
| 128 | 200000 | 24 | 1466 | 61x |
| 1024 | 50000 | 48 | 1253 | 26x |
| 1024 | 100000 | 96 | 2344 | 24x |
| 1024 | 200000 | 192 | 4361 | 22x |
## Data Size Threshold
- When etcd reaches the data size threshold, it may easily trigger leader elections and drop some proposals.
- In most cases, an etcd cluster should work smoothly as long as it stays below the threshold. If it does not work well because of insufficient resources, decrease its data size.
| value bytes | key number limitation | suggested data size threshold(MB) | consumed RSS(MB) |
|-------------|-----------------------|-----------------------------------|------------------|
| 128 | 400K | 48 | 2400 |
| 1024 | 300K | 292 | 6500 |


@ -0,0 +1,42 @@
## Physical machines
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
- etcd version 2.2.0
## etcd Cluster
1 etcd member running in v3 demo mode
## Testing
Use [etcd v3 benchmark tool][etcd-v3-benchmark].
## Performance
### reading one single key
| key size in bytes | number of clients | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|----------|---------------|
| 256 | 1 | 2716 | 0.4 |
| 256 | 64 | 16623 | 6.1 |
| 256 | 256 | 16622 | 21.7 |
The performance is nearly the same as with an empty server handler.
### reading one single key after putting
| key size in bytes | number of clients | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|----------|---------------|
| 256 | 1 | 2269 | 0.5 |
| 256 | 64 | 13582 | 8.6 |
| 256 | 256 | 13262 | 47.5 |
The performance with an empty server handler is not affected by one put, so the
performance degradation is most likely caused by the storage package.
[etcd-v3-benchmark]: ../../../tools/benchmark/


@ -0,0 +1,77 @@
# Watch Memory Usage Benchmark
*NOTE*: The watch features are under active development, and their memory usage may change as that development progresses. We do not expect it to significantly increase beyond the figures stated below.
A primary goal of etcd is supporting a very large number of watchers doing a massively large amount of watching. etcd aims to support O(10k) clients, O(100K) watch streams (O(10) streams per client) and O(10M) total watchings (O(100) watching per stream). The memory consumed by each individual watching accounts for the largest portion of etcd's overall usage, and is therefore the focus of current and future optimizations.
Three related components of etcd watch consume physical memory: each `grpc.Conn`, each watch stream, and each instance of the watching activity. `grpc.Conn` maintains the actual TCP connection and other gRPC connection state. Each `grpc.Conn` consumes O(10kb) of memory, and might have multiple watch streams attached.
Each watch stream is an independent HTTP2 connection which consumes another O(10kb) of memory.
Multiple watchings might share one watch stream.
Watching is the actual struct that tracks the changes on the key-value store. Each watching should only consume < O(1kb).
```
+-------+
| watch |
+---------> | foo |
| +-------+
+------+-----+
| stream |
+--------------> | |
| +------+-----+ +-------+
| | | watch |
| +---------> | bar |
+-----+------+ +-------+
| | +------------+
| conn +-------> | stream |
| | | |
+-----+------+ +------------+
|
|
|
| +------------+
+--------------> | stream |
| |
+------------+
```
The theoretical memory consumption of watch can be approximated with the formula:
`memory = c1 * number_of_conn + c2 * avg_number_of_stream_per_conn + c3 * avg_number_of_watch_stream`
## Testing Environment
etcd version
- git head https://github.com/coreos/etcd/commit/185097ffaa627b909007e772c175e8fefac17af3
GCE n1-standard-2 machine type
- 7.5 GB memory
- 2x CPUs
## Overall memory usage
The overall memory usage captures how much [RSS][rss] etcd consumes with the client watchers. While the result may vary by as much as 10%, it is still meaningful, since the goal is to learn about the rough memory usage and the pattern of allocations.
With the benchmark result, we can calculate roughly that `c1 = 17kb`, `c2 = 18kb` and `c3 = 350bytes`. So each additional client connection consumes 17kb of memory, each additional stream consumes 18kb of memory, and each additional watching consumes only 350bytes. A single etcd server can maintain millions of watchings with a few GB of memory in the normal case.
| clients | streams per client | watchings per stream | total watching | memory usage |
|---------|---------|-----------|----------------|--------------|
| 1k | 1 | 1 | 1k | 50MB |
| 2k | 1 | 1 | 2k | 90MB |
| 5k | 1 | 1 | 5k | 200MB |
| 1k | 10 | 1 | 10k | 217MB |
| 2k | 10 | 1 | 20k | 417MB |
| 5k | 10 | 1 | 50k | 980MB |
| 1k | 50 | 1 | 50k | 1001MB |
| 2k | 50 | 1 | 100k | 1960MB |
| 5k | 50 | 1 | 250k | 4700MB |
| 1k | 50 | 10 | 500k | 1171MB |
| 2k | 50 | 10 | 1M | 2371MB |
| 5k | 50 | 10 | 2.5M | 5710MB |
| 1k | 50 | 100 | 5M | 2380MB |
| 2k | 50 | 100 | 10M | 4672MB |
| 5k | 50 | 100 | 25M | *OOM* |
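
The per-unit costs above can be turned into a rough capacity estimate. The following is a minimal sketch, not etcd code: `EstimateWatchBytes` is a hypothetical helper, it reads the formula as per-unit cost times total count (connections, total streams, total watchings), and it assumes "kb" means 1024 bytes. Measured RSS in the table is somewhat higher than such an estimate because it also includes the server's baseline memory.

```go
package main

import "fmt"

// Per-unit costs reported by the benchmark, in bytes.
// Treating "kb" as 1024 bytes is an assumption of this sketch.
const (
	bytesPerConn   = 17 * 1024 // c1
	bytesPerStream = 18 * 1024 // c2
	bytesPerWatch  = 350       // c3
)

// EstimateWatchBytes is a hypothetical helper that approximates watch memory
// as c1*connections + c2*total streams + c3*total watchings.
func EstimateWatchBytes(conns, streamsPerConn, watchesPerStream uint64) uint64 {
	streams := conns * streamsPerConn
	watches := streams * watchesPerStream
	return conns*bytesPerConn + streams*bytesPerStream + watches*bytesPerWatch
}

func main() {
	// 1k clients, 10 streams per client, 1 watching per stream.
	est := EstimateWatchBytes(1000, 10, 1)
	fmt.Printf("estimated watch memory: %.0f MB\n", float64(est)/1e6)
}
```

The estimate is only useful for sizing; it does not model the out-of-memory behavior seen in the last row of the table.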
[rss]: https://en.wikipedia.org/wiki/Resident_set_size


@ -0,0 +1,98 @@
# Storage Memory Usage Benchmark
<!---todo: link storage to storage design doc-->
Two components of etcd storage consume physical memory. The etcd process allocates an *in-memory index* to speed key lookup. The process's *page cache*, managed by the operating system, stores recently-accessed data from disk for quick re-use.
The in-memory index holds all the keys in a [B-tree][btree] data structure, along with pointers to the on-disk data (the values). Each key in the B-tree may contain multiple pointers, pointing to different versions of its values. The theoretical memory consumption of the in-memory index can hence be approximated with the formula:
`N * (c1 + avg_key_size) + N * (avg_versions_of_key) * (c2 + size_of_pointer)`
where `c1` is the key metadata overhead and `c2` is the version metadata overhead.
The graph shows the detailed structure of the in-memory index B-tree.
```
In mem index
+------------+
| key || ... |
+--------------+ | || |
| | +------------+
| | | v1 || ... |
| disk <----------------| || | Tree Node
| | +------------+
| | | v2 || ... |
| <----------------+ || |
| | +------------+
+--------------+ +-----+ | | |
| | | | |
| +------------+
|
|
^
------+
| ... |
| |
+-----+
| ... | Tree Node
| |
+-----+
| ... |
| |
------+
```
[Page cache memory][pagecache] is managed by the operating system and is not covered in detail in this document.
## Testing Environment
etcd version
- git head https://github.com/coreos/etcd/commit/776e9fb7be7eee5e6b58ab977c8887b4fe4d48db
GCE n1-standard-2 machine type
- 7.5 GB memory
- 2x CPUs
## In-memory index memory usage
In this test, we only benchmark the memory usage of the in-memory index. The goal is to find `c1` and `c2` mentioned above and to understand the hard limit of memory consumption of the storage.
We measure memory consumption with Go's `runtime.ReadMemStats`, taking the difference in total allocated bytes before and after creating the index. This cannot perfectly reflect the memory usage of the in-memory index itself, but it shows the rough consumption pattern.
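
The measurement approach can be sketched as follows. This is an illustration under stated assumptions, not the benchmark's actual code: `allocatedDuring` is a hypothetical helper, and a plain Go map stands in for etcd's B-tree index, so the absolute numbers it prints will differ from the table below.

```go
package main

import (
	"fmt"
	"runtime"
)

// allocatedDuring returns how much the cumulative heap allocation counter
// (MemStats.TotalAlloc) grew while build ran.
func allocatedDuring(build func() interface{}) uint64 {
	var before, after runtime.MemStats
	runtime.GC() // start from a settled heap; optional for TotalAlloc
	runtime.ReadMemStats(&before)
	v := build()
	runtime.ReadMemStats(&after)
	runtime.KeepAlive(v) // keep the structure reachable until after the second read
	return after.TotalAlloc - before.TotalAlloc
}

func main() {
	const n = 100000
	delta := allocatedDuring(func() interface{} {
		// Stand-in index: 64-byte keys, one version pointer per key.
		idx := make(map[string][]int64, n)
		for i := 0; i < n; i++ {
			idx[fmt.Sprintf("%064d", i)] = []int64{int64(i)}
		}
		return idx
	})
	fmt.Printf("allocated about %d bytes for %d keys\n", delta, n)
}
```

Because `TotalAlloc` counts every allocation made while building, including temporaries that are later freed, the result is an upper-bound style figure, which matches the caveat that the method shows only the rough pattern.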
| N | versions | key size | memory usage |
|------|----------|----------|--------------|
| 100K | 1 | 64bytes | 22MB |
| 100K | 5 | 64bytes | 39MB |
| 1M | 1 | 64bytes | 218MB |
| 1M | 5 | 64bytes | 432MB |
| 100K | 1 | 256bytes | 41MB |
| 100K | 5 | 256bytes | 65MB |
| 1M | 1 | 256bytes | 409MB |
| 1M | 5 | 256bytes | 506MB |
Based on the result, we can calculate `c1=120bytes`, `c2=30bytes`. We only need two sets of data to calculate `c1` and `c2`, since they are the only unknown variables in the formula. The `c1=120bytes` and `c2=30bytes` figures are the average values of the 4 sets of `c1` and `c2` we calculated. The key metadata overhead is still relatively nontrivial (50%) for small key-value pairs. However, this is a significant improvement over the old store, which had at least 1000% overhead.
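
As a worked instance of the index formula, the sketch below plugs in the averaged constants. `EstimateIndexBytes` is a hypothetical helper, and the 8-byte pointer size is an assumption (a 64-bit platform); the document does not state `size_of_pointer`.

```go
package main

import "fmt"

// EstimateIndexBytes evaluates
//   N * (c1 + avg_key_size) + N * avg_versions_of_key * (c2 + size_of_pointer)
// with all sizes in bytes.
func EstimateIndexBytes(n, avgKeySize, avgVersions, c1, c2, ptrSize uint64) uint64 {
	keyPart := n * (c1 + avgKeySize)
	versionPart := n * avgVersions * (c2 + ptrSize)
	return keyPart + versionPart
}

func main() {
	const (
		c1      = 120 // key metadata overhead from the benchmark
		c2      = 30  // version metadata overhead from the benchmark
		ptrSize = 8   // assumed pointer size on a 64-bit platform
	)
	// 100K keys of 64 bytes with a single version each.
	est := EstimateIndexBytes(100000, 64, 1, c1, c2, ptrSize)
	fmt.Printf("estimated index size: %.1f MB\n", float64(est)/1e6)
}
```

For 100K keys of 64 bytes with one version, this evaluates to 22,200,000 bytes, in line with the 22MB row in the table. Other rows deviate more, which is expected given that `c1` and `c2` are averages.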
## Overall memory usage
The overall memory usage captures how much RSS etcd consumes with the storage. The value size should have very little impact on the overall memory usage of etcd, since we keep values on disk and only retain hot values in memory, managed by the OS page cache.
| N | versions | key size | value size | memory usage |
|------|----------|----------|------------|--------------|
| 100K | 1 | 64bytes | 256bytes | 40MB |
| 100K | 5 | 64bytes | 256bytes | 89MB |
| 1M | 1 | 64bytes | 256bytes | 470MB |
| 1M | 5 | 64bytes | 256bytes | 880MB |
| 100K | 1 | 64bytes | 1KB | 102MB |
| 100K | 5 | 64bytes | 1KB | 164MB |
| 1M | 1 | 64bytes | 1KB | 587MB |
| 1M | 5 | 64bytes | 1KB | 836MB |
Based on the result, we know the value size does not significantly impact the memory consumption. There is some minor increase due to more data held in the OS page cache.
[btree]: https://en.wikipedia.org/wiki/B-tree
[pagecache]: https://en.wikipedia.org/wiki/Page_cache


@ -0,0 +1,26 @@
# Branch Management
## Guide
* New development occurs on the [master branch][master].
* Master branch should always have a green build!
* Backwards-compatible bug fixes should target the master branch and subsequently be ported to stable branches.
* Once the master branch is ready for release, it will be tagged and become the new stable branch.
The etcd team has adopted a *rolling release model* and supports one stable version of etcd.
### Master branch
The `master` branch is our development branch. All new features land here first.
If you want to try new features, pull `master` and play with it. Note that `master` may not be stable because new features may introduce bugs.
Before the release of the next stable version, feature PRs will be frozen. We will focus on testing, bug fixes, and documentation for one to two weeks.
### Stable branches
All branches with prefix `release-` are considered _stable_ branches.
After every minor release (http://semver.org/), we will have a new stable branch for that release. We will keep fixing backwards-compatible bugs for the latest stable release, but not for previous releases. A _patch_ release incorporating any bug fixes will be cut once every two weeks, provided there are patches to release.
[master]: https://github.com/coreos/etcd/tree/master


@ -309,6 +309,7 @@ infra0.example.com. 300 IN A 10.0.1.10
infra1.example.com. 300 IN A 10.0.1.11
infra2.example.com. 300 IN A 10.0.1.12
```
#### Bootstrap the etcd cluster using DNS
etcd cluster members can listen on domain names or IP address, the bootstrap process will resolve DNS A records.
@ -422,7 +423,7 @@ To make understanding this feature easier, we changed the naming of some flags,
|-peers |none |Deprecated. The --initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
|-peers-file |none |Deprecated. The --initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
[client]: /client
[client]: ../../client
[client-discoverer]: https://godoc.org/github.com/coreos/etcd/client#Discoverer
[conf-adv-client]: configuration.md#-advertise-client-urls
[conf-listen-client]: configuration.md#-listen-client-urls


@ -39,7 +39,7 @@ To start etcd automatically using custom settings at startup in Linux, using a [
+ env variable: ETCD_HEARTBEAT_INTERVAL
### --election-timeout
+ Time (in milliseconds) for an election to timeout. See [Documentation/tuning.md](tuning.md#time-parameters) for details.
+ Time (in milliseconds) for an election to timeout. See [tuning.md](tuning.md#time-parameters) for details.
+ default: "1000"
+ env variable: ETCD_ELECTION_TIMEOUT
@ -176,7 +176,10 @@ To start etcd automatically using custom settings at startup in Linux, using a [
The security flags help to [build a secure etcd cluster][security].
### --ca-file [DEPRECATED]
### --ca-file
**DEPRECATED**
+ Path to the client server TLS CA file. `--ca-file ca.crt` could be replaced by `--trusted-ca-file ca.crt --client-cert-auth` and etcd will perform the same.
+ default: none
+ env variable: ETCD_CA_FILE
@ -201,7 +204,10 @@ The security flags help to [build a secure etcd cluster][security].
+ default: none
+ env variable: ETCD_TRUSTED_CA_FILE
### --peer-ca-file [DEPRECATED]
### --peer-ca-file
**DEPRECATED**
+ Path to the peer server TLS CA file. `--peer-ca-file ca.crt` could be replaced by `--peer-trusted-ca-file ca.crt --peer-client-cert-auth` and etcd will perform the same.
+ default: none
+ env variable: ETCD_PEER_CA_FILE
@ -234,7 +240,7 @@ The security flags help to [build a secure etcd cluster][security].
+ env variable: ETCD_DEBUG
### --log-package-levels
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ default: none (INFO for all packages)
+ env variable: ETCD_LOG_PACKAGE_LEVELS
@ -266,13 +272,13 @@ Follow the instructions when using these flags.
## Profiling flags
### --enable-pprof
+ Enable runtime profiling data via HTTP server. Address is at client URL + "/debug/pprof"
+ Enable runtime profiling data via HTTP server. Address is at client URL + "/debug/pprof/"
+ default: false
[build-cluster]: clustering.md#static
[reconfig]: runtime-configuration.md
[discovery]: clustering.md#discovery
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
[proxy]: proxy.md
[reconfig]: runtime-configuration.md
[restore]: admin_guide.md#restoring-a-backup


@ -6,11 +6,11 @@ The procedure includes some manual steps for sanity checking but it can probably
## Prepare Release
Set desired version as environment variable for following steps. Here is an example to release 2.3.0:
Set desired version as environment variable for following steps. Here is an example to release 2.1.3:
```
export VERSION=v2.3.0
export PREV_VERSION=v2.2.5
export VERSION=v2.1.3
export PREV_VERSION=v2.1.2
```
All releases version numbers follow the format of [semantic versioning 2.0.0](http://semver.org/).
@ -30,6 +30,7 @@ All releases version numbers follow the format of [semantic versioning 2.0.0](ht
## Write Release Note
- Write introduction for the new release. For example, what major bug we fix, what new features we introduce or what performance improvement we make.
- Write changelog for the last release. ChangeLog should be straightforward and easy to understand for the end-user.
- Put `[GH XXXX]` at the head of change line to reference Pull Request that introduces the change. Moreover, add a link on it to jump to the Pull Request.
@ -47,7 +48,7 @@ All releases version numbers follow the format of [semantic versioning 2.0.0](ht
## Build Release Binaries and Images
- Ensure `actool` is available, or installing it through `go get github.com/appc/spec/actool`.
- Ensure `acbuild` is available.
- Ensure `docker` is available.
Run release script in root directory:
@ -60,14 +61,16 @@ It generates all release binaries and images under directory ./release.
## Sign Binaries and Images
etcd project key must be used to sign the generated binaries and images.`$SUBKEYID` is the key ID of etcd project Yubikey. Connect the key and run `gpg2 --card-status` to get the ID.
Choose appropriate private key to sign the generated binaries and images.
The following commands are used for public release sign:
```
cd release
for i in etcd-*{.zip,.tar.gz}; do gpg2 --default-key $SUBKEYID --output ${i}.asc --detach-sign ${i}; done
for i in etcd-*{.zip,.tar.gz}; do gpg2 --verify ${i}.asc ${i}; done
# personal GPG is okay for now
for i in etcd-*{.zip,.tar.gz}; do gpg --sign ${i}; done
# use `CoreOS ACI Builder <release@coreos.com>` secret key
for aci in etcd-${VERSION}.*.aci; do gpg -u 88182190 -a --output ${aci}.asc --detach-sig ${aci}; done
```
## Publish Release Page in GitHub
@ -85,6 +88,7 @@ for i in etcd-*{.zip,.tar.gz}; do gpg2 --verify ${i}.asc ${i}; done
```
docker login quay.io
docker push quay.io/coreos/etcd:${VERSION}
docker push quay.io/coreos/etcd:${VERSION}-${arch}
```
- Add `latest` tag to the new image on [quay.io](https://quay.io/repository/coreos/etcd?tag=latest&tab=tags) if this is a stable release.


@ -16,7 +16,7 @@ This will run the latest release version of etcd. You can specify version if nee
```
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
--name etcd quay.io/coreos/etcd \
--name etcd quay.io/coreos/etcd:v2.3.8 \
-name etcd0 \
-advertise-client-urls http://${HostIP}:2379,http://${HostIP}:4001 \
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
@ -42,11 +42,13 @@ etcdctl -C http://192.168.12.50:4001 member list
Using Docker to setup a multi-node cluster is very similar to the standalone mode configuration.
The main difference being the value used for the `-initial-cluster` flag, which must contain the peer urls for each etcd member in the cluster.
**Although the following commands look very similar, note that `-name`, `-advertise-client-urls` and `-initial-advertise-peer-urls` differ for each cluster member**
### etcd0
```
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
--name etcd quay.io/coreos/etcd \
--name etcd quay.io/coreos/etcd:v2.3.8 \
-name etcd0 \
-advertise-client-urls http://192.168.12.50:2379,http://192.168.12.50:4001 \
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
@ -61,7 +63,7 @@ docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380
```
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
--name etcd quay.io/coreos/etcd \
--name etcd quay.io/coreos/etcd:v2.3.8 \
-name etcd1 \
-advertise-client-urls http://192.168.12.51:2379,http://192.168.12.51:4001 \
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
@ -76,7 +78,7 @@ docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380
```
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
--name etcd quay.io/coreos/etcd \
--name etcd quay.io/coreos/etcd:v2.3.8 \
-name etcd2 \
-advertise-client-urls http://192.168.12.52:2379,http://192.168.12.52:4001 \
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \


@ -0,0 +1,121 @@
### General cluster availability ###
# alert if another failed member will result in an unavailable cluster
ALERT InsufficientMembers
IF count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2 - 1)
FOR 3m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "etcd cluster insufficient members",
description = "If one more etcd member goes down the cluster will be unavailable",
}
### HTTP requests alerts ###
# alert if more than 1% of requests to an HTTP endpoint have failed with a non 4xx response
ALERT HighNumberOfFailedHTTPRequests
IF sum by(method) (rate(etcd_http_failed_total{job="etcd", code!~"4[0-9]{2}"}[5m]))
/ sum by(method) (rate(etcd_http_received_total{job="etcd"}[5m])) > 0.01
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "a high number of HTTP requests are failing",
description = "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}",
}
# alert if more than 5% of requests to an HTTP endpoint have failed with a non 4xx response
ALERT HighNumberOfFailedHTTPRequests
IF sum by(method) (rate(etcd_http_failed_total{job="etcd", code!~"4[0-9]{2}"}[5m]))
/ sum by(method) (rate(etcd_http_received_total{job="etcd"}[5m])) > 0.05
FOR 5m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "a high number of HTTP requests are failing",
description = "{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}",
}
# alert if 50% of requests get a 4xx response
ALERT HighNumberOfFailedHTTPRequests
IF sum by(method) (rate(etcd_http_failed_total{job="etcd", code=~"4[0-9]{2}"}[5m]))
/ sum by(method) (rate(etcd_http_received_total{job="etcd"}[5m])) > 0.5
FOR 10m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "a high number of HTTP requests are failing",
description = "{{ $value }}% of requests for {{ $labels.method }} failed with 4xx responses on etcd instance {{ $labels.instance }}",
}
# alert if the 99th percentile of HTTP requests take more than 150ms
ALERT HTTPRequestsSlow
IF histogram_quantile(0.99, rate(etcd_http_successful_duration_seconds_bucket[5m])) > 0.15
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "slow HTTP requests",
description = "on etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow",
}
### File descriptor alerts ###
instance:fd_utilization = process_open_fds / process_max_fds
# alert if file descriptors are likely to exhaust within the next 4 hours
ALERT FdExhaustionClose
IF predict_linear(instance:fd_utilization[1h], 3600 * 4) > 1
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "file descriptors soon exhausted",
description = "{{ $labels.job }} instance {{ $labels.instance }} will exhaust its file descriptors soon",
}
# alert if file descriptors are likely to exhaust within the next hour
ALERT FdExhaustionClose
IF predict_linear(instance:fd_utilization[10m], 3600) > 1
FOR 10m
LABELS {
severity = "critical"
}
ANNOTATIONS {
summary = "file descriptors soon exhausted",
description = "{{ $labels.job }} instance {{ $labels.instance }} will exhaust its file descriptors soon",
}
### etcd proposal alerts ###
# alert if there are several failed proposals within an hour
ALERT HighNumberOfFailedProposals
IF increase(etcd_server_proposal_failed_total{job="etcd"}[1h]) > 5
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "a high number of proposals within the etcd cluster are failing",
description = "etcd instance {{ $labels.instance }} has seen {{ $value }} proposal failures within the last hour",
}
### etcd disk io latency alerts ###
# alert if 99th percentile of fsync durations is higher than 500ms
ALERT HighFsyncDurations
IF histogram_quantile(0.99, rate(etcd_wal_fsync_durations_seconds_bucket[5m])) > 0.5
FOR 10m
LABELS {
severity = "warning"
}
ANNOTATIONS {
summary = "high fsync durations",
description = "etcd instance {{ $labels.instance }} fsync durations are high",
}

84
Documentation/v2/faq.md Normal file

@ -0,0 +1,84 @@
# FAQ
## 1) Why can an etcd client read an old version of data when a majority of the etcd cluster members are down?
In situations where a client connects to a minority, etcd
by default favors availability over consistency. This means that even though
data might be “out of date”, it is still better to return something than
nothing.
In order to confirm that a read is up to date with a majority of the cluster,
the client can use the `quorum=true` parameter on reads of keys. This means
that a majority of the cluster is checked on reads before returning the data;
otherwise the read will time out and fail.
## 2) With quorum=false, doesn't this mean that if my client switched the member it was connected to, it could experience a logical ordering where the cluster goes backwards in time?
Yes, but this could be handled at the etcd client implementation via
remembering the last seen index. The “index” is the cluster's single
irrevocable sequence of the entire modification history. The client could
remember the last seen index, and determine via comparing the index returned on
the GET whether or not the state of the key-value pair is before or after its
last seen state.
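
The client-side bookkeeping described above can be sketched in a few lines. This is a minimal illustration, not part of any etcd client library: `IndexTracker` is a hypothetical type, and how the client obtains the index from a response (for example from the `X-Etcd-Index` response header in the v2 API) is left to the caller.

```go
package main

import "fmt"

// IndexTracker is a hypothetical helper that remembers the highest index
// a client has observed so far.
type IndexTracker struct {
	lastSeen uint64
}

// Observe reports whether a response carrying index is at least as new as
// anything seen before. It advances the remembered index when it is, and
// leaves it unchanged when the response is stale.
func (t *IndexTracker) Observe(index uint64) bool {
	if index < t.lastSeen {
		return false // response comes from a member that is behind
	}
	t.lastSeen = index
	return true
}

func main() {
	var t IndexTracker
	// Simulated indexes returned by different members over time.
	for _, idx := range []uint64{10, 12, 11, 12} {
		fmt.Printf("index=%d fresh=%v\n", idx, t.Observe(idx))
	}
}
```

A client that sees `fresh=false` can discard the response and retry, possibly against another member or with `quorum=true`.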
## 3) What happens if a watch is registered on a minority member?
The watch will stay untriggered, even as modifications are occurring in the
majority quorum. This is an open issue, and is being addressed in v3. There are
multiple ways to work around the watch trigger not firing.
1) build a signaling mechanism independent of etcd. This could be as simple as
a “pulse” to the client to reissue a GET with quorum=true for the most recent
version of the data.
2) poll on the `/v2/keys` endpoint and check that the raft-index is increasing every
timeout.
## 4) What is a proxy used for?
A proxy is a redirection server to the etcd cluster. The proxy handles the
redirection of a client to the current configuration of the etcd cluster. A
typical use case is to start a proxy on a machine, and on first boot up of the
proxy specify both the `--proxy` flag and the `--initial-cluster` flag.
From there, any etcdctl client that starts up automatically speaks to the local
proxy and the proxy redirects operations to the current configuration of the
cluster it was originally paired with.
In the v2 spec of etcd, proxies cannot be promoted to members of the cluster.
They also cannot be promoted to followers or at any point become part of the
replication of the etcd cluster itself.
## 5) How is cluster membership and health handled in etcd v2?
The design goal of etcd is that reconfiguration is simply an API, and health
monitoring and addition/removal of members is up to the individual application
and their integration with the reconfiguration API.
Thus, a member that is down, even indefinitely, will never be automatically
removed from the etcd cluster member list.
This makes sense because it's usually an application level / administrative
action to determine whether a reconfiguration should happen based on health.
For more information, refer to the [runtime reconfiguration design document][runtime-reconf-design].
## 6) How does --endpoint work with etcdctl?
The `--endpoint` flag can specify any number of etcd cluster members in a comma
separated list. This list might be a subset, equal to, or more than the actual
etcd cluster member list itself.
If only one peer is specified via the `--endpoint` flag, etcdctl discovers the
rest of the cluster via the member list of that one peer, and then it randomly
chooses a member to use. Again, the client can use the `quorum=true` flag on
reads, which will always fail when using a member in the minority.
If peers from multiple clusters are specified via the `--endpoint` flag, etcdctl
will randomly choose a peer, and the request will simply get routed to one of
the clusters. This is probably not what you want.
Note: --peers flag is now deprecated and --endpoint should be used instead,
as it might confuse users to give etcdctl a peerURL.
[runtime-reconf-design]: runtime-reconf-design.md
