Compare commits

...

963 Commits

Author SHA1 Message Date
bbe86b066c version: 3.4.2 2019-10-09 15:26:52 -07:00
2c36cab87d Merge pull request #11223 from YoyinZyc/automated-cherry-pick-of-#11179-origin-release-3.4
Automated cherry pick of #11179
2019-10-09 13:28:04 -07:00
480d5510f9 etcdserver: trace compaction request; add return parameter 'trace' to applierV3.Compaction() mvcc: trace compaction request; add input parameter 'trace' to KV.Compact() 2019-10-09 12:40:12 -07:00
9245518363 etcdserver: trace raft requests. 2019-10-09 12:40:12 -07:00
daa432cfa7 etcdserver: add put request steps. mvcc: add put request steps; add trace to KV.Write() as input parameter. 2019-10-09 12:40:12 -07:00
8717327697 pkg: use zap logger to format the structure log output. 2019-10-09 12:40:12 -07:00
4f1bbff888 pkg: add field to record additional detail of trace; add stepThreshold to reduce log volume. 2019-10-09 12:40:12 -07:00
28bb8037d9 pkg: create package traceutil for tracing. mvcc: add tracing steps:range from the in-memory index tree; range from boltdb. etcdserver: add tracing steps: agreement among raft nodes before linerized reading; authentication; filter and sort kv pairs; assemble the response. 2019-10-09 12:40:12 -07:00
03b5e7229b Merge pull request #11213 from jpbetz/automated-cherry-pick-of-#11211-origin-release-3.4
Automated cherry pick of #11211
2019-10-08 18:47:06 -07:00
0781c0327d clientv3: Replace endpoint.ParseHostPort with net.SplitHostPort to fix IPv6 client endpoints 2019-10-08 18:27:03 -07:00
99774d8ed4 Merge pull request #11214 from jpbetz/automated-cherry-pick-of-#11184-origin-release-3.4
Automated cherry pick of #11184
2019-10-08 17:35:02 -07:00
c454344f14 clientv3: Set authority used in cert checks to host of endpoint 2019-10-08 15:35:27 -07:00
dae0a72a42 Merge pull request #11200 from jingyih/automated-cherry-pick-of-#11194-origin-release-3.4
Automated cherry pick of #11194 on release-3.4
2019-10-03 16:03:23 -07:00
c91a6bf14f tests/e2e: fix metrics tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-10-03 16:02:39 -07:00
b7ff97f54e etcdctl: fix member add command 2019-10-03 13:52:22 -07:00
d08bb07d6d scripts/build-binary: fix darwin tar commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-28 11:39:04 -07:00
3a736a81e8 scripts/release: fix SHA256SUMS command
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-17 14:12:18 -07:00
a14579fbfb version: 3.4.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-17 13:53:25 -07:00
ade66a5722 scripts/release: fix docker push command
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-17 13:53:12 -07:00
67cc70926d integration: fix bug in for loop, make it break properly 2019-09-17 13:30:12 -07:00
21dcadc83c Merge pull request #11148 from spzala/automated-cherry-pick-of-#11147-upstream-release-3.4
Automated cherry pick of #11147
2019-09-13 11:12:41 -07:00
c7c379e52e embed: expose ZapLoggerBuilder
This exposes the ZapLoggerBuilder in the embed.Config to allow for
custom loggers to be defined and used by embedded etcd.

Fixes #11144
2019-09-13 14:09:54 -04:00
9ed5f76dc0 vendor: upgrade to gRPC v1.23.1
https://github.com/grpc/grpc-go/releases/tag/v1.23.1

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-11 14:54:24 -07:00
994865c89e Merge pull request #11133 from jingyih/automated-cherry-pick-of-#11126-origin-release-3.4
Automated cherry pick of #11126 on release-3.4
2019-09-07 00:03:37 -07:00
ccbbb2f8d6 mvcc: add store revision metrics
Add experimental metrics etcd_debugging_mvcc_current_revision and
etcd_debugging_mvcc_compact_revision.
2019-09-06 17:03:21 -07:00
d5f79adc9c etcdserver: remove dup percentage sign in log 2019-09-04 22:03:49 -07:00
8b053b0f44 embed: fix secure server logging message
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-03 09:43:08 -07:00
11980f8165 scripts/release: Apply shellcheck findings
I run https://github.com/koalaman/shellcheck/ over scripts/* and fixed
the findings it returned.

Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2019-09-03 09:42:35 -07:00
41d4e2b276 scripts/release: rename SHA256SUM to SHA256SUMS
These files are commonly called SHA256SUMS and with this change rget
works for v3.4.0 as well.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-30 13:35:40 -07:00
898bd1351f version: 3.4.0
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-30 08:09:55 -07:00
d04d96c9ac tests/e2e: run metrics test again
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-30 08:09:32 -07:00
21edf98fdb Documentation:fix clerical error 2019-08-30 08:08:47 -07:00
a4f7c65ef8 vendor: x/sys and x/net to support building on Risc-V
Signed-off-by: Carlos de Paula <me@carlosedp.com>
2019-08-29 14:03:59 -07:00
c3a9eec843 scripts/release: fix sha256sum
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-29 09:38:57 -07:00
e5528acf57 version: 3.4.0-rc.4
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-29 08:53:10 -07:00
9977550ae9 Merge pull request #11091 from hexfusion/automated-cherry-pick-of-#11087-upstream-release-3.4
Automated cherry pick of #11087 on release 3.4
2019-08-29 08:39:11 -07:00
4d7a6e2755 scripts/release: add sha256sum summary of release assets
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-08-29 11:33:16 +00:00
5e8757c3c5 Documentation: Add section headers to etcd Learner
In the Background section, the document describes various challenges for cluster membership change.
Added section header for each case described for better readability.
2019-08-27 10:18:34 -07:00
012e38fef3 version: 3.4.0-rc.3
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-27 09:50:54 -07:00
41a2cfa122 pkg/logutil: change to "MergeOutputPaths"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-27 09:50:26 -07:00
9f8a1edf38 embed: fix "--log-outputs" setup without "stderr"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-27 09:50:17 -07:00
165ba72593 raft/log_test: fixed wrong index 2019-08-26 12:37:07 -07:00
9c850ccef0 raft: fixed some typos and simplify minor logic 2019-08-26 12:37:02 -07:00
61d6efda4c etcdserver: add check for nil options 2019-08-26 10:48:20 -07:00
b76f149c35 tests/e2e: skip metrics tests for now
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-26 00:02:48 -07:00
5e33bb1a95 Documentation: snapshot can be requested from one etcd node only
Updated Snapshot section of demo.md to reflect that snapsot can be requested only from one etcd node at a time.

Fixes : #10855
2019-08-25 23:40:25 -07:00
83bf125d93 clientv3: add nil checks in Close()
Added nil checks in Close() for Watcher and Lease fields
Added test case
2019-08-25 23:40:05 -07:00
d23af41bca tests/e2e: remove string replace for v3.4.0-rc.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-23 01:14:42 -07:00
67d0c21bb0 version: 3.4.0-rc.2
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-23 00:37:01 -07:00
18a077d3d3 raft : Write compact if statements 2019-08-23 00:36:44 -07:00
fb6d870e89 Merge pull request #11072 from jingyih/automated-cherry-pick-of-#11069-origin-release-3.4
Automated cherry pick of #11069 on release-3.4
2019-08-23 06:57:12 +08:00
e00224f87e integration: fix TestKVPutError
Give backend quota enough overhead.
2019-08-22 13:33:19 -07:00
2af1caf1a5 functional test: fix typo in agent log
Fix typo in functional test agent log to avoid debugging confusion.
2019-08-20 15:23:13 -07:00
0777eab766 Documentation/upgrades: special upgrade guides for >= 3.3.14
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-16 16:19:22 -07:00
0ecc0d0542 etcdmain: update help message
Add experimental-peer-skip-client-san-verification flag description to
help message. Add default values.
2019-08-16 16:07:06 -07:00
982a8c9bc3 rafttest: print Ready before processing it
It was confusing to see the effects of the Ready (i.e. log messages)
printed before the Ready itself.
2019-08-16 08:10:17 -07:00
b8e3e4e7cb raft: fix a test file name 2019-08-16 08:10:07 -07:00
4090edfb5b raft: document problem with leader self-removal
When a leader removes itself, it will retain its leadership but not
accept new proposals, making the range effectively stuck until manual
intervention triggers a campaign event.

This commit documents the behavior. It does not correct it yet.
2019-08-16 08:09:56 -07:00
078caccce5 raft: add a batch of interaction-driven conf change tests
Verifiy the behavior in various v1 and v2 conf change operations.
This also includes various fixups, notably it adds protection
against transitioning in and out of new configs when this is not
permissible.

There are more threads to pull, but those are left for future commits.
2019-08-16 08:09:44 -07:00
d177b7f6b4 raft: proactively probe newly added followers
When the leader applied a new configuration that added voters, it would
not immediately probe these voters, delaying when they would be caught
up.

I noticed this while writing an interaction-driven test, which has now
been cleaned up and completed.
2019-08-16 08:09:33 -07:00
2c1a1d8c32 rafttest: add _breakpoint directive
It is a helper case to attach a debugger to when a problem needs
to be investigated in a longer test file. In such a case, add the
following stanza immediately before the interesting behavior starts:

_breakpoint:
----
ok

and set a breakpoint on the _breakpoint case.
2019-08-16 08:09:23 -07:00
0fc108428e raft: initialize new Progress at LastIndex, not LastIndex+1
Initializing at LastIndex+1 meant that new peers would not be probed
immediately when they appeared in the leader's config, which delays
their getting caught up.
2019-08-16 08:09:11 -07:00
df489e7a2c raft/rafttest: fix stabilize handler
It was bailing out too early.
2019-08-16 08:08:28 -07:00
f13a5102ec tests/e2e: fix version matching
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 14:46:19 -07:00
c9465f51d2 *: use Go 1.12.9
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 14:40:46 -07:00
8f85f0dc26 version: 3.4.0-rc.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 13:45:25 -07:00
0161e72d8d mvcc: keep 64-bit alignment in "store" struct
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 13:31:52 -07:00
1691eec2db clientv3/integration: fix "mvcc.NewStore" call
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 13:31:46 -07:00
1e213b7ab6 *: Add experimental-compaction-batch-limit flag
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 13:31:39 -07:00
b30c1eb2c8 mvcc: Optimize compaction for short commit pauses 2019-08-15 13:29:28 -07:00
a0be90f450 Documentation/upgrades: update
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-14 17:01:19 -07:00
8110a96f69 scripts/release: clean up minor tag docker commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 22:01:10 -07:00
8e05c73fa7 Makefile: explicit about GOOS in docker-test builds
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 16:57:22 -07:00
970ca9fa43 Documentation/upgrades: highlight "--enable-v2=false"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 15:32:46 -07:00
a481ee809f vendor: update "net/http2" to latest
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 14:44:59 -07:00
4d06d3b498 vendor: upgrade grpc-go to 1.23.0
https://github.com/grpc/grpc-go/releases/tag/v1.23.0

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 14:44:53 -07:00
98462b52d1 *: use Go 1.12.8
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 12:56:11 -07:00
2a8d09b83b clientv3: use Endpoints(), fix context creation
If overwritten, the previous context should be canceled first.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 12:43:49 -07:00
49c6e87f74 version: 3.4.0-pre
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 12:43:40 -07:00
84ed0f7f87 version: 3.4.0-rc.0
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 10:06:34 -07:00
52d34298ab scripts: remove ".aci" commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 10:06:24 -07:00
9c1d2eaee4 scripts/release: fix version check commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 09:59:24 -07:00
547631a492 scripts: fix build docker commands, add more logging
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 09:50:21 -07:00
802e01a0d8 *: remove "acbuild"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 09:50:21 -07:00
1dff1c869f scripts/release: fix "yq" command
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 09:50:18 -07:00
ac6b604bb8 raft/rafttest: introduce datadriven testing
It has often been tedious to test the interactions between multi-member
Raft groups, especially when many steps were required to reach a certain
scenario. Often, this boilerplate was as boring as it is hard to write
and hard to maintain, making it attractive to resort to shortcuts
whenever possible, which in turn tended to undercut how meaningful and
maintainable the tests ended up being - that is, if the tests were even
written, which sometimes they weren't.

This change introduces a datadriven framework specifically for testing
deterministically the interaction between multiple members of a raft group
with the goal of reducing the friction for writing these tests to near
zero.

In the near term, this will be used to add thorough testing for joint
consensus (which is already available today, but wildly undertested),
but just converting an existing test into this framework has shown that
the concise representation and built-in inspection of log messages
highlights unexpected behavior much more readily than the previous unit
tests did (the test in question is `snapshot_succeed_via_app_resp`; the
reader is invited to compare the old and new version of it).

The main building block is `InteractionEnv`, which holds on to the state
of the whole system and exposes various relevant methods for
manipulating it, including but not limited to adding nodes, delivering
and dropping messages, and proposing configuration changes. All of this
is extensible so that in the future I hope to use it to explore the
phenomena discussed in

https://github.com/etcd-io/etcd/issues/7625#issuecomment-488798263

which requires injecting appropriate "crash points" in the Ready
handling loop. Discussions of the "what if X happened in state Y"
can quickly be made concrete by "scripting up an interaction test".

Additionally, this framework is intentionally not kept internal to the
raft package.. Though this is in its infancy, a goal is that it should
be possible for a suite of interaction tests to allow applications to
validate that their Storage implementation behaves accordingly, simply
by running a raft-provided interaction suite against their Storage.
2019-08-12 08:10:29 -07:00
69c97cdc8f vendor: bump datadriven
Picks up some fixes for papercuts.
2019-08-12 08:10:19 -07:00
faa71d89d4 cleanup: correct summary message in put.go 2019-08-12 08:07:33 -07:00
64c16779c0 tests/e2e: pass "rc.0"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 01:46:58 -07:00
8ff71c52db test: fix metric name typo 2019-08-09 13:24:27 -07:00
dbe5198c45 raft: fix restoring joint configurations
While writing interaction tests for joint configuration changes, I
realized that this wasn't working yet - restoring had no notion of
the joint configuration and was simply dropping it on the floor.

This commit introduces a helper `confchange.Restore` which takes
a `ConfState` and initializes a `Tracker` from it.

This is then used both in `(*raft).restore` as well as in `newRaft`.
2019-08-09 11:18:40 -07:00
39d0f4e53c confchange: clean up unnecessary block 2019-08-09 11:18:30 -07:00
a8b4213ec0 raft : newRaft() does check for validity of Config 2019-08-09 11:18:06 -07:00
a945379ce4 raft/tracker: visit Progress in stable order
This is helpful for upcoming testing work which allows datadriven
testing of the interaction of multiple nodes. This testing requires
determinism to work correctly.
2019-08-09 08:39:52 -07:00
7a50cd7074 raft/auorum: remove unused type 2019-08-09 08:39:44 -07:00
f786b6ba16 etcdserver: add "etcd_server_snapshot_apply_in_progress_total"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 14:02:13 -07:00
1c8ab76333 integration: test snapshot inflights metrics
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 14:01:54 -07:00
abdb7ca17b etcdserver/api: add "etcd_network_snapshot_send_inflights_total", "etcd_network_snapshot_receive_inflights_total"
Useful for deciding when to terminate the unhealthy follower.
If the follower is receiving a leader snapshot, operator may wait.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 14:01:45 -07:00
629cb7aa5e agent: fix a data race and deadlock
add 1-size buffer for `errc`  to avoid deadlock of child goroutine
add a local variable to a void data race in `err`
when `case <-stream.Context().Done():` is taken
2019-08-08 12:23:08 -07:00
89e102365d Documentation/op-guide: update runtime configuration
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 09:25:29 -07:00
9018b3dc4d raft: let learners vote
It turns out that that learners must be allowed to cast votes.

This seems counter- intuitive but is necessary in the situation in which
a learner has been promoted (i.e. is now a voter) but has not learned
about this yet.

For example, consider a group in which id=1 is a learner and id=2 and
id=3 are voters. A configuration change promoting 1 can be committed on
the quorum `{2,3}` without the config change being appended to the
learner's log. If the leader (say 2) fails, there are de facto two
voters remaining. Only 3 can win an election (due to its log containing
all committed entries), but to do so it will need 1 to vote. But 1
considers itself a learner and will continue to do so until 3 has
stepped up as leader, replicates the conf change to 1, and 1 applies it.

Ultimately, by receiving a request to vote, the learner realizes that
the candidate believes it to be a voter, and that it should act
accordingly. The candidate's config may be stale, too; but in that case
it won't win the election, at least in the absence of the bug discussed
in:
https://github.com/etcd-io/etcd/issues/7625#issuecomment-488798263.
2019-08-08 09:10:21 -07:00
b9bea9def7 functional/agent: copy file, instead of renaming
To retain failure logs in CI testing.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 09:09:39 -07:00
d2675c13f4 functional/rpcpb: make client log less verbose
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 09:09:34 -07:00
8230536171 functional.yaml: try lower snapshot count for flaky tests, error threshold
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 09:09:29 -07:00
524278c187 pkg/types: Avoid potential double lock of tsafeSet.
(tsafeSet).Sub and (tsafeSet).Equals can cause double lock bug if ts and other is pointing the same variable

gofmt the code and add some comments
2019-08-07 16:02:24 -07:00
29cdc9abfc test: output etcd server logs when functional tests fail
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-07 10:16:44 -07:00
a6a9a71b6a integration: fix a data race about err
don't share `err` between goroutines
2019-08-06 16:15:27 -07:00
8c8f6f4b01 mvcc: fix typo in test
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-06 15:09:55 -07:00
b6cfaf883b v3rpc: fix a typo err
don't read return value in child goroutine which causes data race.
2019-08-06 15:09:47 -07:00
b522281a98 stream: Prevent panic when newAttemptLocked fails to get a transport for the new attempt
Testing https://github.com/grpc/grpc-go/pull/2958

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-06 15:09:42 -07:00
a78793e6bf vendor: update gRPC to latest
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-06 15:09:37 -07:00
e09528aa06 Merge pull request #10988 from wenjiaswe/automated-cherry-pick-of-#10987-upstream-release-3.4
Automated cherry pick of #10987
2019-08-05 23:31:33 -07:00
cb4507d15b functional:update go.etcd.io/etcd link and go image registry for functional test 2019-08-05 23:28:45 -07:00
4cead3c25c Merge pull request #10986 from wenjiaswe/automated-cherry-pick-of-#10985-upstream-release-3.4
Automated cherry pick of #10985
2019-08-05 22:45:31 -07:00
3ac41644cc functional test: Update functional README.md 2019-08-05 22:12:50 -07:00
0564743c9b CHANGELOG: remove from release branch
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:39:18 -07:00
9d927afead Documentation/upgrades: highlight "grpc.ErrClientConnClosing"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:38:51 -07:00
5d19b96341 proxy/grpcproxy: deprecate "grpc.ErrClientConnClosing"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:38:44 -07:00
faa1d9d206 functional: deprecate "grpc.ErrClientConnClosing"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:38:35 -07:00
ab1db0dfd8 clientv3: deprecate "grpc.ErrClientConnClosing"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:38:27 -07:00
1c312cefbd functional: use Go 1.12.7 as default
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 12:40:50 -07:00
b4fcaad87d pkg/adt: remove TODO
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 00:25:02 -07:00
3468505e38 clientv3: document "WithBlock" dial option
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:53:02 -07:00
a2d68dd389 travis: do not allow CPU 4 test failures
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:34:31 -07:00
c6e9699960 travis: do not run coverage, tip tests in v3.4
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:33:13 -07:00
b05dfeb15e scripts/release: remove acbuild commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:21:51 -07:00
bb7df24af4 pkg/adt: fix interval tree black-height property based on rbtree
Author: xkey <xk33430@ly.com>
ref. https://github.com/etcd-io/etcd/pull/10978

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:15:09 -07:00
9ff86fe516 tests/e2e: skip release tests until release candidate
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-03 00:09:10 -07:00
bc9a54beae tests/e2e: fix upgrade, metrics tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-02 15:58:25 -07:00
df1d3f7c6e functional: remove "embed" support in tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-02 15:58:21 -07:00
14053ba7f7 etcdserver/api: enable 3.4 capability
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-02 15:24:40 -07:00
040f2c5526 version: 3.4.0-pre
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 16:05:22 -07:00
f1c7fd3d53 functional: add "LogLevel" flags
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 15:58:01 -07:00
22a3ec3ac5 CHANGELOG-3.4: highlight version string change
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 15:30:04 -07:00
4244ea4390 CHANGELOG: update with latest changes, make language consistent
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 15:26:31 -07:00
d239b21d10 Documentation/upgrades: update 3.4 guides
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 15:26:16 -07:00
b679c12a51 Merge pull request #10968 from gyuho/mmm
mvcc: add "etcd_mvcc_range_total", "etcd_mvcc_txn_total"
2019-08-01 14:46:49 -07:00
328fdc2150 mvcc: add TODOs
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 14:45:21 -07:00
f82e23ab52 mvcc: add "etcd_mvcc_range_total", "etcd_mvcc_txn_total"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 14:44:55 -07:00
dde3c5fc40 mvcc: clean up metrics names, add missing register calls
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 14:44:55 -07:00
05b2f967c2 Merge pull request #10969 from gyuho/maintainers
MAINTAINERS: add @spzala
2019-08-01 14:44:12 -07:00
8d88fea0c8 MAINTAINERS: add @spzala
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 14:24:24 -07:00
c9bd8db46a CHANGELOG: fix typos
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 14:11:50 -07:00
6804bd8af4 CHANGELOG: add latest metrics change
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 13:53:12 -07:00
d5bd600aa5 CHANGELOG: update "pkg/adt"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 13:37:28 -07:00
3b631e1bb6 pkg/adt: document textbook implementation with pseudo-code
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-01 12:58:47 -07:00
456c91b63a Merge pull request #10959 from gyuho/adt
pkg/adt: refactor + add more test cases
2019-08-01 12:22:15 -07:00
5ef8f2770c Merge pull request #10962 from hexfusion/promote_mvcc
metrics: promote etcd_debugging_mvcc put_total and delete_total
2019-07-31 22:24:40 -07:00
6a0811a949 *: use new adt.IntervalTree interface
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-31 22:23:13 -07:00
3cc3affedd pkg/adt: mask test failure, add TODO
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-31 22:20:59 -07:00
f46ee91863 metrics: promote etcd_debugging_mvcc put_total and delete_total
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-08-01 01:28:07 +00:00
46f04b3c15 pkg/adt: add "IntervalTree.Delete" failure case
Described in https://github.com/etcd-io/etcd/issues/10877.

"black-height" property: Every path from a node to any descendant leaf node must have the same number of black nodes.

Expected

    After deleting 11 (requires rebalancing):
                            [510,511]
                             /      \
                   ----------        --------------------------
                  /                                            \
              [383,384]                                       [830,831]
              /       \                                      /          \
             /         \                                    /            \
      [261,262](red)  [410,411]                     [647,648]           [899,900](red)
          /               \                              \                      /    \
         /                 \                              \                    /      \
      [82,83]           [292,293]                      [815,816](red)   [888,889]    [972,973]
            \                                                           /
             \                                                         /
          [238,239](red)                                       [953,954](red)

Got

    After deleting 11 (requires rebalancing):
                            [510,511]
                             /      \
                   ----------        --------------------------
                  /                                            \
              [82,83]                                       [830,831]
                    \                                      /          \
                     \                                    /            \
                  [383,384]                        [647,648]            [899,900]
                  /       \                              \                  /    \
                 /         \                              \                /      \
           [261,262]      [410,411]                      [815,816]   [888,889]    [972,973]
             /   \                                                                  /
            /     \                                                                /
     [238,239]   [292,293]                                                  [953,954]

This violates "black-height" property.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-31 10:05:32 -07:00
f2742d6cd4 pkg/adt: test node "11" deletion
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-31 10:05:32 -07:00
1d638bad72 pkg/adt: README "IntervalTree.Delete" test case images
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-31 10:05:32 -07:00
19d69d2563 pkg/adt: README initial commit
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-31 10:05:27 -07:00
6917c495e8 pkg/adt: add "visitLevel", make "IntervalTree" interface, more tests
Make "IntervalTree" an interface to abstract range tree interface

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-30 15:59:59 -07:00
149e5dc291 etcdserver: mark flag as experimental, add to changelog and configuration 2019-07-30 16:57:57 -04:00
03fd396610 pkg/transport: Improved description of flag peer-skip-client-san-verification 2019-07-30 16:57:57 -04:00
2f476f2b5a pkg/transport: Added test for SkipClientVerify flag. 2019-07-30 16:57:57 -04:00
1b048c91ec etcdserver: Added configuration flag --peer-skip-client-verify=true 2019-07-30 16:57:57 -04:00
a2a8887c33 Merge pull request #10953 from gyuho/grpc-gateway
vendor: update grpc-ecosystem
2019-07-30 13:31:44 -07:00
465592a718 Documentation/etcd-mixin: Add an alert for down etcd members
An etcd member being down is an important failure state - while
normal admin operations may cause transient outages to rotate,
when any member is down the cluster is operating in a degraded
fashion. Add an alert that records when any members are down
so that administrators know whether the next failure is fatal.

The rule is more complicated than `up{...} == 0` because not all
failure modes for etcd may have an `up{...}` entry for each member.
For instance, a Kubernetes service in front of an etcd cluster
might only have 2 endpoints recorded in `up` because the third
pod is evicted by the kubelet - the cluster is degraded but
`count(up{...})` would not return the full quorum size. Instead,
use network peer send failures as a failure detector and attempt
to return the max of down services or failing peers. We may
undercount the number of total failures, but we will at least
alert that a member is down.
2019-07-30 14:39:50 -04:00
12c049e6be Merge pull request #10835 from spzala/securityprocess
Security: Create etcd security process
2019-07-30 14:14:46 -04:00
bc95b1fa84 bill-of-materials: update
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 21:41:47 -07:00
80efba3368 tests/e2e: fix curl proclaim error message
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 21:28:15 -07:00
f3bca1db08 vendor: update grpc-ecosystem
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 16:13:54 -07:00
800e7235eb CHANGELOG: add recent changes in logger
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 16:09:06 -07:00
6e766ac5fb Merge pull request #10947 from gyuho/log-level
*: make log level configurable
2019-07-29 16:06:51 -07:00
4e43a082b2 raft: use mutex in "SetLogger" to avoid race conditions in tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 15:43:19 -07:00
c6e3401255 etcdserver: make raft log configured by top level logger
To make it consistent

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 15:43:19 -07:00
abba5421f5 Documentation/op-guide: add "--log-level" flag
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 15:43:19 -07:00
a37f3441f5 etcdmain: add "--log-level" flag
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 15:43:19 -07:00
b9de4bddda embed: add "LogLevel", deprecate "Debug" in v3.5
Make log level configurable, and deprecate "debug" flag in v3.5.
And adds more warnings on flags that's being deprecated in v3.5.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 15:43:19 -07:00
e911f901a6 pkg/logutil: add log level utilities
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 15:43:19 -07:00
348b0d40a6 embed: do not expose "zapLoggerBuilder"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 15:43:19 -07:00
324952c12a Merge pull request #10935 from gyuho/v2
*: disable v2 API by default
2019-07-29 15:42:56 -07:00
936c506e8d Merge pull request #10945 from tbg/add-todo
raft: leave TODO about leaving StateSnapshot
2019-07-29 13:51:38 -07:00
4ca04ba991 Merge pull request #10949 from gyuho/docs
Documentation: move design docs to "Documentation", remove "docs"
2019-07-29 13:48:38 -07:00
87e203a5cf Documentation/learning: rewrite balancer design doc images
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-29 13:47:25 -07:00
ad491c0c32 Documentation: move client, learner design docs
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-28 21:54:22 -07:00
3fc62ca586 tools,Documentation: move "etcd-dump-metrics" output to "Documentation"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-28 21:53:50 -07:00
101a63ae97 docs: remove
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-28 21:53:31 -07:00
9e75f27985 CHANGELOG: move corrupt check features to etcd v4
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-28 23:09:42 -05:00
2f30e9ad7f etcdserver: document v2 usage in "publish" method
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-28 21:07:39 -05:00
ae87b21a72 tests/e2e: enable-v2 for v2 e2e tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-28 21:07:36 -05:00
38128425b2 Documentation/op-guide: disable v2 by default
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-28 19:36:51 -05:00
ecb915617d embed: disable v2 by default
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-28 19:36:51 -05:00
89d400211d Merge pull request #10911 from gyuho/balancer
clientv3: fix secure endpoint failover, refactor with gRPC 1.22 upgrade
2019-07-26 14:50:54 -07:00
3b02d4c5ff raft: leave TODO about leaving StateSnapshot
The condition is overly strict, which has popped up in CockroachDB
recently.
2019-07-26 23:19:34 +02:00
a7b8034e56 words: whitelist more
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 09:19:22 -07:00
8a2a951d79 integration: match code.Canceled in "TestV3KVInflightRangeRequests"
Match new error codes in gRPC v1.22.0

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 09:19:22 -07:00
ba42e65b59 clientv3/integration: give more time for balancer resolution
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 09:19:22 -07:00
8c7c6ec0c1 clientv3/balancer: refactor
refactor + remove unused

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 09:19:19 -07:00
3dc00ab615 clientv3: move auth token credential to "credentials" package
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 09:17:44 -07:00
db61ee106c clientv3/credentials: set dial target "Authority" with target address
Overwrite authority when it's IP.

When user dials with "grpc.WithDialer", "grpc.DialContext" "cc.parsedTarget"
update only happens once. This is problematic, because when TLS is enabled,
retries happen through "grpc.WithDialer" with static "cc.parsedTarget" from
the initial dial call.
If the server authenticates by IP addresses, we want to set a new endpoint as
a new authority. Otherwise
"transport: authentication handshake failed: x509: certificate is valid for 127.0.0.1, 192.168.121.180, not 192.168.223.156"
when the new dial target is "192.168.121.180" whose certificate host name is also "192.168.121.180"
but client tries to authenticate with previously set "cc.parsedTarget" field "192.168.223.156"

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 09:17:40 -07:00
a6b105a907 embed: use new "credentials" package
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 08:56:45 -07:00
7cbe2f5dd6 etcdserver/api/v3rpc: use new "credentials" package
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 08:56:38 -07:00
db7231accc clientv3: use new "credentials" package
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 08:56:33 -07:00
324c876742 clientv3/credential: implement grpc/credentials.Bundle
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 08:56:05 -07:00
4707d7a196 vendor: upgrade grpc-go to v1.22.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 08:53:09 -07:00
12ab2ee3c4 clientv3: do not use pointer to TransportCredentials interface
Interface in Go is already reference

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 08:52:31 -07:00
50babc16e7 etcdserver/api/v2v3: skip tests for CI
To fix in v3.5

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 05:54:58 -07:00
cf4b5d9c7f Merge pull request #10938 from MasahikoSawada/updategoversion
README: require Go 1.12+
2019-07-25 22:59:34 -07:00
222dcc8d13 README: require Go 1.12+
Fixes #10937
2019-07-26 14:31:24 +09:00
e36e3ac6a7 Merge pull request #10917 from gyuho/raft-node
raft: improve logging around tick miss
2019-07-25 22:17:28 +02:00
388d15f521 Merge pull request #10622 from philips/add-v2v3-tests
etcdserver: api/v2v3: add initial tests
2019-07-25 10:05:29 -07:00
84a38045c9 CHANGELOG: move "--enable-v2v3" to 3.5
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-25 09:29:31 -07:00
c5dba11197 Merge pull request #10933 from ChrisRx/fix-embed-panic
embed: fix oob panic in zap logger
2019-07-25 06:56:31 -07:00
2223142685 embed: fix oob panic in zap logger
This fixes an index out-of-bounds panic caused when using the embed
package and the zap logger. When a TLS handshake error is logged, the
slice for cert ip addresses is allocated with capacity but no length, so
subsequent index access causes the panic, and doesn't surface the TLS
handshake error to the user.

Fixes #10932
2019-07-25 09:42:42 -04:00
1782469a76 Merge pull request #10928 from spzala/changelogdep
changelog: reflect the latest vendor dependencies
2019-07-25 06:19:13 -07:00
c63c988d03 changelog: reflect the latest vendor dependencies
Vendor dependencies are modified under,
https://github.com/etcd-io/etcd/pull/10918

Also, the protobuf module is mentioned twice so removing one.
2019-07-25 02:52:34 -04:00
c7c9428f6b raft: move "RawNode", clarify tick miss
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-24 23:35:36 -07:00
8f000c755b Merge pull request #10918 from gyuho/zap-go
vendor: upgrade dependencies
2019-07-24 21:59:36 -07:00
425b65467c Merge pull request #10907 from spzala/emptyrole10905
etcdserver: do not allow creating empty role
2019-07-24 18:51:43 -07:00
1cef112a79 etcdserver: do not allow creating empty role
Like user, we should not allow creating empty role.

Related #10905
2019-07-24 17:41:24 -04:00
46166ad733 vendor: update
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-24 14:09:50 -07:00
d137fa9d4a Merge pull request #10920 from tbg/rawnode-ready
raft: require app to consume result from Ready()
2019-07-24 11:46:55 +02:00
8b752ef647 CHANGELOG: update latest changes
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-23 15:36:14 -07:00
721127da12 raft: require app to consume result from Ready()
I changed `(*RawNode).Ready`'s behavior in #10892 in a problematic way.
Previously, `Ready()` would create and immediately "accept" a Ready
(i.e. commit the app to actually handling it). In #10892, Ready() became
a pure read-only operation and the "accepting" was moved to
`Advance(rd)`.  As a result it was illegal to use the RawNode in certain
ways while the Ready was being handled. Failure to do so would result in
dropped messages (and perhaps worse). For example, with the following
operations

1. `rd := rawNode.Ready()`
2. `rawNode.Step(someMsg)`
3. `rawNode.Advance(rd)`

`someMsg` would be dropped, because `Advance()` would clear out the
outgoing messages thinking that they had all been handled by the client.
I mistakenly assumed that this restriction had existed prior, but this
is incorrect.

I noticed this while trying to pick up the above PR in CockroachDB,
where it caused unit test failures, precisely due to the above example.

This PR reestablishes the previous behavior (result of `Ready()` must
be handled by the app) and adds a regression test.

While I was there, I carried out a few small clarifying refactors.
2019-07-23 22:45:01 +02:00
a91f4e45c0 Merge pull request #10919 from gyuho/ci
*: test with Go 1.12.7
2019-07-23 13:09:01 -07:00
97bd2b3262 Update CHANGELOG-3.4.md for PR #10805 2019-07-23 12:48:28 -07:00
08db37db54 Security: Create etcd security process
Create security disclosure and release process, and team to handle issues.

Related # https://github.com/etcd-io/maintainers/issues/1
2019-07-23 15:43:15 -04:00
dfd62f04e9 *: test with Go 1.12.7
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-23 12:40:47 -07:00
a9047850de Merge pull request #10805 from wenjiaswe/rebase-distroless
Dockerfile: Rebase etcd image to debian
2019-07-23 11:42:48 -07:00
d1183424bb Merge pull request #10914 from tbg/api
raft: allow use of joint quorums
2019-07-23 11:49:08 +02:00
b9c051e7a7 raftpb: clean up naming in ConfChange 2019-07-23 10:40:03 +02:00
b67303c6a2 raft: allow use of joint quorums
This change introduces joint quorums by changing the Node and RawNode
API to accept pb.ConfChangeV2 (on top of pb.ConfChange).

pb.ConfChange continues to work as today: it allows carrying out a
single configuration change. A pb.ConfChange proposal gets added to
the Raft log as such and is thus also observed by the app during Ready
handling, and fed back to ApplyConfChange.

ConfChangeV2 allows joint configuration changes but will continue to
carry out configuration changes in "one phase" (i.e. without ever
entering a joint config) when this is possible.
2019-07-23 10:40:03 +02:00
88f5561733 raft: use ConfChangeSingle internally 2019-07-23 10:39:48 +02:00
10680744b9 raft: introduce protos for joint quorums 2019-07-23 10:39:48 +02:00
f856ce963b Dockerfile: rebase etcd image to debian 2019-07-22 18:58:16 -07:00
fe86a786a4 Merge pull request #10904 from yuzeming/fixed-10899
integration: fix a data race about `i` and `tt` in TestV3WatchFromCur…
2019-07-19 17:51:21 -07:00
39680c381e Merge pull request #10908 from tbg/log
etcdserver: fix createConfChangeEnts
2019-07-19 17:45:27 -07:00
9a69aa17c8 Merge pull request #10614 from jmillikin-stripe/cert-allowed-san-flags
etcdmain, pkg: Support peer and client TLS auth based on SAN fields.
2019-07-19 12:02:28 -07:00
eb4d9b640a etcdserver: fix createConfChangeEnts
It created a sequence of conf changes that could intermittently cause an
empty set of voters, which Raft asserts against as of #10889.

This fixes TestCtlV2BackupSnapshot and TestCtlV2BackupV3Snapshot, see:
https://github.com/etcd-io/etcd/issues/10700#issuecomment-512358126
2019-07-19 17:13:08 +02:00
3c5e2f51e4 Merge pull request #10892 from tbg/rawnode-everywhere-attempt3
raft: use RawNode for node's event loop; clean up bootstrap
2019-07-19 14:30:08 +02:00
caa48bcc3d raft: remove TestNodeBoundedLogGrowthWithPartition
It has a data race between the test's call to `reduceUncommittedSize`
and a corresponding call during Ready handling in `(*node).run()`.
The corresponding RawNode test still verifies the functionality, so
instead of fixing the test we can remove it.
2019-07-19 12:35:14 +02:00
500af91653 raft: restore ability to bootstrap RawNode
We are worried about breaking backwards compatibility for any
application out there that may have relied on the old behavior. Their
RawNode invocation would have been broken by the removal of the peers
argument so it would not have changed silently; an associated comment
tells callers how to fix it.
2019-07-19 10:02:02 +02:00
c9491d7861 raft: clean up bootstrap
This is the first (maybe not last) step in cleaning up the bootstrap
code around StartNode.

Initializing a Raft group for the first time is awkward, since a
configuration has to be pulled from thin air. The way this is solved
today is unclean: The app is supposed to pass peers to StartNode(),
we add configuration changes for them to the log, immediately pretend
that they are applied, but actually leave them unapplied (to give the
app a chance to observe them, though if the app did decide to not apply
them things would really go off the rails), and then return control to
the app. The app will then process the initial Readys and as a result
the configuration will be persisted to disk; restarts of the node then
use RestartNode which doesn't take any peers.

The code that did this lived awkwardly in two places fairly deep down
the callstack, though it was really only necessary in StartNode(). This
commit refactors things to make this more obvious: only StartNode does
this dance now. In particular, RawNode does not support this at all any
more; it expects the app to set up its Storage correctly.

Future work may provide helpers to make this "preseeding" of the Storage
more user-friendly. It isn't entirely straightforward to do so since
the Storage interface doesn't provide the right accessors for this
purpose. Briefly speaking, we want to make sure that a non-bootstrapped
node can never catch up via the log so that we can implicitly use one
of the "skipped" log entries to represent the configuration change into
the bootstrap configuration. This is an invasive change that affects
all consumers of raft, and it is of lower urgency since the code (post
this commit) already encapsulates the complexity sufficiently.
2019-07-19 10:02:02 +02:00
c62b7048b5 raft: use RawNode for node's event loop
It has always bugged me that any new feature essentially needed to be
tested twice due to the two ways in which apps can use raft (`*node` and
`*RawNode`). Due to upcoming testing work for joint consensus, now is a
good time to rectify this somewhat.

This commit removes most logic from `(*node).run` and uses `*RawNode`
internally. This simplifies the logic and also lead (via debugging) to
some insight on how the semantics of the approaches differ, which is now
documented in the comments.
2019-07-19 09:59:59 +02:00
5f21b557e5 integration: fix a data race about i and tt in TestV3WatchFromCurrentRevision
don't references to loop variables from within go anonymous function

Fixes etcd-io#10899
2019-07-19 00:48:49 -07:00
233be58056 Merge pull request #10839 from needkane/pr
raft: update log info and annotation
2019-07-18 23:26:44 -07:00
62f4fb3c5e Merge pull request #10903 from tbg/inflights
raft: return non-nil Inflights in raft status
2019-07-18 17:50:28 +02:00
6b0322549f raft: replace StatusWithoutProgress with BasicStatus
Now that a Config is also added to the full status, the old name
did not convey the intention, which was to get a Status without
an associated allocation.
2019-07-18 16:28:37 +02:00
f498392ca7 Merge pull request #10898 from tbg/dep
scripts: fail explicitly in updatedep.sh when gopath.proto exists
2019-07-17 16:47:51 -07:00
7d2e57216a Merge pull request #10900 from yuzeming/master
integration: add WaitGroup to TestV3WatchCurrentPutOverlap
2019-07-17 16:46:25 -07:00
yzm
3737979532 move wg.Wait() after loop 2019-07-17 16:31:48 -07:00
7ce934cbec raft: return active config in Status
This is useful for debug purposes, and more so once we support joint
quorums.
2019-07-17 14:29:45 +02:00
26a1e60eab raft: return non-nil Inflights in raft status
Recent refactoring to the String() method of `Progress` hit an NPE
because we return nil Inflights as part of the Raft status. Just
fix this at the source and properly populate the Raft status instead
of teaching String() to ignore nil. A real Progress always has a
non-nil Inflights.
2019-07-17 12:53:28 +02:00
yzm
d87bd2c87c integration: add WaitGroup to prevent calling t.Fatalf after TestV3WatchCurrentPutOverlap function return
It could cause a panic when it happens

Fixes #10886
2019-07-16 10:25:35 -07:00
9fba06ba3b Merge pull request #10889 from tbg/joint-conf-change-logic
raft: internally support joint consensus
2019-07-16 16:02:16 +02:00
aa158f36b9 raft: internally support joint consensus
This commit introduces machinery to safely apply joint consensus
configuration changes to Raft.

The main contribution is the new package, `confchange`, which offers
the primitives `Simple`, `EnterJoint`, and `LeaveJoint`.

The first two take a list of configuration changes. `Simple` only
declares success if these configuration changes (applied atomically)
change the set of voters by at most one (i.e. it's fine to add or
remove any number of learners, but change only one voter). `EnterJoint`
makes the configuration joint and then applies the changes to it, in
preparation of the caller returning later and transitioning out of the
joint config into the final desired configuration via `LeaveJoint()`.

This commit streamlines the conversion between voters and learners, which
is now generally allowed whenever the above conditions are upheld (i.e.
it's not possible to demote a voter and add a new voter in the context
of a Simple configuration change, but it is possible via EnterJoint).
Previously, we had the artificial restriction that a voter could not be
demoted to a learner, but had to be removed first.
Even though demoting a learner is generally less useful than promoting
a learner (the latter is used to catch up future voters), demotions
could see use in improved handling of temporary node unavailability,
where it is desired to remove voting power from a down node, but to
preserve its data should it return.

An additional change that was made in this commit is to prevent the use
of empty commit quorums, which was previously possible but for no good
reason; this:

Closes #10884.

The work left to do in a future PR is to actually expose joint
configurations to the applications using Raft. This will entail mostly
API design and the addition of suitable testing, which to be carried
out ergonomically is likely to motivate a larger refactor.

Touches #7625.
2019-07-16 15:36:04 +02:00
14625b847c scripts: have genproto.sh clean up after itself
We don't want it to leave gopath.proto around for reasons detailed in
the previous commit (messing up vgo).
2019-07-16 14:01:04 +02:00
f63984bb33 scripts: fail explicitly in updatedep.sh when gopath.proto exists
I had been dealing with these intermittent failures for a while and
finally figured out why. The real solution is making genproto.sh less
ugly but that won't happen for a while.
2019-07-16 13:54:09 +02:00
5a734e79f5 Merge pull request #10891 from changkun/raft
raft/rafttest: simulate async send in node test
2019-07-15 11:49:06 -07:00
856097181b raft/rafttest: simulate async send in node test
In order to cover message can well be received when a node is paused, this commit sends message async using goroutine and random sleep. This change makes recvms is possible to cache message during node.pause is triggered.
2019-07-13 16:22:33 +02:00
e56e8471ec Merge pull request #10888 from tbg/test-ski
test: allow failures in linux-amd64-integration-4-cpu
2019-07-11 09:24:06 -07:00
b7327b1cd8 test: allow failures in linux-amd64-integration-4-cpu
This run *should* certainly pass, but it's consistently the one that
fails with a regularity that essentially blocks the CI pipeline.

Someone needs to take a look at #10700, but in the meantime, the show
must go on.
2019-07-11 16:40:54 +02:00
b2274efee0 Merge pull request #10864 from tbg/learner-snap
raft: allow voter to become learner through snapshot
2019-07-11 15:48:09 +02:00
95f3138b5f tests: Use more deterministic error message in TestEtcdPeerNameAuth 2019-07-10 14:24:20 +09:00
c6686734b1 tests: Use 'localhost' to match SAN of integration/fixtures/server.crt 2019-07-10 13:33:14 +09:00
91472797ff pkg: Remove stray printfs 2019-07-10 13:33:14 +09:00
5824421f8b etcdman, pkg: Rename new flags to 'hostname' 2019-07-10 09:30:02 +09:00
9a53601a18 etcdmain, pkg: Support peer and client TLS auth based on SAN fields.
Etcd currently supports validating peers based on their TLS certificate's
CN field. The current best practice for creation and validation of TLS
certs is to use the Subject Alternative Name (SAN) fields instead, so that
a certificate might be issued with a unique CN and its logical
identities in the SANs.

This commit extends the peer validation logic to use Go's
`(*"crypto/x509".Certificate).ValidateHostname` function for name
validation, which allows SANs to be used for peer access control.

In addition, it allows name validation to be enabled on clients as well.
This is used when running Etcd behind an authenticating proxy, or as
an internal component in a larger system (like a Kubernetes master).
2019-07-10 09:30:02 +09:00
eb7dd97135 Merge pull request #10882 from tbg/pr-string
raft: optimize string representation of Progress
2019-07-09 16:27:35 +02:00
95024fa3cc raft: optimize string representation of Progress
Make it less verbose by omitting the values for the steady state.
Also rearrange the order so that information that is typically more
relevant is printed first.
2019-07-09 11:22:37 +02:00
0af16979f8 Merge pull request #10879 from lzhfromustc/master
etcdserver: modify a read operation to avoid potential race
2019-07-08 14:48:43 -07:00
d35f6647bc Use newbe instead of s.be to avoid potential race
`s.cluster.SetBackend(s.be)` is not in critical section. Using `newbe` instead of `s.be` can avoid potential data race.
2019-07-08 14:24:52 -07:00
6f009d211f raft: allow voter to become learner through snapshot
At the time of writing, we don't allow configuration changes to change
voters to learners directly, but note that a snapshot may compress
multiple changes to the configuration into one: the voter could have
been removed, then readded as a learner and the snapshot reflects both
changes. In that case, a voter receives a snapshot telling it that it is
now a learner. In fact, the node has to accept that snapshot, or it is
permanently cut off from the Raft log.

I think this just wasn't realized in the original work, but this is just
my guess since there generally is very little rationale on the various
decisions made. I also generally haven't been able to figure out whether
the decision to prevent voters from becoming learners without first
having been removed was motivated by some particular concern, or if it
just wasn't deemed necessary. I suspect it is the latter because
demoting a voter seems perfectly safe.

See https://github.com/etcd-io/etcd/pull/8751#issuecomment-342028091.
2019-07-08 09:32:24 +02:00
48f5bb6d28 Merge pull request #10865 from tbg/multi-conf-change
raft: centralize configuration change application
2019-07-03 21:57:57 +02:00
6697adfff8 raft/tracker: pull Voters and Learners into Config struct
This is helpful to quickly print the configuration log messages without
having to specify Voters and Learners separately.

It will also come in handy for joint quorums because it allows holding
on to voters and learners as a unit, which is useful for unit testing.
2019-07-03 21:26:42 +02:00
b171e1c78b raft: centralize configuration change application
Put all the logic related to applying a configuration change in one
place in preparation for adding joint consensus.

This inspired various TODOs.

I had to rewrite TestSnapshotSucceedViaAppResp since it was relying
on a snapshot applied to the leader, which is now prevented.
2019-07-03 21:26:42 +02:00
4f7d83a249 raft: update log info and annotation 2019-07-02 23:43:56 -04:00
1f40b6642f Merge pull request #10850 from Koprvhdix/role-remove-document-fix
Documentation: change `etcdctl role remove` to `etcdctl role delete`
2019-07-01 15:34:55 -07:00
d506962fec Merge pull request #10848 from spzala/raftthesis10831
raftdoc: fix raft thesis link
2019-06-28 12:43:32 -07:00
ecba4492f2 Merge pull request #10866 from lzhfromustc/master
clientv3: Fixed a missing block bug
2019-06-28 11:54:50 -07:00
8194aa3f03 Fixed a missing block bug
Description: w.mu is locked at line 385 and unlocked at line 396. Among 5 return statements in this function, 4 are below line 396 but there is 1 return at line 387. 
Fix: Add w.mu.Unlock() before that return at line 387.
2019-06-28 11:27:13 -07:00
c34de2aef4 Documentation: change etcdctl role remove to etcdctl role delete
This is a document error. With running `etcdctl role --help`, we can find that it should be delete, not remove.

Fixes #10849
2019-06-26 09:03:08 +08:00
655ab0ac6a raftdoc: fix raft thesis link
The current link does not work and not valid anymore per stanford support.
Replace all current refs with a link that is used by the
https://raft.github.io/

Fixes # https://github.com/etcd-io/etcd/issues/10831
2019-06-24 19:01:00 -04:00
948e276ca7 Merge pull request #10807 from tbg/extract-prs
raft: extract 'tracker' package
2019-06-21 22:50:06 +02:00
f9c2d00fb3 raft: extract 'tracker' package
Mechanically extract `progressTracker`, `Progress`, and `inflights`
to their own package named `tracker`. Add lots of comments in the
progress, and take the opportunity to rename and clarify various
fields.
2019-06-21 22:15:00 +02:00
6953ccc135 Merge pull request #10837 from tbg/ci-20m
test: s/20m/30m/g
2019-06-21 08:54:26 +08:00
362dfb4d08 test: s/20m/30m/g
Every other test build times out due to the 20 minute test timeout. I
doesn't seem like tests are actually hanging, it's more that 20 minutes
just isn't enough to run the tests any more.
2019-06-20 23:44:25 +02:00
5d30dccdaa Merge pull request #10838 from tbg/unbreak
quorum: fix vet failure
2019-06-20 23:44:09 +02:00
e262542d6d quorum: fix vet failure
This slipped in during a rename and I didn't see it in CI because of
CI flakiness and a general intransparency about which failures are
important.
2019-06-20 23:40:08 +02:00
755aab6990 Merge pull request #10779 from tbg/jointq-pr
raft: use half-populated joint quorum
2019-06-20 22:57:46 +02:00
e039629907 raft: use half-populated joint quorum
To ease a future transition into joint quorums, this commit removes the
previous "ad-hoc" majority-based quorum and vote computations with that
introduced in the `raft/quorum` package.

More specifically, the progressTracker now uses a quorum.JointConfig for
which the "second" majority quorum is always empty; in this case the
quorum behaves like the one quorum.MajorityConfig that is actually
present. Or, more briefly, this change is a no-op, but it will take the
busywork out of actually starting to make use of joint quorums in the
future.

On a side node, I suspect that this might've fixed a bug regarding the
read index though I haven't been able to explicitly come up with a
counter-example. The problem was that the acks collected for the read
index weren't taking into account membership changes, so they'd run the
danger of using acks from nodes since removed to claim that a quorum of
acks had been received. There's a chance that there isn't a
counter-example (the only guarantee extracted from the "quorum" is that
there isn't another leader, but even if there's another leader all that
matters is that that leader doesn't have a divergent history from the
stale leader in the hypothetical counter-example), but either way there
is morally a bug here that is now fixed because VoteCommitted doesn't
care about votes from members that are not voters known to the currently
active configuration.
2019-06-19 14:19:35 +02:00
0384c587eb raft: rename makeP{RS,rogressTracker} 2019-06-19 14:19:35 +02:00
3def2364e4 raft: use membership sets in progress tracking
Instead of having disjoint mappings of ID to *Progress for voters and
learners, use a map[id]struct{} for each and share a map of *Progress
among them.

This is easier to handle when joint quorums are introduced, at which
point a node may be a voting member of two quorums.
2019-06-19 14:19:35 +02:00
76c8ca5a55 quorum: introduce library for majority and joint quorums
The quorum package contains logic to reason about committed indexes as
well as vote outcomes for both majority and joint quorums. The package
is oblivious to the existence of learner replicas.

The plan is to hook this up to etcd/raft in subsequent commits.
2019-06-19 14:19:35 +02:00
9ff7628577 Merge pull request #10801 from roe85/master
DOC: Fix referencing the wrong version number.
2019-06-18 13:24:28 -07:00
53891cbf97 Merge pull request #10822 from tbg/airplane/learner-no-campaign
raft: prevent learners from becoming leader
2019-06-17 10:58:24 -07:00
e5876c6ce2 Merge pull request #10826 from yznima/pr-race
Raft HTTP: fix pause/resume race condition
2019-06-17 09:58:18 -07:00
b1812a410f Raft HTTP: fix pause/resume race condition 2019-06-17 11:45:25 -04:00
c844526002 raft: prevent learners from becoming leader
We were already taking some precautions against learners campaigning,
but there was no safeguard against an explicit call to `Campaign()`.
The newly added test also verifies that leadership transfers to
learners are ignored.
2019-06-17 09:20:45 +02:00
c27e1108f4 Doc: Fix referencing the wrong version number. 2019-06-14 14:35:34 +02:00
2c5162af5c Merge pull request #10523 from jingyih/fully_concurrent_reads
mvcc: fully concurrent read
2019-06-14 12:25:17 +08:00
55066ebdc0 mvcc: address comments 2019-06-13 18:05:50 -07:00
0de9b8abf5 Merge pull request #10815 from jingyih/fix_RLock
mvcc/backend: use RLock in test
2019-06-11 23:10:12 -07:00
2a9320e944 mvcc: add TestConcurrentReadTxAndWrite
Add TestConcurrentReadTxAndWrite which creates random reads and writes,
and ensures reads always see latest writes.
2019-06-11 17:05:41 -07:00
b873fbd127 mvcc/backend: correct RLock in test
Should use RLock instead of Lock.
2019-06-11 16:24:44 -07:00
693afd8e5e mvcc/backend: add unit test for ConcurrentReadTx
Add TestConcurrentReadTx to ensure concurrentReadTx can see all the
prior writes which are stored on the read buffer.
2019-06-10 18:30:21 -07:00
ad80752715 mvcc: add metrics dbOpenReadTxn
Expose the number of currently open read transactions in backend to
metrics endpoint.
2019-06-10 17:20:04 -07:00
d6280f9ea5 Merge pull request #10809 from jingyih/update_changelog
CHANGELOG-3.4: update from #10808
2019-06-09 20:59:43 -07:00
6906d07d1f CHANGELOG: update from PR10808 2019-06-09 20:16:13 -07:00
833620b864 Merge pull request #10808 from WIZARD-CXY/updateboltdb2
vendor: update boltdb
2019-06-09 20:08:56 -07:00
a2a077790b vendor: update boltdb 2019-06-10 10:25:59 +08:00
48d144a3de Merge pull request #10731 from WIZARD-CXY/learner_metric
etcdserver: add learner metrics
2019-06-08 22:43:03 -07:00
ea70731f53 Merge pull request #10762 from FrozenAndrey/fix#10747
etcdmain: fix ignoring of ETCD_CONFIG_FILE env variable
2019-06-07 21:32:35 -07:00
f6a9ebe579 Merge pull request #10710 from j2gg0s/refactor-lease-bench-test
lease: refactor lease's benchmark.
2019-06-07 21:32:04 -07:00
372086cca1 Merge pull request #10803 from tbg/rna
raft: make relationship between node and RawNode explicit
2019-06-07 15:43:28 -07:00
b5c6904cea Merge pull request #10802 from jpbetz/jingyih-maintainer
Add @jingyih to MAINTAINERS
2019-06-07 14:59:39 -07:00
54dcb9cf34 Add @jingyih to MAINTAINERS 2019-06-07 14:58:50 -07:00
ea0be95387 Merge pull request #10792 from spzala/triagereflink
README: provide ref for issue and PR managenet doc
2019-06-07 14:09:47 -07:00
cbb7730c26 raft: make relationship between node and RawNode explicit
This will keep them from diverging to much. In fact we should remove
some of the obvious differences that have crept in over time so that
what remains is structural. This isn't done in this commit since
it amounts to a change in the public API; we should lump this in
when we break the public API the next time.
2019-06-07 23:07:42 +02:00
805b918715 Merge pull request #10800 from mitake/curl-test-nopassword
tests/e2e: initialize UserAddOptions{} field in testV3CurlAuth()
2019-06-06 17:39:50 -07:00
7bbc536e1c tests/e2e: initialize UserAddOptions{} field in testV3CurlAuth() 2019-06-06 23:07:41 +09:00
30ca4ae1e2 Merge pull request #10798 from WIZARD-CXY/learnerCLOG
CHANGELOG: add learner metrics
2019-06-06 01:49:49 -07:00
9a73013004 Merge pull request #10797 from jingyih/lease_checkpoint_enabled_by_experimental_flag
*: enable lease checkpoint via experimental flag
2019-06-05 22:56:54 -07:00
489675644a CHANGELOG: add learner metrics 2019-06-06 10:29:57 +08:00
5af3723e28 CHANGELOG: update changelog-3.4 2019-06-05 15:45:58 -07:00
e67b9829b6 *: enable lease checkpoint via experimental flag
Primary lessor persist lease remainingTTL only if experimental flag
"--experimental-enable-lease-checkpoint" is set.
2019-06-05 15:30:03 -07:00
aa016eebf8 Merge pull request #10631 from spzala/goruntimeupdate112
*: test update for Go 1.12.5 and related changes
2019-06-05 15:28:37 -07:00
1caaa9ed4a test: test update for Go 1.12.5 and related changes
Update to Go 1.12.5 testing. Remove deprecated unused and gosimple
pacakges, and mask staticcheck 1006. Also, fix unconvert errors related
to unnecessary type conversions and following staticcheck errors:
- remove redundant return statements
- use for range instead of for select
- use time.Since instead of time.Now().Sub
- omit comparison to bool constant
- replace T.Fatal and T.Fatalf in tests with T.Error and T.Fatalf respectively because the goroutine calls T.Fatal must be called in the same goroutine as the test
- fix error strings that should not be capitalized
- use sort.Strings(...) instead of sort.Sort(sort.StringSlice(...))
- use he status code of Canceled instead of grpc.ErrClientConnClosing which is deprecated
- use use status.Errorf instead of grpc.Errorf which is deprecated

Related #10528 #10438
2019-06-05 17:02:05 -04:00
25412f9690 Merge pull request #10789 from jingyih/update_changelog_from_9540
CHANGELOG: update from cherry picks of #9540
2019-06-05 08:18:29 -07:00
336c01b8d4 README: provide ref for issue and PR managenet doc
related: https://github.com/etcd-io/etcd/pull/10668

Provide a ref to the issue and PR management doc from the README,
similar to other references we have provided in the README.
2019-06-05 10:44:04 -04:00
ea45cd61d0 Merge pull request #10788 from jingyih/add_missing_newline_EndpointHealth
ctlv3: add newline in EndpointHealth output
2019-06-05 02:19:54 -07:00
0b8727b3f3 etcdserver: add learner metrics 2019-06-05 10:51:21 +08:00
6a7ee7063c lease: refactor benchmark. 2019-06-05 10:01:01 +08:00
ceb963e008 CHANGELOG: update from cherry pick of #9540 2019-06-04 16:47:19 -07:00
17e10fe13f ctlv3: add missing newline in EndpointHealth
To make the output consistent with the output before #9540.
2019-06-04 15:52:29 -07:00
5042c2751b Merge pull request #10786 from rohitsardesai83/update_changelog3.3
CHANGELOG-3.3: add sigs.k8s.io yaml dependency
2019-06-04 09:52:53 -07:00
ee2b976254 CHANGELOG-3.3: add sigs.k8s.io yaml dependency 2019-06-04 14:13:57 +05:30
35a67024f6 Merge pull request #10781 from gyuho/vv
module: require 1.12, remove "v3" import paths
2019-06-03 11:18:07 -07:00
b40597ce46 module: require 1.12, remove "v3" import paths
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-06-03 11:15:19 -07:00
ea0f919cdc Merge pull request #10775 from jingyih/integration_member_promote_failed_cases
clientv3/integration: add member promote failure test cases
2019-06-02 20:39:00 -07:00
cdca488d8b Merge pull request #9817 from mitake/no-password
*: support creating a user without password
2019-05-31 22:41:29 -07:00
068864574a Merge pull request #10734 from spzala/securityissues
README: update handling of security vulnerabilities
2019-05-31 17:12:34 -07:00
d8e2e47de5 Merge pull request #10693 from nolouch/fix-lease
lease/lessor: recheck if exprired lease is revoked
2019-05-31 12:13:36 -07:00
ec17872c98 integration: add member promote failure test cases
Add TestMemberPromoteMemberNotLearner and
TestMemberPromoteMemberNotExist.
2019-05-31 10:52:54 -07:00
2f7121b420 Merge pull request #10772 from jingyih/revert_10526
mvcc: revert change made by #10526 and #10699
2019-05-30 15:18:31 -07:00
96a7ff0a62 CHANGELOG: describe about user with no password 2019-05-30 22:03:00 +09:00
54b09d4f87 auth: add a unit test for creating a user with no password 2019-05-30 21:59:30 +09:00
8257dfdb51 e2e: add test cases for a user without password 2019-05-30 21:59:30 +09:00
5a67dd788d *: support creating a user without password
This commit adds a feature for creating a user without password. The
purpose of the feature is reducing attack surface by configuring bad
passwords (CN based auth will be allowed for the user).

The feature can be used with `--no-password` of `etcdctl user add`
command.

Fix https://github.com/coreos/etcd/issues/9590
2019-05-30 21:59:30 +09:00
4345f74426 mvcc: revert change made by 10526
Revert #10526 and its followup #10699.
2019-05-29 17:41:33 -07:00
dc6885d73f Merge pull request #10771 from jingyih/clarify_ETCDCTL_API
doc: clarify etcdctl default API version
2019-05-29 15:32:45 -07:00
e1120a5e3e Merge pull request #10770 from xiang90/rm_a
*: Move Anthony to Emeritus Maintainers
2019-05-29 14:59:54 -07:00
a7568d63a7 doc: clarify etcdctl default API version 2019-05-29 14:48:46 -07:00
85fab97186 *: Move Anthony to Emeritus Maintainers 2019-05-29 12:45:23 -07:00
db5d1209d9 Merge pull request #10767 from jingyih/update_changelog_learner
CHANGELOG: update changelog-3.4 with learner
2019-05-29 10:45:22 -07:00
dc8a31eaf0 lease/lessor: recheck if exprired lease is revoked
Signed-off-by: nolouch <nolouch@gmail.com>
2019-05-29 14:27:19 +08:00
565e83e997 CHANGELOG: update changelog-3.4 with learner 2019-05-28 21:10:55 -07:00
23a89b0f09 Merge pull request #10755 from wrfly/patch-3
fix #10754
2019-05-28 20:50:52 -07:00
77e1c37787 Merge pull request #10730 from jingyih/learner_part3
*: support raft learner in etcd - part 3
2019-05-28 20:06:13 -07:00
3754767dbc clientv3: return getToken error when retryAuth 2019-05-29 10:15:22 +08:00
23511d21ec *: address comments 2019-05-28 18:50:13 -07:00
6bf609b96d integration: update TestMemberPromote test
Update TestMemberPromote to include both learner not-ready and learner
ready test cases.

Removed unit test TestPromoteMember, it requires underlying raft node to
be started and running. The member promote is covered by the integration
test.
2019-05-28 18:50:13 -07:00
3f94385fc6 etcdserver: update raftStatus 2019-05-28 18:50:13 -07:00
e994a4df01 etcdserver: check http StatusCode before unmarshal
Check http StatusCode. Only Unmarshal body if StatusCode is statusOK.
2019-05-28 18:50:13 -07:00
f8ad8ae4ad etcdserver: use etcdserver ErrLearnerNotReady
If learner is not ready to be promoted, use etcdserver.ErrLearnerNotReady
instead of using membership.ErrLearnerNotReady.
2019-05-28 18:50:13 -07:00
f5eaaaf440 etcdserver: forward member promote to leader 2019-05-28 18:50:10 -07:00
dfe296ac3c etcdserver: add mayPromote check 2019-05-28 18:47:03 -07:00
cca8b0d44f Doc: add learner in runtime-configuration.md 2019-05-28 18:47:03 -07:00
7a4d233bab clientv3/integration: better way to deflake test
Use ReadyNotify instead of time.Sleep to wait for server ready.
2019-05-28 18:47:03 -07:00
aa4cda2f5c etcdserver: allow 1 learner in cluster
Hard-coded the maximum number of learners to 1.
2019-05-28 18:47:03 -07:00
c438f6db27 etcdserver: check IsMemberExist before IsLearner
If member does not exist in cluster, IsLearner will panic.
2019-05-28 18:47:03 -07:00
d0c1b3fa38 etcdserver: learner return Unavailable for unsupported RPC
Make learner return code.Unavailable when the request is not supported
by learner. Client balancer will retry a different endpoint.
2019-05-28 18:47:03 -07:00
76a63f9f7d etcdserver: adjust StrictReconfigCheck
Adjust StrictReconfigCheck logic to accommodate learner members in the
cluster.
2019-05-28 18:47:03 -07:00
bdcecd1fc4 Merge pull request #10764 from jingyih/clarify_config_file_setting
*: more clarification on when server config file is provided
2019-05-28 16:23:19 -07:00
9c5426830b Merge pull request #10766 from gyuho/fix
*: reverting versioned import paths, use Go 1.12 for tests
2019-05-28 16:21:48 -07:00
9ecbf5d2d1 main_test: skip test when invoked via go test
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 16:19:11 -07:00
8ff5914404 tests: update semaphore upgrade tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:35 -07:00
34bd797e67 *: revert module import paths
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:35 -07:00
c8ffa36d9e scripts/genproto: bump up protoc 3.7.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:35 -07:00
05378f0d5d Makefile: upgrade default Go version
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:35 -07:00
75e440b105 build: fix import path
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:35 -07:00
986f16e032 CHANGELOG: remove import path change
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:35 -07:00
e3cdd3ae9c travis: fix GO111MODULE, import paths
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:30 -07:00
5e9c424f1f *: more clarification on server config file
Be more explicit in document and command line usage message that if a
config file is provided, other command line flags and environment
variables will be ignored.
2019-05-27 22:54:14 -07:00
a1ef44358d Merge pull request #10760 from jingyih/remove_you_word_from_doc
Doc: remove word you
2019-05-27 21:43:36 -07:00
11d3f74c33 Merge pull request #10744 from philips/drop-readthedocs
*: move to etcd.io for docs
2019-05-27 03:02:01 -07:00
e33c98f5ed Merge pull request #10763 from Inconnu08/master
clientv3: fix typo in test filename
2019-05-27 03:00:29 -07:00
2eaed14def clientv3: fix typo in test filename 2019-05-27 13:53:58 +06:00
14c5eaa7e1 etcdmain: improve readability
Improve readability of ETCD_CONFIG_FILE env variable parsing part
by adding comments and using flags.FlagToEnv function.

Signed-off-by: Andrey Abramov <st5pub@yandex.ru>
2019-05-26 09:54:26 +03:00
6955331901 etcdmain: fix ignoring of ETCD_CONFIG_FILE env variable
Fixes #10747

Signed-off-by: Andrey Abramov <st5pub@yandex.ru>
2019-05-25 23:54:19 +03:00
cf57fc837e Doc: remove word you
markdown_you test fails due to the appearance of word you in documents.
2019-05-24 16:06:10 -07:00
2ff2755528 Merge pull request #10750 from etcd-io/fanminshi-patch-1
readme: add etcd Emeritus section
2019-05-23 12:06:25 -07:00
a1a482a67d readme: add explanation 2019-05-22 20:58:17 -07:00
c38e965a65 Merge pull request #10683 from tbg/prs
raft: extract progress tracking into own component
2019-05-22 16:53:50 +02:00
4a42371447 readme: add etcd Emeritus section 2019-05-21 22:17:56 -07:00
416a5390c4 Merge pull request #10745 from etcd-io/fanminshi-patch-1
MAINTAINERS: remove fanmin shi
2019-05-21 22:12:06 -07:00
5dd45011d6 raft: rename prs to progressTracker 2019-05-21 16:03:36 +02:00
02b0d80234 raft: remove quorum() dependency from readOnly
This now delegates the quorum computation to r.prs, which will allow
it to generalize in a straightforward way when etcd-io/etcd#7625 is
addressed.
2019-05-21 16:03:36 +02:00
57a1b39fcd raft: avoid another call to quorum()
This particular caller just wanted to know whether it was in a single-voter
cluster configuration, which is now a question prs can answer.
2019-05-21 16:02:52 +02:00
bc828e939a raft: pull checkQuorumActive into prs
It's looking at each voter's Progress and needs to know how quorums
work, so this is the ideal new home for it.
2019-05-21 16:02:52 +02:00
a6f222e62d raft: establish an interface around vote counting
This cleans up the mechanical refactor in the last commit and will
help with etcd-io/etcd#7625 as well.
2019-05-21 16:02:52 +02:00
26eaadb1d1 raft: move votes into prs
This is purely mechanical. Cleanup deferred to the next commit.
2019-05-21 16:02:52 +02:00
a11563737c raft: use progress tracker APIs in more places
This doesn't completely eliminate access to prs.nodes, but that's not
really necessary. This commit uses the existing APIs in a few more
places where it's convenient, and also sprinkles some assertions.
2019-05-21 16:02:52 +02:00
ea82b2b758 raft: move more methods onto the progress tracker
Continues what was initiated in the last commit.
2019-05-21 16:02:52 +02:00
dbac67e7a8 raft: extract progress tracking into own component
The Progress maps contain both the active configuration and information
about the replication status. By pulling it into its own component, this
becomes easier to unit test and also clarifies the code, which will see
changes as etcd-io/etcd#7625 is addressed.

More functionality will move into `prs` in self-contained follow-up commits.
2019-05-21 16:02:52 +02:00
0cf6e1bcb8 MAINTAINERS: remove fanmin shi
I no longer have the time to maintain etcd as my career takes me to a different direction; Hence, I think it is appropriate to remove myself from the maintainer responsibility.

Working on etcd project was one of the most challenging and rewarding experiences I ever had. Thanks @xiang90 @gyuho @heyitsanthony @philips
2019-05-20 23:47:35 -07:00
c5e5240004 *: move to etcd.io for docs
Remove all readthedocs references for https://etcd.io to ensure the SEO
goes to the right place.
2019-05-20 14:32:08 -07:00
71881a423f Merge pull request #10724 from majolo/patch-1
Doc: Fix typo in revision.go
2019-05-16 11:06:52 -04:00
9a6f7d4361 README: update handling of security vulnerabilities
We may not want to suggest to contact CoreOS now. We could remove this
section but consiering the nature of the subject, discussion with the project
maintainers probably a good idea if someone doesn't find it comfortable
to report an issue right away.
2019-05-16 10:32:32 -04:00
9ab3572662 Doc: Fix typo in revision.go 2019-05-16 14:29:10 +01:00
d4cdbb1ea0 Merge pull request #10727 from jingyih/learner_part2
*: support raft learner in etcd - part 2
2019-05-15 16:41:08 -07:00
23f1d02391 *: address comments 2019-05-15 15:58:46 -07:00
90d28c0de7 clientv3/integration: deflake TestKVForLearner
Adding delay in the test for the newly started learner member to catch
up applying config change entries in raft log.
2019-05-15 13:58:28 -07:00
b23c8f3e8f clientv3/integration: fix cluster tests
Fixes TestMemberAddForLearner and TestMemberPromoteForLearner.
2019-05-15 13:58:26 -07:00
ac057951cc integration: remove unnecessary type conversion
Fixes go 'unconvert' test.
2019-05-15 13:48:54 -07:00
c836e37a83 etcdserver: remove unnecessary bool comparison
Fixes 'gosimple' test.
2019-05-15 13:48:54 -07:00
c55519b3a5 words: whitelist words to fix goword test. 2019-05-15 13:48:54 -07:00
a039f2efb8 clientv3, etcdctl: MemberPromote for learner 2019-05-15 13:48:52 -07:00
bd7f42855b integration: add TestTransferLeadershipWithLearner
Adding integration test TestTransferLeadershipWithLearner, which ensures
that TransferLeadership does not timeout due to learner is automatically
picked by leader as transferee.
2019-05-15 13:27:42 -07:00
e8dc4c5c25 integration: add TestMoveLeaderToLearnerError
Adding integration test TestMoveLeaderToLearnerError, which ensures that
leader transfer to learner member will fail.
2019-05-15 13:27:42 -07:00
44d935e90a etcdserver: exclude learner from leader transfer
1. Maintenance API MoveLeader() returns ErrBadLeaderTransferee if
transferee does not exist or is raft learner.

2. etcdserver TransferLeadership() only choose voting member as
transferee.
2019-05-15 13:27:42 -07:00
7f9479acc1 clientv3: add member promote 2019-05-15 13:27:42 -07:00
ba9fd620e8 etcdserver: support MemberPromote for learner 2019-05-15 13:27:42 -07:00
57a11eb1e1 integration: add TestKVForLearner
Adding TestKVForLearner. Also adding test utility functions for clientv3
integration tests.
2019-05-15 13:27:38 -07:00
43ed94f769 etcdserver: filter rpc request to learner
Hardcoded allowed rpc for learner node. Added filtering in grpc
interceptor to check if rpc is allowed for learner node.
2019-05-15 13:15:20 -07:00
355d0ab2a6 *: add learner field in endpoint status
Added learner field to endpoint status API.
2019-05-15 13:13:59 -07:00
42acdfcea7 Merge pull request #10668 from spzala/issuetriage
Doc: create issue and PR management guidelines
2019-05-15 10:30:36 -07:00
919b93b742 Merge pull request #10725 from jingyih/learner_part1
*: support raft learner in etcd - part 1
2019-05-14 20:35:48 -07:00
e1acf244c1 etcdserver: Add MemberAddAsLearner
Made changes to api/membership:

- Added MemberAddAsLearner
- Reverted changes to MemberAdd - removed input parameter isLearner
2019-05-14 18:18:10 -07:00
2b76200f70 *: add MemberAddAsLearner to clientv3 Cluster API
Made changes to Clientv3 Cluster API:

- Added MemberAddAsLearner.
- Reverted changes to MemberAdd - removed input parameter isLearner.
2019-05-14 16:56:44 -07:00
1e38de5b9d etcdctl: add learner field in member list output 2019-05-14 13:10:22 -07:00
e4296bbad9 tests/e2e: Add test for learner member add
Added an e2e test to exercise "etcdctl member add --learner".
2019-05-14 13:10:22 -07:00
a67d934410 etcdctl: support MemberAdd for learner
Added support for "etcdctl member add --learner".
2019-05-14 13:10:22 -07:00
fc14608cb7 clientv3: support MemberAdd for learner
Added IsLearner flag to clientv3 MemberAdd API.
2019-05-14 13:10:22 -07:00
604bc04f70 etcdserver: support MemberAdd for learner
Added IsLearner field to etcdserver internal Member type. Routed
learner MemberAdd request from server API to raft. Apply learner
MemberAdd result to server after the request is passed through Raft.
2019-05-14 13:10:22 -07:00
a0d3c4d641 *: fix compilation after API change
Fixed compilation erros after API change for learner.
2019-05-14 13:10:22 -07:00
7dc5451fae *: Change etcdserver API to support raft learner
- Added isLearner flag to MemberAddRequest in Cluster API.
- Added isLearner field to StatusResponse in Maintenance API.
- Added MemberPromote rpc to Cluster API.
2019-05-14 13:09:17 -07:00
a44a281ac3 CHANGELOG: remove mailing list reference
Recommendations for the production were bumped up recently. The related
ML email ref is old one so we should not provide the link.

Fixes #10669
2019-05-11 09:24:41 -04:00
d8c89021d7 Merge pull request #10689 from joshcc3/master
raft: cleanup wal directory if creation fails
2019-05-10 15:09:16 -07:00
a0c889d14b wal: add a test for wal cleanup, improve comments
To add test coverage of wal cleanup.
2019-05-10 22:36:26 +01:00
1411c585be etcdserver: fix typo in log message
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-05-10 09:54:00 -04:00
a73fb85c0c mvcc: fully concurrent read 2019-05-08 19:11:23 -07:00
886d30d223 Documentation: provide better user experience with autorefreshing grafana dashboard 2019-05-08 06:58:28 -04:00
f7f7e9c762 wal: Improve cleanup for robustness and debuggability
Rename wal with '.suffix.<timestamp>' instead of delete it and call cleanup when perr in a 'defer'ed statement.
2019-05-07 21:38:40 +01:00
51035bfd84 wal: cleanup wal directory if creation fails
delete <data-dir>/member/wal if any operation after the rename in
wal.Create fails to avoid reading an inconsistent WAL on restart.

Fixes #10688
2019-05-04 01:58:57 +01:00
39bbc66b46 Doc: create issue and PR management guidelines
I would like to propose a formal guide for issue triage and PR management.
This should help us keep open issues and PRs under a desirable numbers.
For example, keep issues under 100. These guidelines should specially help
manage and close issues and PRs that are inactive in a timely manner.
2019-05-03 17:03:17 -04:00
caee28a88e Merge pull request #10666 from mkumatag/fix_tests
Fix tests for latest golang
2019-05-03 11:18:58 -07:00
4d6ebafa54 Merge pull request #10704 from wilbeibi/master
raft: update raft paper link (previous link deprecated)
2019-05-03 09:34:02 -07:00
d68f60e9a0 raft: update raft paper link (previous link deprecated) 2019-05-03 08:50:16 -07:00
616592d9ba CHANGELOG: update
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-02 10:21:37 -07:00
e9f310af28 Merge pull request #10687 from rohitsardesai83/replace_ghodss_yaml_with_sigsk8sio_yaml
etcd: Replace yaml dependency `github.com/ghodss/yaml` with `sigs.k8s.io/yaml`
2019-05-02 09:47:33 -07:00
42a7ea6d33 etcd: Replace ghodss/yaml with sigs.k8s.io/yaml
To remove the dependency on ghodss/yaml. Replaced this dependency with sigs.k8s.io/yaml.
This wil help to remove the ghodss/yaml dependency from main kubernetes repository.

xref: https://github.com/kubernetes/kubernetes/issues/77024
2019-05-02 12:34:36 +05:30
e899023f3f Merge pull request #10640 from shrajfr12/gomodulecompat
Fix module path to have the major version to comply with go modules specification.
2019-05-01 22:46:03 -07:00
8a86a60fbc Merge pull request #10699 from jingyih/protect_tree_clone_with_write_lock
mvcc: protect tree clone with write lock
2019-05-01 13:25:31 -07:00
88922b0d08 mvcc: protect tree clone with write lock 2019-05-01 12:34:09 -07:00
fc6936863a Merge pull request #10582 from johncming/empty_update
clientv3/naming: ignore empty update.
2019-04-30 14:21:56 -07:00
1bd02b2053 Merge pull request #10595 from johncming/locking
clientv3: modify lock type.
2019-04-30 14:19:02 -07:00
e3f37534e1 Merge pull request #10684 from nvanbenschoten/nvanbenschoten/appendAndCopy
raft: Avoid multiple allocs when merging stable and unstable log
2019-04-30 11:51:32 -07:00
0bc219a91e Merge pull request #10679 from nvanbenschoten/nvanbenschoten/commitAlloc
raft: Avoid allocation when boxing slice in maybeCommit
2019-04-30 10:55:16 -07:00
efcc1088f0 Merge pull request #10680 from nvanbenschoten/nvanbenschoten/appendAlloc
raft: avoid allocation of Raft entry due to logging
2019-04-27 18:14:20 -07:00
41a0d67b30 Documentation: add links to blog post on benchmarking disks with fio
The documentation mentions fio as a tool to benchmark disks to assess
whether they are fast enough for etcd. But doing that is far from trivial,
because fio is very flexible and complex to use, and the user must make sure
that the workload fio generates mirrors the I/O workload of its etcd cluster
closely enough. This commit adds links to a blog post with an example of how
to do that.
2019-04-27 13:13:11 -04:00
9150bf52d6 go modules: Fix module path version to include version number 2019-04-26 15:29:50 -07:00
b5593de806 raft: Avoid multiple allocs when merging stable and unstable log
Appending to an empty slice twice could (and often did) result in
multiple allocations. This was wasteful. We can avoid this by performing
a single allocation with the correct size and copying into it.
2019-04-26 14:57:51 -04:00
24f35a9861 raft: avoid allocation of Raft entry due to logging
`raftpb.Entry.String` takes a pointer receiver, so calling it
on a loop variable was causing the variable to escape. Removing
the `.String()` call was enough to avoid the allocation, but
this also avoids a memory copy and prevents similar bugs.

This was responsible for 11.63% of total allocations in an
experiment with https://github.com/nvanbenschoten/raft-toy.
2019-04-26 14:56:31 -04:00
208b8a349c raft: Avoid allocation when boxing slice in maybeCommit
By boxing a heap-allocated slice header instead of the slice
header on the stack, we can avoid an allocation when passing
through the sort.Interface interface.

This was responsible for 26.61% of total allocations in an
experiment with https://github.com/nvanbenschoten/raft-toy.
2019-04-26 00:10:45 -04:00
cca0d5c1be Merge pull request #10672 from nolouch/fix-probing-log
api/rafthttp: fix the probing status log print
2019-04-24 23:05:23 -07:00
8146e1ebdf CHANGELOG-3.4: add json-iterator/go change
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-04-23 15:00:59 -07:00
1697c06df9 Merge pull request #10667 from dims/replace-ugorji-with-json-iterator
Replace ugorji/codec with json-iterator/go
2019-04-23 14:52:41 -07:00
3655a4b228 vendor: Run scripts/updatedeps.sh to cleanup unused code 2019-04-23 16:54:44 -04:00
daee668b75 client: Switch to case sensitive unmarshalling to be compatible with ugorji
Using lessons learned from k8s changes:
https://github.com/kubernetes/kubernetes/pull/65034

Change-Id: Ia17a8f94ae6ed00c5af2595c2b48d3c9a0344427
2019-04-23 16:54:44 -04:00
290ac75869 *: update bill-of-materials
Change-Id: Ibfa24e28cacd58388f7606a945c8ac35e1c34580
2019-04-23 16:54:44 -04:00
6de29e08aa vendor: Add json-iterator and its dependencies
Change-Id: I1f3fc00f95efadd6da9b4c248156f8460ae0ff97
2019-04-23 16:54:44 -04:00
86e3481ba2 scripts: Remove generated code and script
Change-Id: Iac4601443bcad71920fd96b97bfe21c16116577a
2019-04-23 16:54:44 -04:00
90108a2e61 client: Replace ugorji/codec with json-iterator/go
We need to use the stdlib-compatible one that is case-sensitive, etc

Change-Id: Id0df573a70e09967ac7d8c0a63d99d6a49ce82f1
2019-04-23 16:54:44 -04:00
9dfde8a4fb Merge pull request #10664 from jingyih/update_changelog_3p4_from_pr10646
CHANGELOG: update CHANGELOG-3.4 from #10646
2019-04-23 11:35:58 -07:00
decc0d5f43 api/rafthttp: fix the probing status print
Signed-off-by: nolouch <nolouch@gmail.com>
2019-04-23 19:51:34 +08:00
867b45d865 client: Fix tests for latest golang 2019-04-22 08:14:10 -05:00
216808eab5 Merge pull request #10663 from gyuho/log
etcdserver: improve heartbeat send failures logging
2019-04-22 13:37:40 +08:00
b11223caf5 CHANGELOG: update CHANGELOG-3.4 from #10646 2019-04-19 21:23:49 -07:00
943d6887c4 Merge pull request #10646 from yingnanzhang666/metrics_db_compaction
fix issue that metric db_compaction_total_duration_milliseconds is always 0
2019-04-19 21:10:41 -07:00
877f11bed8 etcdserver: improve heartbeat send failures logging
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-04-19 10:58:17 -07:00
85594ae99c Merge pull request #10661 from jpbetz/backport-changelog
CHANGELOG: update changelogs for backport of PR #10646
2019-04-18 14:10:17 -07:00
93479fdecd CHANGELOG: update changelogs for backport of PR #10646 2019-04-18 09:39:22 -07:00
cd7ffbe227 Merge pull request #10654 from CydeWeys/patch-1
embed: Fix HTTPs -> HTTPS in error message
2019-04-17 15:14:21 -04:00
b3dd3d3856 embed: Fix HTTPs -> HTTPS in error message 2019-04-17 09:38:53 -04:00
c5cb5509ea mvcc: fix db_compaction_total_duration_milliseconds 2019-04-16 10:32:21 +08:00
f29b1ada19 Merge pull request #10635 from jingyih/add_learner_field_to_progress_stringer
raft: add learner field to progress stringer
2019-04-11 19:19:13 -07:00
30034e5ff5 raft: add learner field to progress stringer 2019-04-11 18:15:03 -07:00
c7c6894527 Merge pull request #10551 from johncming/lbname
clientv3/balancer: change balancer name to builder name.
2019-04-10 16:16:51 -07:00
d69090002a Merge pull request #10604 from shreyas-s-rao/fix/open-wal-return-with-logger
wal: include logger in WAL returned by openAtIndex
2019-04-10 16:15:13 -07:00
9d62477c79 CHANGELOG: add "Change gRPC proxy to expose etcd server endpoint /metrics" PR
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-04-10 16:09:32 -04:00
9915d02022 *: Change gRPC proxy to expose etcd server endpoint /metrics
This PR resolves an issue where the `/metrics` endpoints exposed by the proxy were not returning metrics of the etcd members servers but of the proxy itself.

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-04-10 16:09:32 -04:00
f17c038fc4 etcdserver: api/v2v3: add initial tests
Add a bunch of tests on fundamental operation of Create/Set/Delete.

Context: I want to use this package to run the discovery service against
the v3 storage backend. The limited functionality already implemented
should be sufficient to do this.
https://github.com/coreos/discovery.etcd.io/issues/52
2019-04-10 19:17:25 +00:00
cc08c1bd2e clientv3/integration/leasing_test.go: Fix t.Fatalf error message
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-04-10 13:19:50 -04:00
914e5edb00 wal: include logger in WAL returned by openAtIndex
Signed-off-by: Shreyas Rao <shreyas.sriganesh.rao@sap.com>
2019-04-02 13:09:10 +05:30
a621d807f0 documentation: initial metadata additions for website generation (#10596)
Signed-off-by: lucperkins <lucperkins@gmail.com>
2019-04-01 13:57:24 -07:00
9b5c468dc6 clientv3: modify lock type. 2019-03-29 21:02:13 +08:00
be39aa5bb2 Documentation: add Python client for etcd v3 2019-03-29 06:09:38 -04:00
952b9e75c6 Merge pull request #10590 from jingyih/fix_readIndex_for_learner
raft: leader respond to learner read index message
2019-03-28 16:51:35 -07:00
5088d70d69 raft: leader response to learner MsgReadIndex
Leader should check message sender after receiving MsgReadIndex, even
when raft quorum is 1. The message could be sent from learner node, and
leader should respond.
2019-03-28 16:14:32 -07:00
a645e27486 Merge pull request #10591 from purpleidea/bug/fatal-corruption
etcdserver: Use panic instead of fatal on no space left error
2019-03-28 15:19:33 -07:00
d5e94b1c0d Merge pull request #10557 from johncming/add_test
contrib/raftexample: add test for httpKVAPI.
2019-03-28 11:39:45 -07:00
368f70a37c etcdserver: Use panic instead of fatal on no space left error
When using the embed package to embed etcd, sometimes the storage prefix
being used might be full. In this case, this code path triggers, causing
an: `etcdserver: create wal error: no space left on device` error, which
causes a fatal. A fatal differs from a panic in that it also calls
os.Exit(1). In this situation, the calling program that embeds the etcd
server will be abruptly killed, which prevents it from cleaning up
safely, and giving a proper error message. Depending on what the calling
program is, this can cause corruption and data loss.

This patch switches the fatal to a panic. Ideally this would be a
regular error which would get propagated upwards to the StartEtcd
command, but in the meantime at least this can be caught with recover().

This fixes the most common fatal that I've experienced, but there are
surely more that need looking into. If possible, the errors should be
threaded down into the code path so that embedding etcd can be more
robust.

Fixes: https://github.com/etcd-io/etcd/issues/10588
2019-03-27 15:24:33 -04:00
9c2b88d783 Merge pull request #10583 from johncming/correct_err
etcdmain: use same error.
2019-03-26 15:03:38 -04:00
56f1bce161 raft/doc: clarify the case of out of date term
Clarify the doc.

Fixes #10491
2019-03-26 14:00:24 -04:00
11272ed320 etcdmain: use same error. 2019-03-26 11:21:47 +08:00
cb39c97b22 clientv3/naming: ignore empty update. 2019-03-25 10:53:24 +08:00
7a5acb4a43 Merge pull request #10574 from johncming/rename_err
raft: more precise that rename res to err.
2019-03-22 11:35:51 -07:00
92d5d19ce9 raft: more precise that rename res to err. 2019-03-22 10:18:00 +08:00
41f7142ff9 doc: fix document example error 2019-03-21 08:21:03 -04:00
122744c660 Documentation: update force-new-cluster flag usage for v3
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-03-20 18:06:42 -04:00
77d4b742cd Merge pull request #10562 from johncming/naked_return
contrib/raftexample: fix naked return.
2019-03-19 21:43:26 -07:00
affaa36190 bill-of-materials: fix test failure
Re-generated bill-of-materials.json.
2019-03-19 22:07:00 -04:00
4452d4be22 contrib/raftexample: fix naked return. 2019-03-20 09:52:12 +08:00
09d0844379 Merge pull request #10548 from jingyih/add_TestMemberAddWithExistingURLs
clientv3/integration: Add TestMemberAddWithExistingURLs
2019-03-19 15:23:15 -07:00
2d9b32dc3d Merge pull request #10542 from johncming/usecancel
integration: use cancel instead of close.
2019-03-19 14:19:32 -07:00
ec1cbce10e Merge pull request #10541 from johncming/embed-comment
embed: Modify the comments to be more precise.
2019-03-19 14:18:44 -07:00
6df40e1c70 Merge pull request #10550 from johncming/unused
clientv3: clean up unused code.
2019-03-19 14:17:41 -07:00
b1e8218072 Merge pull request #10553 from damnever/damnever-patch-1
doc: fix `member add` usage
2019-03-18 20:47:55 -07:00
51cdbb6d1a contrib/raftexample: add test for struct httpKVAPI. 2019-03-19 11:12:47 +08:00
9bd86a647f clientv3: Add TestMemberAddWithExistingURLs
TestMemberAddWithExistingURLs ensures adding a new member with URLs
already being used in the cluster will not succeed.
2019-03-18 12:48:44 -07:00
1d764511f6 doc: fix member add usage 2019-03-18 14:47:41 +08:00
ddff08ffad clientv3/balancer: change balancer name to builder name. 2019-03-18 11:05:22 +08:00
662fd55084 clientv3: clean up unused code. 2019-03-18 10:34:41 +08:00
97509833e2 integration: use cancel instead of close. 2019-03-14 11:14:43 +08:00
874532c2da embed: Modify the comments to be more precise. 2019-03-14 10:59:52 +08:00
e1ca3b4434 Merge pull request #10531 from JoeWrightss/patch-3
Fix some variable spelling errors
2019-03-12 12:09:09 -07:00
4478993fbc Merge pull request #10516 from shreyas-s-rao/wal-verify-func
wal: add Verify function to perform corruption check on wal contents
2019-03-12 12:06:36 -07:00
bb3eb8fea9 wal: Add test for Verify
Signed-off-by: Shreyas Rao <shreyas.sriganesh.rao@sap.com>
2019-03-12 22:25:25 +05:30
3d6862fe0d wal: add Verify function to perform corruption check on wal contents
Signed-off-by: Shreyas Rao <shreyas.sriganesh.rao@sap.com>
2019-03-12 22:25:25 +05:30
dc50416157 Merge pull request #10534 from johncming/close_reader
close http request body after read it.
2019-03-12 08:25:32 -07:00
e46af034df bugfix: adjust or add close request body.
affected modules:
- lease/leasehttp
- contrib/raftexample
2019-03-12 22:41:55 +08:00
b7ad8c6741 Merge pull request #10535 from johncming/fix_close_pos
etcdserver/api/rafthttp: fix the location of close http body.
2019-03-11 09:17:13 -07:00
bd41f74168 etcdserver/api/rafthttp: fix the location of close http body. 2019-03-11 22:20:38 +08:00
939b4f8599 clientv3/balancer/grpc1.7-health.go: Fix variable spelling error
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-03-10 01:11:02 +08:00
6da17cda18 Merge pull request #10515 from Prototik/fix-sd-notify
etcdmain: fix sd_notify for restricted environments
2019-03-08 15:34:47 -08:00
4dc9d8b058 Merge pull request #10503 from spzala/v3argorder10431
clientV3: fix behavior of WithPrefix and WithFromKey functions
2019-03-08 11:59:17 -08:00
e80d1745be Merge pull request #10420 from spzala/watch10340
clientV3watch: do not return ctx canceled when Close watch
2019-03-08 11:57:44 -08:00
81b71da66d Merge pull request #10527 from etcd-io/gyuho-patch-1
docs: update PyYAML to 4.2b1
2019-03-07 11:17:20 -08:00
5ed26c7c48 docs: update PyYAML to 4.2b1
https://github.com/etcd-io/etcd/network/alert/docs/requirements.txt/pyyaml/open
2019-03-07 11:14:10 -08:00
949bcbddbe Merge pull request #10526 from jingyih/improve_mvcc_index_concurrency
mvcc: release lock early when traversing index
2019-03-06 15:48:48 -08:00
93732df3ef mvcc: release lock early when traversing index
Make a copy-on-write clone of index tree when traversing. So that lock
can be released right after the clone to improve backend concurrency.
2019-03-06 14:26:17 -08:00
4b69cfc56b Merge pull request #10492 from nolouch/fix-lessor
lease: fix deadlock with Renew lease when the checkpointor is set
2019-03-06 10:23:57 -08:00
b08e6db0e8 Merge pull request #10506 from jingyih/improve_etcd_backend_readability
mvcc/backend: rename Lock() to RLock() in ReadTx interface
2019-03-05 14:54:04 -08:00
1c19f126cb mvcc/backend: rename ReadTx Lock() to RLock()
For better code readability, renaming Lock() to RLock() in ReadTx
interface.
2019-03-05 13:53:27 -08:00
fbf732d3dc etcdmain: fix sd_notify for restricted environments
Remove call to dumb IsRunningSystemd() as it doesn't check anything
2019-03-02 23:44:39 +07:00
2c69559819 Merge pull request #10513 from jutley/txn-newline-documentation
etcdctl: Update README to clarify newline syntax in TXN
2019-03-02 15:23:31 +08:00
a532b60c7e etcdctl: Update README to clarify newline syntax in TXN
Add documentation to clarify that when writing TXN commands, multi-line values should be written using "\n" and not a literal newline (as in other commands).

Fixes #10169
2019-03-01 14:04:48 -08:00
b25edb62cc clientV3watch: Watch Close should close successfully
Closing of watch by client will cancel the watch grpc stream and
can produce a context canceled error. However, since client
simply wanted to close the watcher the error can create confusion
that something went wrong instead of a successful close. Ensure
that Close do not return error.

Fixed #10340
2019-02-28 20:43:20 -05:00
a943ad0ee4 client/keys_bench_test.go: Fix some misspells
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-28 14:36:06 -05:00
5694f3e4f5 Merge pull request #10481 from hnlq715/master
client: upgrade github.com/ugorji/go to v1.1.2
2019-02-28 21:15:33 +08:00
8782bbae65 clientV3: fix behavior of WithPrefix and WithFromKey functions
The use of WithPrefix() and WithFromKey() together is not supported. The
client doesn't respect this behavior currently.

Fixes #10431
2019-02-27 09:23:01 -05:00
17de9bd526 Merge pull request #10501 from datuanmac/replacing_http_by_https
doc: Replacing 'HTTP' by 'HTTPS' for securing links
2019-02-27 14:08:24 +08:00
32389b1876 doc: Replacing 'HTTP' by 'HTTPS' for securing links
Currently, when we access the URL https://www.grpc.io, it is redirected to https://www.grpc.io automatically.
So this commit aims to replace **HTTP** to **HTTPs** for security.
2019-02-27 10:15:22 +07:00
918f0414dd Merge pull request #10495 from gyuho/zap-logger
*: define default zap log configuration
2019-02-26 14:18:24 -08:00
a7e3bd06b2 Merge pull request #10496 from trung/raft-example
raftexample: added build instruction  and minor refactoring
2019-02-26 20:45:24 +08:00
6543273666 client: generate new keys and remove yynn2 = 0 2019-02-25 12:07:40 +08:00
1adc288223 vendor: update ugorji to v1.1.2 2019-02-25 12:05:35 +08:00
5effa154b4 auth/simple_token.go: fix plog.Panicf error message
Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>
2019-02-24 19:34:02 -05:00
e20b9d9e16 lease: fix deadlock with renew lease when the checkpointor is set
Signed-off-by: nolouch <nolouch@gmail.com>
2019-02-24 19:53:09 +08:00
5b4ff6c6d5 raftexample: update readme 2019-02-21 16:56:52 -05:00
74d6df2bd6 raftexample: added build instruction and minor refactoring 2019-02-21 15:00:38 -05:00
dca0dec382 integration: use default log configuration
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-21 11:00:01 -08:00
4f46b65748 clientv3: use default info level log configuration
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-21 10:57:38 -08:00
8d1a62e7ef *: use default log configuration for server
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-21 10:57:26 -08:00
52391e3be7 pkg/logutil: define default zap.Config
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-21 10:56:53 -08:00
8c228d692b Merge pull request #10473 from haroldHT/master
benchmark: fix install docs
2019-02-19 13:27:55 -08:00
e77069fd2e Merge pull request #10486 from gyuho/vendor
vendor: fix vendor go mod
2019-02-19 09:29:05 -08:00
a151edc8c2 vendor: fix vendor go mod
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-19 08:48:25 -08:00
bf9444b32d Merge pull request #10378 from johncming/net-partition
clientv3/integration: add timeout case.
2019-02-19 18:36:44 +08:00
a1fb18a9fc benchmark: fix install docs 2019-02-18 18:15:40 +08:00
94b782e7c9 clientv3/integration: add timeout case. 2019-02-18 09:38:49 +08:00
784daa0498 Merge pull request #10476 from gyuho/client-log
clientv3: clarify retry interceptor logging
2019-02-15 10:17:05 -08:00
5877763990 tests/e2e: fix "authLeaseTestLeaseRevoke"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-15 09:39:03 -08:00
6af8ce6c60 clientv3: clarify retry interceptor logging
Log only when it errors + some clarification

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-15 08:48:37 -08:00
4cd0bf8ea8 Merge pull request #10444 from WIZARD-CXY/nnboltdb
*: add flag to let etcd use the new boltdb freelistType feature
2019-02-14 13:16:56 +08:00
e6c6d8492e *: add flag to let etcd use the new boltdb freelistType feature 2019-02-14 11:07:08 +08:00
24fc2a983a Merge pull request #10465 from hpandeycodeit/docfix_10462
Fixed --strict-reconfig-check#10462
2019-02-12 08:16:53 +08:00
6757a568e0 Documentation: Fixed --strict-reconfig-check#10462 2019-02-11 14:39:11 -08:00
3546c4868c Merge pull request #10445 from spzala/fromkey9833
clientv3: fix WithFromKey
2019-02-07 14:50:13 -08:00
deff5588ff Merge pull request #10457 from mkumatag/limit_ciphersuite
pkg/transport: Limit InvalidCipherSuites to TLS12
2019-02-07 13:15:32 -08:00
0f58292ca5 CHANGELOG-3.3: fix typo in Go version
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-07 12:07:59 -08:00
68835bddd0 CHANGELOG-3.3: add v3.3.12
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-02-07 12:07:20 -08:00
30018dbf52 Merge pull request #10456 from jpbetz/changelog-10443
Add #10443 fix to changelogs
2019-02-07 10:41:15 -08:00
715510a5d2 Merge pull request #10446 from povilasv/etcd-mixin-fix-msg
etcd-mixin: Improve etcdHighNumberOfLeaderChanges,etcdHighNumberOfFailedProposals messages
2019-02-07 10:25:03 -08:00
0418488666 Merge pull request #10454 from mkumatag/fix_shadow
Add shadow tool
2019-02-07 10:23:50 -08:00
45d09f0508 pkg/transport: Limit InvalidCipherSuites to TLS12 2019-02-07 08:18:00 -06:00
474cea1cd6 test: Add shadow tool 2019-02-06 23:20:06 -06:00
faa7a49972 Merge pull request #10443 from Quasilyte/quasilyte/fix_args_order
etcdctl: fix strings.HasPrefix args order
2019-02-06 10:54:11 -08:00
eb8e94c4ed etcd-mixin: Improve etcdHighNumberOfLeaderChanges,etcdHighNumberOfFailedProposals message
Currently alert messages state that we detect issue
within the last 1 hour, although we check
for last 15min and wait for 15min for this alert to keep firing.
This fix changes the message to be 30minutes.
2019-02-04 09:28:23 +02:00
313ab0ba47 clientv3: fix WithFromKey
The WithFromKey func should not return error similar to etcdctl usage
of it when an empty key is provided.

Fixed #9833
2019-02-02 19:21:49 -05:00
1fe6f109c8 Merge pull request #10443 from Quasilyte/quasilyte/fix_args_order
etcdctl: fix strings.HasPrefix args order
2019-02-02 13:41:40 -08:00
be40b1d646 Merge pull request #10428 from cfc4n/master
clientv3/integration: Don't retry other endpoints when err == rpctypes.ErrAuthNotEnable
2019-02-02 13:37:56 -08:00
a033686acf clientv3/integration: return err if err == rpctypes.ErrAuthNotEnable 2019-02-02 14:06:54 +08:00
48a2442fd7 etcdctl: fix strings.HasPrefix args order
Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>
2019-02-02 02:39:27 +03:00
3e0f0ba40e Merge pull request #10401 from markmc/doc-drop-etcdctl-v3-flag
Eliminate some ETCDCTL_API=3 usage
2019-02-01 11:25:55 -08:00
6070db22ed Merge pull request #10424 from hexfusion/fx_genproto
*: bump protoc to 3.6.1 and fix genproto.sh
2019-01-30 12:53:39 -05:00
46e23b233c vendor: update boltdb and grpc middleware version 2019-01-30 06:21:57 -05:00
329be66e8b Merge pull request #10343 from mitake/proxy-cn
*: let grpcproxy rise an error when its cert has non empty CN
2019-01-26 01:53:12 +09:00
a1f964afd3 tests: add a new e2e test case for the combination of non empty CN and grpc proxy 2019-01-25 00:43:57 +09:00
b1afe210e4 Documentation: describe the problem of CN based auth + grpcproxy 2019-01-25 00:43:57 +09:00
65887ae1b4 pkg, clientv3, etcdmain: let grpcproxy rise an error when its cert has non empty CN
Fix https://github.com/etcd-io/etcd/issues/9521
2019-01-25 00:43:57 +09:00
fa521f4e00 Merge pull request #10392 from mitake/cn-gateway
*: grpc gateway and CN based auth
2019-01-24 09:08:04 +09:00
de8e29e71c Merge pull request #10423 from markmc/prober-http-status
prober: check response http status code
2019-01-22 11:19:15 -08:00
69e2faec00 tests: update TestV3CurlAuthClientTLSCertAuth for using cert with empty CN 2019-01-23 03:26:34 +09:00
11fb62ecb4 embed: requests for grpc gateway must have empty CN if --client-cert-auth is passed
This commit lets grpc gateway return a correct error to clients.

Even if a client has a cert with non empty CN, current gateway returns
an error like below:
```
$ curl --cacert ./integration/fixtures/ca.crt --cert ./integration/fixtures/server.crt --key ./integration/fixtures/server.key.insecure https://localhost:2379/v3/kv/put -X POST -d '{"key": "fromcurl", "value": "test"}'
{"error":"etcdserver: user name is empty","code":3}
```
This is because etcd ignores CN from gateway connection.

The error will be like this:
```
$ curl --cacert ./integration/fixtures/ca.crt --cert ./integration/fixtures/server.crt --key ./integration/fixtures/server.key.insecure https://localhost:2379/v3/kv/put -X POST -d '{"key": "fromcurl", "value": "test"}'
CommonName of client sending a request against gateway will be ignored and not used as expected
```

The error will be returned if the server is enabling auth and gRPC
gateway.
2019-01-23 03:26:34 +09:00
72dd4a18c5 *: add a new option --enable-grpc-gateway for enabling/disabling grpc gateway 2019-01-23 03:26:34 +09:00
cbdb36295e Documentation: regenerate proto
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-22 16:57:07 +00:00
a011b2c4c4 scripts: disable go mod and bump protoc to 3.6.1
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-22 16:55:01 +00:00
627660e94e prober: check response http status code
Updated vendored probing module to 0.0.2.

Fixes #10404
2019-01-22 16:21:23 +00:00
ea0cf681c7 OWNERS: add hexfusion as approver and remove joelegasse
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-21 10:52:34 -05:00
25068dfc1e Merge pull request #10244 from paskal/master
Sync prometheus alerting rules with prometheus-operator version
2019-01-20 21:07:32 -08:00
2a1f271f91 Merge pull request #10419 from WIZARD-CXY/fixdeadlock
bugfix:dead lock on store.mu when store.Compact in store.Restore happens
2019-01-20 19:29:22 -08:00
6e8913b004 bugfix:dead lock on store.mu when store.Compact in store.Restore happens 2019-01-21 10:46:58 +08:00
69ed707fab CONTRIBUTING: clarify commit message style
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-18 13:00:24 -05:00
fcc29894c2 config: multiple logging fixes
First, don't panic with invalid --log-outputs. For example:

  $> ./bin/etcd --log-outputs foo
  2018-12-20 15:05:47.988652 C | embed: unknown log-output "foo" (only supports "default", "stderr", "stdout")
  panic: unknown log-output "foo" (only supports "default", "stderr", "stdout")

  goroutine 1 [running]:
  go.etcd.io/etcd/vendor/github.com/coreos/pkg/capnslog.(*PackageLogger).Panicf(0xc000294b00, 0x10fe067, 0x30, 0xc0001fa398, 0x4, 0x4)
        go.etcd.io/etcd/vendor/github.com/coreos/pkg/capnslog/pkg_logger.go:75 +0x161
  go.etcd.io/etcd/embed.(*Config).setupLogging(0xc000291400, 0xc0002a85b0, 0x1)
        go.etcd.io/etcd/embed/config_logging.go:120 +0x1939
  ...

Or:

 $> ./bin/etcd --log-outputs foo,default --logger zap
 panic: multi logoutput for "default" is not supported yet

 goroutine 1 [running]:
 go.etcd.io/etcd/embed.(*Config).setupLogging(0xc000314500, 0xc0001b2f70, 0x1)
        go.etcd.io/etcd/embed/config_logging.go:129 +0x2437
 go.etcd.io/etcd/embed.(*Config).Validate(0xc000314500, 0xc000268a98, 0x127e440)
        go.etcd.io/etcd/embed/config.go:543 +0x43

Second, don't exit in embed.setupLogging(). Before:

  $> ./bin/etcd --log-outputs foo,bar
  --logger=capnslog supports only 1 value in '--log-outputs', got ["bar" "foo"]

and after:

  $> ./bin/etcd --log-outputs foo,bar
  2018-12-20 15:10:24.317982 E | etcdmain: error verifying flags, --logger=capnslog supports only 1 value in '--log-outputs', got ["bar" "foo"]. See 'etcd --help'.

Third, remove duplicated unique strings code. UniqueStringsFromFlag()
is already available to return a sorted slice of values, so just use
that.

Lastly, fix a tiny logging typo in config.
2019-01-17 15:09:26 -05:00
cbfe0b4b79 Merge pull request #10409 from nolouch/add-logger
embed: add zap logger builder
2019-01-17 11:21:52 -08:00
a00bff7848 Merge pull request #10402 from markmc/interactive-watch-panic
etcdctl: fix interactive mode panic
2019-01-16 11:40:34 +08:00
ac090fe326 embed: add zap logger builder
Signed-off-by: nolouch <nolouch@gmail.com>
2019-01-15 23:22:04 +08:00
e53324db3b scripts/release: stop using ETCDCTL_API=3
Note: v3 has been the default since 25bc65794.
2019-01-14 14:46:16 +00:00
4d45a9ca43 build: stop using ETCDCTL_API=3
Note: v3 has been the default since 25bc65794.
2019-01-14 14:46:16 +00:00
0427f46f17 doc: don't set ETCDCTL_API=3 in local_cluster guide
These docs are incorrectly saying that v2 is the default.

Note: v3 has been the default since 25bc65794.
2019-01-14 14:46:16 +00:00
034312eac5 doc: fix note that says ETCDCTL_API=2 is the default
Note: v3 has been the default since 25bc65794.
2019-01-14 14:46:03 +00:00
36d7acf330 etcdctl: fix interactive mode panic
Don't panic if command is given in interactive mode, give a nice error
message instead.

Before:

 $ ./bin/etcdctl watch -i
 <hit return>
 panic: runtime error: index out of range

 goroutine 1 [running]:
 etcdctl/ctlv3/command.watchInteractiveFunc(...)
 	etcd/etcdctl/ctlv3/command/watch_command.go:104 ...

After:

 $ ./bin/etcdctl watch -i
 <hit return>
 Invalid command:  (watch and progress supported)
 foo
 Invalid command foo (only support watch)
2019-01-14 13:01:37 +00:00
071a0157e0 etcdctl: fix README to not suggest v2 is default
Note: v3 has been the default since 25bc65794.
2019-01-14 12:38:37 +00:00
39ef3901ef README: stop using ETCDCTL_API=3
Note: v3 has been the default since 25bc65794.
2019-01-14 12:38:37 +00:00
b398947cf9 doc: don't use ETCDCTL_API=3 in dl_build
In the spirit of keeping newb instructions simple, do not specify
ETCDCTL_API=3 since it is redundant.

Note: v3 has been the default since 25bc65794.
2019-01-14 12:38:24 +00:00
1eee465a43 CHANGELOG: revert discovery-srv-name feature from 3.3.11
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-11 13:12:16 -05:00
a26fa9fe1f CHANGELOG: add "disable CommonName authentication for gRPC-gateway" PR
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-11 13:00:14 -05:00
1eec48083b CHANGELOG: bump version and release date
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-11 12:59:19 -05:00
fae6e92407 Merge pull request #10390 from johncming/missing-err
tests/e2e: add missing return error.
2019-01-09 14:41:48 -08:00
2063b358c8 Merge pull request #10218 from mailgun/maxim/develop
Remove infinite loop in doSerialize
2019-01-09 10:38:25 -08:00
fffb982f1a tests/e2e: add missing return error. 2019-01-09 13:47:09 +08:00
1e42503bea Merge pull request #10379 from johncming/app-resp
etcdserver: add a test to verify number of MsgAppResp sent is correct.
2019-01-08 18:55:30 -08:00
e8f46ce341 etcdserver: add a test to verify not to send duplicated append responses 2019-01-09 10:37:43 +08:00
577d7c0df2 e2e: update test to reflect (ST1005) update.
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 21:04:20 -05:00
a82703b69e *: error strings should not end with punctuation or a newline (ST1005)
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 21:04:20 -05:00
6511829d1f Merge pull request #10374 from johncming/deprecated
api/rafthttp: remove deprecated req.Cancel.
2019-01-08 14:33:25 -08:00
1e15c7434e vendor: cleanup and revendor deps
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 16:49:51 -05:00
2001786f02 *: Use -n instead of ! -z. [SC2236]
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 13:45:40 -05:00
442c863413 Merge pull request #10377 from johncming/cancel-pos
api/v2auth: remove defer in loop.
2019-01-08 09:43:06 -08:00
21e0d3e527 Merge pull request #10359 from rkday/install-instructions
docs: install etcdctl with `go get` as well
2019-01-08 09:42:11 -08:00
83c051b701 CHANGELOG: add "disable CommonName authentication for gRPC-gateway" PR
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 12:31:20 -05:00
99704e2a97 e2e: add ClientTLSCertAuth coverage for curl v3 auth tests
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 12:31:20 -05:00
a9a9466fb8 Documentation: document gRPC-gateway CN authentication support
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 12:31:20 -05:00
bf9d0d8291 auth: disable CommonName auth for gRPC-gateway
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 12:31:20 -05:00
9c6b407e7d Documentation: add missing ENV 2019-01-08 11:36:07 -05:00
b04633fd8e Merge pull request #10375 from johncming/redundant-parentheses
etcdserver: remove redundant parentheses.
2019-01-07 18:38:26 -08:00
e96dbfb973 api/v2auth: remove defer in loop. 2019-01-08 08:56:55 +08:00
8945fecf85 Merge pull request #10376 from johncming/snake-case
api/v2store:  use camel case instead of snake case.
2019-01-07 10:19:32 -08:00
5060560f92 api/v2store: use camel case instead of snake case. 2019-01-07 10:35:23 +08:00
802e2aaadd etcdserver: remove redundant parentheses. 2019-01-07 10:27:52 +08:00
4651f49a5c api/rafthttp: remove deprecated req.Cancel. 2019-01-07 10:12:47 +08:00
f0aeb705ce Merge pull request #10371 from johncming/no-assign
etcdserver: add missing lg assignment.
2019-01-04 19:04:57 -08:00
b2e0e760a0 etcdserver: add missing lg assignment. 2019-01-05 09:24:48 +08:00
fde617d2dc Merge pull request #10368 from johncming/blankline
pkg/testutil: add blankline between two functions
2019-01-03 20:58:07 -08:00
8f383852e2 pkg/testutil: add blankline between two functions 2019-01-04 09:44:57 +08:00
daec071813 Merge pull request #10367 from gyuho/remove-backoff-utils
clientv3: remove "JitterUp" imports
2019-01-03 14:44:18 -08:00
61218004c0 clientv3: remove "JitterUp" imports
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-01-03 11:00:45 -08:00
3a5868a7b9 Merge pull request #10363 from johncming/embed
embed: add test cases in TestAutoCompactionModeParse.
2019-01-03 10:49:51 -08:00
e7b5f2de26 embed: add test cases in TestAutoCompactionModeParse. 2019-01-02 17:08:15 +08:00
400b568fd6 Merge pull request #10360 from yaojingguo/ignore-file
gitignore: ignore build result and runtime files of raftexample
2019-01-01 09:31:38 -08:00
e86a12bad0 gitignore: ignore build result and runtime files 2019-01-01 19:14:48 +08:00
959d004bcc docs: install etcdctl with go get as well
The documentation recommends using `etcdctl` to verify that etcd has
been built correctly, but the `go get` command provided does not install
it. This changes it so that it does, and also clarifies terminology
('using `go get`', referring to the command used to install it, rather
than 'using `$GOPATH`').
2018-12-30 12:10:32 +00:00
cc8d446a6e Merge pull request #8334 from lishuai87/shawnsli/reset-leader-when-become-pre-candidate
raft: introduce/fix TestPreVoteWithCheckQuorum
2018-12-28 09:51:06 -08:00
23731bf9ba raft: set lead to none when becomePreCandidate 2018-12-28 19:57:26 +08:00
6937b77232 README: remove "you" in "Community Meeting"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-23 21:53:36 -08:00
deeb16c9e8 Merge pull request #10346 from lsytj0413/fix-golint
refactor(*): fix golint warning
2018-12-23 20:26:09 -08:00
792aad932f refactor(*): fix golint warning 2018-12-24 11:43:10 +08:00
9113019936 Update README.md with community meeting info 2018-12-21 15:53:59 -08:00
f3fbedc88f client: update generated ugorji codec, manual remove "yynn2=0"
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2018-12-21 18:16:55 -05:00
a580ec4547 CHANGELOG: add patch release, gRPC proxy fix
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-17 21:22:37 -08:00
7c7141550d docs/README: highlight new documentation website 2018-12-17 18:15:26 -08:00
1bbe6bd7d3 Merge pull request #10332 from gyuho/update
*: update Go versions
2018-12-17 15:19:49 -08:00
c58f5cfeda test: disable "unparam" for now
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-17 11:30:28 -08:00
0929080834 doc: exclude 404 error because kubelet generating false positive 2018-12-17 11:57:12 +03:00
830d064903 doc: convert etcd to lower-case everywhere 2018-12-17 11:57:12 +03:00
358cc1a8fa doc: sync prometheus rules with prometheus-operator version
(and remove non-etcd specific FdExhaustionClose)
https://github.com/coreos/prometheus-operator/blob/master/helm/exporter-kube-etcd/templates/etcd3.rules.yaml
sync etcd alert rules with libsonnet

Signed-off-by: Dmitry Verkhoturov <paskal.07@gmail.com>
2018-12-17 11:57:12 +03:00
6f0ba5fa06 *: update Go versions
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-16 23:13:11 -08:00
abb57363ba Merge pull request #10327 from namreg/fix-grpcproxy-memory-leak
grpcproxy: fix memory leak
2018-12-16 17:32:27 -08:00
d88f686a91 grpcproxy: fix memory leak
use set instead of slice as interval value

fixes #10326
2018-12-15 17:00:51 +03:00
e57f4f420d docs/requirements: update to urllib3>=1.23 2018-12-12 08:57:45 -08:00
15b6a17be4 Merge pull request #10313 from lsytj0413/fix-getlogger
refactor(*): remove duplicate GetLogger
2018-12-10 06:48:02 -08:00
23862b5d64 refactor(*): remove duplicate GetLogger 2018-12-10 17:58:47 +08:00
1900a8e26f Merge pull request #10308 from tbg/fix/progress-after-snap
raft: enter ProgressStateReplica immediately after snapshot
2018-12-06 10:58:22 -08:00
8a9a2a1a5a Merge pull request #10309 from tbg/fix/raft-status-allocs
raft: add (*RawNode).WithProgress
2018-12-06 10:56:40 -08:00
bd332b318e raft: add (*RawNode).WithProgress
Calls to Status can be frequent and currently incur three heap
allocations, but often the caller has no intention to hold on to the
returned status.

Add StatusWithoutProgress and WithProgress to allow avoiding heap
allocations altogether. StatusWithoutProgress does what's on the
tin and additionally returns a value (instead of a pointer) to
avoid the associated heap allocation. By not returning a Progress
map, it avoids all other allocations that Status incurs.

To still introspect the Progress map, add WithProgress, which
uses a simple visitor pattern.

Add benchmarks to verify that this is indeed allocation free.

```
BenchmarkStatusProgress/members=1/Status-8                  5000000    353 ns/op        784 B/op    3 allocs/op
BenchmarkStatusProgress/members=1/Status-example-8          5000000    372 ns/op        784 B/op    3 allocs/op
BenchmarkStatusProgress/members=1/StatusWithoutProgress-8   100000000  17.6 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=1/WithProgress-8            30000000   48.6 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=1/WithProgress-example-8    30000000   42.9 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=3/Status-8                  5000000    395 ns/op        784 B/op    3 allocs/op
BenchmarkStatusProgress/members=3/Status-example-8          3000000    449 ns/op        784 B/op    3 allocs/op
BenchmarkStatusProgress/members=3/StatusWithoutProgress-8   100000000  18.7 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=3/WithProgress-8            20000000   78.1 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=3/WithProgress-example-8    20000000   70.7 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=5/Status-8                  3000000    470 ns/op        784 B/op    3 allocs/op
BenchmarkStatusProgress/members=5/Status-example-8          3000000    544 ns/op        784 B/op    3 allocs/op
BenchmarkStatusProgress/members=5/StatusWithoutProgress-8   100000000  19.7 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=5/WithProgress-8            20000000   105 ns/op          0 B/op    0 allocs/op
BenchmarkStatusProgress/members=5/WithProgress-example-8    20000000   94.0 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=100/Status-8                100000     11903 ns/op    22663 B/op   12 allocs/op
BenchmarkStatusProgress/members=100/Status-example-8        100000     13330 ns/op    22669 B/op   12 allocs/op
BenchmarkStatusProgress/members=100/StatusWithoutProgress-8 50000000   20.9 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=100/WithProgress-8          1000000    1731 ns/op         0 B/op    0 allocs/op
BenchmarkStatusProgress/members=100/WithProgress-example-8  1000000    1571 ns/op         0 B/op    0 allocs/op
```
2018-12-06 19:02:48 +01:00
bfa2c15c55 Merge pull request #10307 from gyuho/update-image
docs/img: clarify "network partition + membership reconfiguration"
2018-12-06 09:28:36 -08:00
bfaae1ba46 raft: enter ProgressStateReplica immediately after snapshot
When a follower requires a snapshot and the snapshot is sent at the
committed (and last) index in an otherwise idle Raft group, the follower
would previously remain in ProgressStateProbe even though it had been
caught up completely.

In a busy Raft group this wouldn't be an issue since the next round of
MsgApp would update the state, but in an idle group there's nothing
that rectifies the status (since there's nothing to append or update).

The reason this matters is that the state is exposed through
`RaftStatus()`. Concretely, in CockroachDB, we use the Raft status to
make sharding decisions (since it's dangerous to make rapid changes
to a fragile Raft group), and had to work around this problem[1].

[1]: 91b11dae41/pkg/storage/split_delay_helper.go (L138-L141)
2018-12-06 11:09:59 +01:00
0b5a3d8080 docs/img: clarify "network partition + membership reconfiguration"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-05 19:05:26 -08:00
510ae3d2a2 clientv3/concurrency: fix govet errors
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-05 18:28:13 -08:00
12b19ff5ea Merge pull request #10306 from johncming/zap_logger
embed: set log-outputs 'default' to 'stderr' config when zap mode
2018-12-05 18:25:43 -08:00
6744c57de3 embed: set log-outputs 'default' to 'stderr' config when zap mode 2018-12-06 09:33:51 +08:00
1124ccf4f9 Merge pull request #10305 from johncming/doc
Documentation/op-guide: fix typo.
2018-12-05 16:55:15 -08:00
e4ac8db4ae Documentation/op-guide: fix typo. 2018-12-06 08:48:30 +08:00
83696d95c9 Merge pull request #10297 from thrawn01/election-resume
concurrency: set keyPrefix when calling ResumeElection()
2018-12-05 13:25:39 -08:00
7d7266d3c9 concurrency: set keyPrefix when calling ResumeElection()
This change allows users to use Observe() after resuming an election as
Observe() requires a valid keyPrefix to function properly.
2018-12-05 14:50:56 -06:00
3f01426712 Merge pull request #10301 from jingyih/add_timeout_to_etcdctl_snapshot_save
etcdctl: add timeout to snapshot save command
2018-12-04 21:23:15 -08:00
ce3e127d0b Merge pull request #10302 from jingyih/update_changelog_3p4_from_pr10301
CHANGELOG-3.4: update from #10301
2018-12-04 21:19:50 -08:00
9db21117d1 CHANGELOG-3.4: update from #10301 2018-12-04 21:06:46 -08:00
5b6b03d081 etcdctl: add timeout to snapshot save command
snapshot save command by default has no timeout, but respects user
specified command timeout.
2018-12-04 20:36:48 -08:00
7f450bf696 Merge pull request #10296 from gyuho/learner
docs: rename to "learner" (from "non-voting member")
2018-12-03 11:17:26 -08:00
bfd9596352 docs: rename to "learner" (from "non-voting member")
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-03 10:38:40 -08:00
99c933b7bc Merge pull request #10295 from gyuho/non-voting-member-design
docs: add "Non-voting member" design doc
2018-12-03 10:27:24 -08:00
1f559736ef docs: add authors to client architecture doc
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-03 09:12:51 -08:00
285a3acda6 docs: add "Non-voting member" design doc
By gyuho and jpbetz.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-12-03 09:00:08 -08:00
dedae6eb7c CHANGELOG-3.4: add backend batch limit/interval fields
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-11-28 14:03:05 -08:00
829c9b2129 Merge pull request #10283 from xiang90/more_flags
*: add flags to setup backend related config
2018-11-27 18:58:43 -08:00
3faed211e5 *: add flags to setup backend related config 2018-11-26 15:50:26 -08:00
6c649de36e Merge pull request #10281 from tbg/print-hint-reject-app-resp
raft: print RejectHint of zero on MsgAppResp
2018-11-24 11:48:16 +08:00
f28945ba8a Merge pull request #10279 from tbg/leader-progress-replicate
raft: ensure leader is in ProgressStateReplicate
2018-11-24 10:43:01 +08:00
5c209d66d2 raft: ensure leader is in ProgressStateReplicate
The leader perpetually kept itself in ProgressStateProbe even though of
course it has perfect knowledge of its log. This wasn't usually an issue
because it also doesn't care about its own Progress, but it's better to
keep this data correctly maintained, especially since this is part of
raft.Status and thus becomes visible to applications using the Raft
library.

(Concretely, in CockroachDB we use the Progress to inform log
truncations).
2018-11-23 17:57:36 +01:00
1569f4829d raft: print RejectHint of zero on MsgAppResp
A zero RejectHint on MsgAppResp is still used, and so should be
reflected in the message description.
2018-11-23 11:06:38 +01:00
02a9810a9e Merge pull request #10275 from hexfusion/maint
MAINTAINERS: add Sam Batschelet
2018-11-20 17:38:29 -05:00
b4d200174e Merge pull request #10278 from cwdsuzhou/master
Documentation: add ENV variable ETCD_CIPHER_SUITES description
2018-11-20 10:57:51 -05:00
a8293e5815 Documentation: add ENV variable ETCD_CIPHER_SUITES description
Fixes #10277
2018-11-20 22:40:24 +08:00
0e8981d2ff MAINTAINERS: add Sam Batschelet
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2018-11-19 19:43:15 -05:00
bb25891960 Merge pull request #10268 from gyuho/dump-db
tools/etcd-dump-db: add "--timeout" flag
2018-11-15 23:16:24 -08:00
f8a513ce65 tools/etcd-dump-db: add "--timeout" flag
By default, wait up to 10 seconds to obtain flocks.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-11-15 22:50:32 -08:00
af893d3549 CHANGELOG-3.4: add "etcd_cluster_version"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-11-15 08:17:57 -08:00
024f3dfc82 Merge pull request #10257 from gyuho/cluster-version
etcdserver/*: add "etcd_cluster_version" metric
2018-11-15 22:02:57 +08:00
322339bb54 Merge pull request #10262 from johncming/tcpproxy
tcpproxy: add a test.
2018-11-15 22:02:05 +08:00
c2d023ce74 Merge pull request #10263 from johncming/raftstorage
raft:  add a test case in TestStorageAppend
2018-11-15 22:01:06 +08:00
fa92397e18 Merge pull request #10258 from ajwerner/ajwerner/raft_committed_entries_size
raft: separate MaxCommittedSizePerReady config from MaxSizePerMsg
2018-11-15 22:00:00 +08:00
9668536124 raft: add a test case in TestStorageAppend 2018-11-15 16:41:36 +08:00
92cb9c3295 tcpproxy: add a test. 2018-11-15 14:28:30 +08:00
e4af2be5bb raft: separate MaxCommittedSizePerReady config from MaxSizePerMsg
Prior to this change, MaxSizePerMsg was used both to cap the total byte size of
entries in messages as well as the total byte size of entries passed through
CommittedEntries in the Ready struct. This change adds a new Config parameter
MaxCommittedSizePerReady which defaults to MaxSizePerMsg and contols the second
of above descibed settings.
2018-11-14 09:59:09 -05:00
0226481584 tests/e2e: test cluster version
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-11-13 21:49:33 -08:00
291768af0f etcdserver/*: add "etcd_cluster_version" metric
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-11-13 21:49:12 -08:00
ee9dcbca0d Merge pull request #10256 from philippgille/patch-1
Fix spelling in comment
2018-11-12 14:07:05 -08:00
0cfc01b873 clientv3: Fix spelling in comment 2018-11-12 22:33:48 +01:00
91e583cba6 etcdserver: Remove infinite loop in doSerialize
Once chk(ai) fails with auth.ErrAuthOldRevision it will always do,
regardless how many times you retry. So the error is better be returned
to fail the pending request and make the client re-authenticate.
2018-11-12 23:28:24 +03:00
ae25c5e132 Merge pull request #10253 from mas9612/improve-docs
*: fix some typo and repo name
2018-11-11 11:41:57 -08:00
9ee41f699c doc: fix typo in documentation 2018-11-11 23:27:04 +09:00
963d76fc4c CONTRIBUTING: fix repo name 2018-11-11 23:25:53 +09:00
e34423b6ff Merge pull request #10251 from ueokande/fix-github-links
Fix github links
2018-11-09 18:28:42 -08:00
aa4313a55a *: fix github links 2018-11-10 11:14:18 +09:00
9454c4cab0 *: add client support for discovery-srv-name
Add support for --discovery-srv-name flag to etcdctl, gRPC proxy, and etcd gateway.
2018-11-09 16:30:58 -05:00
e0f7807f1b CHANGELOG: highlight improved discovery-srv-name support
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2018-11-09 14:51:39 -05:00
61c8d7a582 tests/docker-dns-srv: add tests for docker-dns-srv-name.
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2018-11-09 13:51:04 -05:00
fa35126ef8 *: add client support for discovery-srv-name
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2018-11-09 10:13:04 -05:00
83304cfc80 tools/etcd-dump-metrics: add missing godoc
https://godoc.org/github.com/etcd-io/etcd/tools/etcd-dump-metrics

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-31 16:12:32 -07:00
c0e04700cf Merge pull request #10230 from manishrjain/master
raft: Explain ReportSnapshot and Propose behavior
2018-11-01 06:48:45 +08:00
8e907e48f9 Merge pull request #10229 from gyuho/fix-flag-tests
pkg/flags: fix "TestSetFlagsFromEnvParsingError"
2018-10-31 15:42:32 -07:00
4aa72ca1d3 raft: Explain ReportSnapshot and Propose behavior
Update godocs for node interface, explaining the behavior of ReportSnapshot and Propose.
2018-10-31 15:37:55 -07:00
7b32c07899 pkg/flags: fix "TestSetFlagsFromEnvParsingError"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-31 10:44:38 -07:00
3f3eae7822 Merge pull request #10227 from imjoey/doc_fix_start_gateway_command
Documentation/op-guide: fix error command to start gateway
2018-10-30 20:02:41 -07:00
bece069329 doc: Fix starting etcd gateway command using DNS. 2018-10-31 10:28:35 +08:00
be7c8fe423 docs: update "requests>=2.20.0"
To resolve the security alert https://nvd.nist.gov/vuln/detail/CVE-2018-18074.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-29 15:00:10 -07:00
252dc1f994 Merge pull request #10220 from ping40/pagewriter
pkg/ioutil: n is equal or greater than cw.writeBytes
2018-10-28 14:33:37 -07:00
5a94f97a4f pkg/ioutil: n is equal or greater than cw.writeBytes 2018-10-27 16:13:33 +08:00
583763261f Merge pull request #10216 from hexfusion/shellcheck
*: resolve shellcheck errors.
2018-10-26 06:15:44 -07:00
78d01140ff *: resolve shellcheck errors. 2018-10-25 19:41:11 -04:00
798955d4d6 Merge pull request #10209 from ping40/d1024
raft: Fix comment on TestLeaderBcastBeat
2018-10-25 14:42:16 -07:00
38da00be33 Merge pull request #10212 from gyuho/raft-typo
raft: fix godoc in tests
2018-10-25 09:51:35 -07:00
b7ed4165ea raft: fix godoc in tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-24 23:23:32 -07:00
965ba5ca8b Merge pull request #10203 from ping40/doc1022_2
raft: fix description in UT
2018-10-24 23:21:02 -07:00
5d8975d7ad Merge pull request #10207 from jpbetz/concurrent-new-client-fix
*: Fix concurrent clientv3 client creation in 3.4 by using proper UUIDs
2018-10-24 16:08:36 -07:00
ea68efb259 CHANGELOG: add "raft unbounded log growth prevention" PR
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-24 15:55:08 -07:00
10255cf196 raft: Fix comment on TestLeaderBcastBeat 2018-10-24 16:56:10 +08:00
86b933311d Merge pull request #10205 from gyuho/testing-prow
OWNERS: experiment
2018-10-22 16:07:27 -07:00
6a43db1eff clientv3: Fix concurrent clientv3 client creation in 3.4 by using proper UUIDs 2018-10-22 14:08:58 -07:00
2338f747bf vendor: Add dependency on google/uuid for safe UUID generation
etcd currently generates a few UUID style identifiers using approaches like `fmt.Sprintf("client-%s", strconv.FormatInt(time.Now().UnixNano(), 36))`.
But these can collide on machine architectures with larger timestamp steps (see https://github.com/etcd-io/etcd/issues/10035).
2018-10-22 14:06:24 -07:00
c561f8310e OWNERS: experiment
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-22 12:49:08 -07:00
b42b39446b Merge pull request #10199 from tschottdorf/fix-max-uncommitted-size
raft: fix bug in unbounded log growth prevention mechanism
2018-10-22 21:30:01 +02:00
a27a73e448 Merge pull request #10193 from johncming/master
mvcc/backend: code format optimization
2018-10-22 11:50:53 -07:00
ad49c8fd98 raft: fix bug in unbounded log growth prevention mechanism
The previous code was using the proto-generated `Size()` method to
track the size of an incoming proposal at the leader. This includes
the Index and Term, which were mutated after the call to `Size()`
when appending to the log. Additionally, it was not taking into
account that an ignored configuration change would ignore the
original proposal and append an empty entry instead.

As a result, a fully committed Raft group could end up with a non-
zero tracked uncommitted Raft log counter that would eventually hit
the ceiling and drop all future proposals indiscriminately. It would
also immediately imply that proposals exceeding the threshold alone
would get refused (as the "first uncommitted proposal" gets special
treatment and is always allowed in).

Track only the size of the payload actually appended to the Raft log
instead.

For context, see:
https://github.com/cockroachdb/cockroach/issues/31618#issuecomment-431374938
2018-10-22 11:28:39 +02:00
de470991e1 raft: fix description in UT 2018-10-22 13:59:50 +08:00
8c80efb886 CHANGELOG: highlight minimum recommended version, change github org URLs
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-18 16:39:01 -07:00
88e0830560 CHANGELOG-3.4: add "raft.Config.MaxUncommittedEntriesSize" change
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-18 11:47:50 -07:00
cf309757d6 mvcc/backend: code format optimization 2018-10-17 14:18:09 +08:00
7a759c18d2 Merge pull request #10178 from johncming/master
bugfix: use the backend create by snapshot instead of origin one in tests
2018-10-15 10:54:34 -07:00
1ded5aaf4d Merge pull request #10150 from cbeneke/fix/mixin-insufficent-member-alert
etcd-mixin: Fix EtcdInsufficientMembers alerting
2018-10-15 10:27:59 -07:00
c75ba98f81 Documentation/etcd-mixin: Fix EtcdInsufficientMembers alerting
Currently the EtcdInsufficientMembers alert fires, when more than (X/2)-1
instances are unavailable. This fixes it to fire at the correct limit of (X-1)/2
unavailable instances and $value now contains the number of available instances
instead of unavailable ones. Added unit test for EtcdInsufficientMembers alert.
2018-10-15 19:23:43 +02:00
bf49b9a145 mvcc/backend: fix to use the backend create by snapshot instead of origin one. 2018-10-15 09:35:20 +08:00
dac8c6fcc0 Merge pull request #10167 from nvanbenschoten/nvanbenschoten/limitUncommitted
raft: provide protection against unbounded Raft log growth
2018-10-13 23:52:28 -07:00
73c20cc1b7 raft: Fix comment on sendHeartbeat 2018-10-14 00:03:43 -04:00
7be7ac5a5d raft: Fix spelling in doc.go 2018-10-13 23:25:05 -04:00
f89b06dc6d raft: provide protection against unbounded Raft log growth
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.

This change add a safeguard to protect against this case where a
leader's log can grow without bound during loss of quorum scenarios.
It does so by introducing a new, optional ``MaxUncommittedEntriesSize
configuration. This config limits the max aggregate size of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.

See cockroachdb/cockroach#27772
2018-10-13 23:25:05 -04:00
3c6c05be8a Merge pull request #10176 from jpbetz/keepalive-docs
clientv3: Clarify lessor KeepAlive docs
2018-10-12 09:38:51 -07:00
e205d09895 Merge pull request #10171 from paulf69487623/master
Documentation: Add the -N option to curl for the watch example to disable buffering
2018-10-12 07:09:14 -04:00
b3faeb5d86 Documentation: Add the -N option to curl for the watch example to disable buffering 2018-10-11 22:13:43 -05:00
1cab49ef78 Merge pull request #9718 from kchristidis/fix-snap-pub-error
raftexample: Fix publish snapshot error message
2018-10-11 16:45:55 -07:00
404f7d820c Merge pull request #10175 from wenjiaswe/fixTestMetricsHealth
integration: fix bug in TestMetricsHealth
2018-10-11 16:07:30 -07:00
49450aaa60 clientv3: Clarify lessor KeepAlive docs 2018-10-11 15:11:28 -07:00
69f53e1406 integration: fix bug in TestMetricsHealth 2018-10-11 14:55:39 -07:00
d5c93a7b0b Merge pull request #10165 from jpbetz/socket-docs
Document unix and unixs URL schemes
2018-10-10 15:21:55 -07:00
ef7e9d385b docs/operate.rst: link latest patch releases
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-10 15:03:51 -07:00
342d53d1b1 docs/metrics: add metrics outputs from patch releases
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-10 15:03:37 -07:00
f0736fe477 CHANGELOG: add Go release versions
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-10 11:07:10 -07:00
5b0960f664 docs/metrics: document missing metrics from master branch
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 18:37:41 -07:00
d4283b895c CHANGELOG-3.3: update release date for tomorrow
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 18:30:50 -07:00
0f0919c19c Merge pull request #10159 from gyuho/version-log
etcdserver: clear message in cluster version decision
2018-10-09 18:10:14 -07:00
3e37052c08 CHANGELOG: updates for v3.4 and patch releases
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 17:40:51 -07:00
1957d1cedf Documentation: Document unix and unixs URL schemes 2018-10-09 14:42:56 -07:00
d2a0f17b82 Merge pull request #10155 from gyuho/metrics-messages
rafthttp: probe all raft transports
2018-10-09 11:18:31 -07:00
ba606bf85e Merge pull request #10156 from gyuho/metrics-health
etcdserver: add "etcd_server_health_success/failures"
2018-10-09 00:10:57 -07:00
ac4754053d Merge pull request #10160 from etcd-io/jpbetz-patch-1
Update patch release list to reflect that 3.1 is maintained
2018-10-08 23:39:35 -07:00
0181609402 Merge pull request #10164 from jingyih/update_CHANGELOG
CHANGELOG: update from #10153
2018-10-08 18:47:54 -07:00
4a8693361a CHANGELOG: update from #10153 2018-10-08 17:15:59 -07:00
90c5968ee1 Merge pull request #10157 from gyuho/go
*: use Go 1.11.1 for testing
2018-10-08 16:35:09 -07:00
a3ae8df912 Merge pull request #10112 from gyuho/vendor
*: use Go 1.11 module for dependency management, replace "dep"
2018-10-08 16:34:51 -07:00
59dd78dde8 etcdserver: clear message in cluster version decision
Only leader can decide cluster version.
Clarify the logging that this local node is the leader.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-08 16:05:10 -07:00
b046a37256 Merge pull request #10153 from funny-falcon/fix-client-mutex-lock-10111
clientv3/concurrency.Mutex.Lock() - preserve invariant
2018-10-08 15:13:52 -07:00
7a0647ceb7 Documentation: Update patch release list to reflect that 3.1 is maintained 2018-10-08 13:33:07 -07:00
7c33e3d77b docs/metrics/latest: sync with master
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 17:52:44 -07:00
d28724a530 travis.yml: update Go version to 1.11.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 17:39:49 -07:00
2a8dc72899 Makefile: update default Go version
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 17:39:19 -07:00
7524cc6f4c integration: add "TestMetricsHealth"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 17:25:14 -07:00
601d8b4677 etcdserver/api/etcdhttp: remove unused "HandleHealth" function
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 17:16:18 -07:00
004e04a1d1 etcdserver/api/etcdhttp: add "etcd_server_health_success/failures"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 17:15:12 -07:00
884a8bd36b etcdserver/api/rafthttp: configure "streamProber" in tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 03:32:05 -07:00
7b1ef37054 etcdserver/api/rafthttp: probe all Raft messages' RTT
This PR adds another probing routine to monitor the connection
for Raft message transports. Previously, we only monitored
snapshot transports.

In our production cluster, we found one TCP connection had >8-sec
latencies to a remote peer, but "etcd_network_peer_round_trip_time_seconds"
metrics shows <1-sec latency distribution, which means etcd server
was not sampling enough while such latency spikes happen
outside of snapshot pipeline connection.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 03:28:54 -07:00
4a239070c8 etcdserver/api/rafthttp: display roundtripper name in warnings
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 03:14:42 -07:00
47cff4dfe5 etcdserver/api/rafthttp: rename to "pipelineProber"
Preliminary work to add prober to "streamRt"

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 03:13:10 -07:00
64e8b2e905 clientv3: concurrency.Mutex.Lock() - preserve invariant
Convenient invariant:
- if werr == nil then lock is supposed to be locked at the moment.

While we could not be confident in stronger invariant ('is exactly locked'),
it were inconvenient that previous code could return `werr == nil` after
Mutex.Unlock.

It could happen when ctx is canceled/timeouted exactly after waitDeletes
successfully returned werr == nil and before `<-ctx.Done()` checked.
While such situation is very rare, it is still possible.

fixes #10111
2018-10-05 14:17:32 +03:00
6976819792 Merge pull request #10148 from jingyih/add_unit_test_for_snapshot_file_integrity
clientv3: add test for snapshot status
2018-10-03 19:52:19 -07:00
87beb8336f clientv3: add test for snapshot status
Add unit test to check if we can correctly identify a corrupted snapshot
backup file.
2018-10-03 18:17:19 -07:00
2654de8a0e Merge pull request #10152 from jingyih/add_unfreed_to_goword_whitelist
words: whitelist unfreed
2018-10-03 18:16:43 -07:00
57c50b0d8c words: whitelist unfreed
whitelist keyword 'unfreed' for goword. It is from the bbolt error
message.
2018-10-03 18:12:10 -07:00
eca5f03cea Merge pull request #10149 from jingyih/fix_goword_checking_in_clientv3
clientv3: fix goword checking in config.go
2018-10-03 07:42:16 -07:00
7d57ee3427 clientv3: fix goword checking in config.go 2018-10-02 23:02:10 -07:00
bfdfaf5333 words: whitelist PermitWithoutStream
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-02 13:53:02 -07:00
1d1f509e98 Merge pull request #10146 from spzala/clientpermitwithoutstream
clientv3: let etcd client use all available keepalive ClientParams
2018-10-02 13:31:47 -07:00
1f5aea320a Merge pull request #10139 from DennisMao/patch-1
tools: fix building failures on Win
2018-10-02 13:30:49 -07:00
f6f375109e clientv3: let etcd client use all available keepalive ClientParams
We should allow etcd client use all of the available keepalive
client parameters as documented in this link,
https://godoc.org/google.golang.org/grpc/keepalive#ClientParameters
Currently in the etcd, by default PermitWithoutStream is set to
false, and user has no way to override it.
On the server side, we explicitely setting EnforcementPolicy
PermitWithoutStream to false and don't provide option to override it
to user but on the client side we should allow this option as
provided by the grpc.
2018-10-02 15:51:27 -04:00
b8969dea0b Merge pull request #10145 from ae6rt/issue/10142
benchmark: util.go
2018-10-02 10:45:15 -07:00
08e88c6693 Merge pull request #10063 from tschottdorf/fix-commit-pagination
raft: fix correctness bug in CommittedEntries pagination
2018-10-02 12:39:29 -04:00
95a282efb5 benchmark: util.go
allow client to setup TLS with cluster members, without the client having to offer TLS authentication itself

fixes #10142
2018-10-02 08:45:24 -07:00
051b119cd3 Merge pull request #10141 from essamhassan/9734_improve_auth_coverage
9734 improve auth coverage
2018-10-01 15:37:38 -07:00
ffbdb458a4 Auth: improve auth coverage
adds tests for uncovered auth funcs

Issue #9734
2018-10-01 10:25:38 +02:00
c74998267c *: change roadmap, future release dates
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-01 01:04:24 -07:00
f7dd524f6b CHANGELOG: add all missing changes from past months
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-01 01:02:50 -07:00
32a3a73755 tools: fix building failures on Windows
When building tools on Win,it shows `.\main.go:68:12: assignment mismatch: 2 variables but 1 values`.The reason is the return variables not match the calling from `main.go` and i try to fix it.
2018-09-29 11:17:49 +08:00
60fd69a06f Merge pull request #10138 from jingyih/update_changelog_3p3_from_pr9990
CHANGELOG-3.3: update from #9990
2018-09-28 18:08:15 -07:00
30fbcf5cc8 CHANGELOG-3.3: update from #9990 2018-09-28 17:36:08 -07:00
91fe8cbaa7 Merge pull request #10137 from jingyih/update_changelog_3p1_from_pr10109
CHANGELOG-3.1: update from #10109
2018-09-28 17:27:49 -07:00
31e2c08293 Merge pull request #10136 from jingyih/update_changelog_3p2_from_pr10109
CHANGELOG-3.2: update from #10109
2018-09-28 17:27:24 -07:00
4a3f7c6091 CHANGELOG-3.1: update from #10109 2018-09-28 17:24:41 -07:00
a249c19213 CHANGELOG-3.2: update from #10109 2018-09-28 17:19:48 -07:00
fa495d5605 Merge pull request #10134 from jingyih/update_changelog_3p4_from_pr10109
CHANGELOG-3.4: update from #10109
2018-09-28 17:16:58 -07:00
ab4703530e Merge pull request #10135 from jingyih/update_changelog_3p3_from_pr10109
CHANGELOG-3.3: update from #10109
2018-09-28 17:16:47 -07:00
fd816d071c CHANGELOG-3.3: update from #10109 2018-09-28 17:13:18 -07:00
11fc2cefc9 CHANGELOG-3.4: update from #10109 2018-09-28 16:51:29 -07:00
1a8c520979 ctlv3: fix gofmt
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-28 09:53:24 -07:00
a3e242d80a Merge pull request #10121 from dannysauer/etcdctl_message_fix
etcdctl: More helpful error handling in member add
2018-09-25 14:57:01 -07:00
36d227c9e5 etcdctl: Prettier error handling in member add
Maintain existing error message for not-enough-args
Add "too many args" if too many args
Add more helpful error message if v2 syntax was used

New output:
```
sauer@host:~/dev/etcd$ ./bin/etcdctl --endpoints http://localhost:5001 member add
Error: member name not provided.
sauer@host:~/dev/etcd$ ./bin/etcdctl --endpoints http://localhost:5001 member add node2 node2
Error: too many arguments
sauer@host:~/dev/etcd$ ./bin/etcdctl --endpoints http://localhost:5001 member add node2 http://localhost:6002
Error: too many arguments, did you mean "--peer-urls http://localhost:6002"
sauer@host:~/dev/etcd$ ./bin/etcdctl --endpoints http://localhost:5001 member add http://localhost:6002 node2
Error: too many arguments, did you mean "--peer-urls http://localhost:6002"
```
2018-09-25 16:50:57 -05:00
2cf4736621 Merge pull request #10117 from jingyih/improve_doc_on_etcdctl_lock
etcdctl: improve doc on etcdctl lock command
2018-09-24 15:22:13 -07:00
78d02f4229 etcdctl: improve doc on etcdctl lock command 2018-09-24 14:25:02 -07:00
fb674833c2 Merge pull request #10094 from nolouch/drop-read
server: drop read request if the leader is changed
2018-09-23 19:32:44 -07:00
12cfc5fce6 Merge pull request #10115 from hexfusion/fx_dvdoc
Documentation/dev-guide: add gRPC-gateway example of txn target VERSION syntax.
2018-09-22 23:34:31 -07:00
e935594d34 Documentation/dev-guide: add example of txn target VERSION syntax for gRPC-gateway. 2018-09-22 21:30:35 -04:00
d987caeb0d Merge pull request #10109 from jingyih/add_db_integrity_verification_to_snapshot_status
clientv3: add db integrity check in snapshot status
2018-09-21 10:42:24 -07:00
7d21cb24e9 bill-of-materials.json: regenerate
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-21 02:13:56 -07:00
d5967b40db scripts/updatedep: fix shellcheck errors
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-21 02:08:25 -07:00
67b0b0cd03 vendor: upgrade "go.uber.org/zap" to v1.9.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-21 01:51:59 -07:00
bea4f1337a go.mod: upgrade "go.uber.org/zap" to v1.9.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-21 01:51:53 -07:00
cd789fd8f6 vendor: revendor
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-21 01:48:10 -07:00
527ff8216d go.mod,sum: initial commit with Go 1.11 module
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-21 01:47:48 -07:00
ad2d18aeff scripts/updatedep: use Go 1.11 module for dependency management
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-21 01:47:24 -07:00
09d1426ac0 Gopkg: remove to use go module
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-21 01:47:04 -07:00
422f867f6b clientv3: add integrity check in snapshot status
Add snapshot file integrity check in snapshot status. If the file is
corrupted, return with error message.
2018-09-20 12:59:27 -07:00
1a3be73387 Merge pull request #10110 from zhongwencool/zhongwencool-patch-1
Documentation: add erlang client for etcd API v3
2018-09-20 12:41:33 -04:00
6a07f831d4 Documentation: add erlang client for etcd API v3 2018-09-20 23:49:48 +08:00
f32bc50765 Merge pull request #10105 from johncming/master
bugfix: use Rlock instead of Lock in raftexample.
2018-09-19 15:50:41 -07:00
b9b75f81e5 Merge pull request #10106 from petermattis/pmattis/ready-must-sync
raft: fix Ready.MustSync logic
2018-09-19 15:39:44 -07:00
66ee394527 raft: fix Ready.MustSync logic
The previous logic was erroneously setting Ready.MustSync to true when
the hard state had not changed because we were comparing an empty hard
state to the previous hard state. In combination with another misfeature
in CockroachDB (unnecessary writing of empty batches), this was causing
a steady stream of synchronous writes to disk.
2018-09-19 16:33:16 -04:00
dd6e579b84 raftexample: use Rlock instead of Lock in getsnapshot 2018-09-19 13:25:47 +08:00
ab544f2dde Merge pull request #10092 from ivuk/Documentation-fix-typo
Documentation/v2/metrics.md: Fix a typo
2018-09-18 10:32:42 -07:00
6ea54195a6 client/integration: try to fix tests 2018-09-18 01:44:57 +08:00
c15fb607f6 server: broadcast leader changed 2018-09-17 14:15:04 +08:00
c2845a3a9d Merge pull request #10097 from atlaskerr/documentation-annotate-logger
Documentation: Annotate --logger flag
2018-09-15 17:28:45 -07:00
952a4365ce Documentation: Annotate --logger flag
This commit annotates the `--logger` flag to let users know that it is
not available in versions 3.3.x or later.
2018-09-15 18:51:53 -05:00
fd5ef74b80 clientv3/integration: try to fix tests 2018-09-14 17:57:56 +08:00
f3f6427586 server: prevent blocking 2018-09-14 16:08:29 +08:00
4de27039cb server: drop read request if found leader changed 2018-09-14 15:58:35 +08:00
6ef7c5f462 Documentation/v2/metrics.md: Fix a typo
Typo fix: particlarly -> particularly
2018-09-13 21:05:32 +02:00
001bbb97cc Merge pull request #9999 from vimalk78/9377_document_election_api
clientv3/concurrency: Document Election API context parameter
2018-09-13 10:47:23 -07:00
90ecbabb7d Merge pull request #10086 from vimalk78/fix-Documentation-etcd-io
Documentation/*: change github url, import path from coreos to etcd-io
2018-09-13 10:15:53 -07:00
35fa6e13b9 clientv3/concurrency: Document Election API context parameter
fixes #9377
2018-09-13 12:15:01 +05:30
a7b1306ecf Merge pull request #10088 from ivuk/Documentation-fix-typo
Documentation/metrics.md: Fix a typo
2018-09-12 14:29:32 -07:00
09b7590c0b Documentation/metrics.md: Fix a typo
Typo fix: particlarly -> particularly
2018-09-12 23:19:21 +02:00
bda28c3ce2 Documentation/*: change github url, import path from coreos to etcd-io 2018-09-12 22:05:08 +05:30
ec5ff10436 Merge pull request #10019 from thrawn01/grpc-proxy-watch-errors
Improve watch error reporting when using grpc proxy
2018-09-10 12:54:33 -07:00
a18b467111 Merge pull request #10075 from gyuho/dash-etcd-io
Documentation/op-guide: remove "dash.etcd.io"
2018-09-06 18:11:47 -07:00
2fc06c8ec9 Documentation/op-guide: remove "dash.etcd.io"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-06 17:58:54 -07:00
2589353c1f Merge pull request #10074 from gyuho/raft-link
raft: fix link typo
2018-09-06 09:25:45 -07:00
c2b3c54370 raft: fix link typo
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-06 09:20:22 -07:00
1c80ce4b29 Merge pull request #10070 from philips/add-more-projects
Please read https://github.com/etcd-io/etcd/blob/master/CONTRIBUTING.md#contribution-flow.
2018-09-05 20:02:49 -07:00
085a5bd2f8 Documentation: add more projects 2018-09-05 19:55:31 -07:00
cf3f79a04d Merge pull request #10069 from philips/add-more-projects
Documentation: add new projects
2018-09-05 19:33:50 -07:00
16ad8b48bd Documentation: add new projects
A number of projects using etcd were missing. Add.
2018-09-05 19:30:08 -07:00
8b66718529 Merge pull request #10067 from hexfusion/fx_ddb_doc
tools/etcd-dump-db: fix iterate-bucket example.
2018-09-05 22:07:03 -04:00
d43f223e36 tools/etcd-dump-db: fix iterate-bucket example. 2018-09-05 20:41:42 -04:00
1df1ddff43 Merge pull request #10066 from mrIncompetent/use-testing-interface
Use testing.TB for integration test helpers to enable usage in benchmarks
2018-09-04 14:04:06 -07:00
2be5994f61 integration: Replace *testing.T with testing.TB
Use testing.TB for integration test helpers to enable usage in benchmarks
2018-09-04 22:34:40 +02:00
7a8ab37bfd raft: fix correctness bug in CommittedEntries pagination
In #9982, a mechanism to limit the size of `CommittedEntries` was
introduced. The way this mechanism worked was that it would load
applicable entries (passing the max size hint) and would emit a
`HardState` whose commit index was truncated to match the limitation
applied to the entries. Unfortunately, this was subtly incorrect
when the user-provided `Entries` implementation didn't exactly
match what Raft uses internally. Depending on whether a `Node` or
a `RawNode` was used, this would either lead to regressing the
HardState's commit index or outright forgetting to apply entries,
respectively.

Asking implementers to precisely match the Raft size limitation
semantics was considered but looks like a bad idea as it puts
correctness squarely in the hands of downstream users. Instead, this
PR removes the truncation of `HardState` when limiting is active
and tracks the applied index separately. This removes the old
paradigm (that the previous code tried to work around) that the
client will always apply all the way to the commit index, which
isn't true when commit entries are paginated.

See [1] for more on the discovery of this bug (CockroachDB's
implementation of `Entries` returns one more entry than Raft's when the
size limit hits).

[1]: https://github.com/cockroachdb/cockroach/issues/28918#issuecomment-418174448
2018-09-04 14:52:23 +02:00
6143c135bd Merge pull request #10060 from vimalk78/go.etcd.io-move-minor-changes
*: path changes for moving to github/etcd-io/etcd
2018-09-03 13:48:01 -07:00
bcde798fdd *: path changes for moving to github/etcd-io/etcd 2018-09-03 21:57:23 +05:30
ca7dc4ff26 Merge pull request #10058 from vimalk78/9882-contrib-queue-memleak
contrib/recipes/watch.go : cancel() the watch after desired watch event
2018-09-01 08:34:29 -07:00
9bad6fd442 contrib/recipes/watch.go : cancel() the watch after desired watch event
fixes #9882
2018-09-01 18:18:24 +05:30
206b012ed4 Merge pull request #10057 from jglick/patch-1
Use IPA notation for pronunciation of etcd
2018-08-31 07:10:53 -07:00
abd1b34ba6 docs: fix pronunciation notation 2018-08-31 07:09:46 -07:00
484f622034 Merge pull request #10054 from vcaesar/docs-pr
client,clientv3: update client docs to "go.etcd.io"
2018-08-30 16:44:16 -07:00
fc7ef659cc client,clientv3: update client docs to "go.etcd.io" 2018-08-30 19:26:12 -04:00
b6e6de54e9 Merge pull request #10053 from ueokande/fix-clientv3-doc
clientv3/doc: Fix code example
2018-08-30 15:28:33 -07:00
e9afd51f47 clientv3/doc: Fix code example 2018-08-30 22:10:58 +00:00
d627719881 Merge pull request #10050 from gyuho/bbolt
vendor: use "go.etcd.io/bbolt" v1.3.1-etcd.7
2018-08-30 10:20:53 -07:00
34fcabab55 CHANGELOG: add all recent changes
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:53:41 -07:00
572d486c5b *: update github links
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:28:36 -07:00
e235cd3302 Documentation: update github links
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:28:00 -07:00
474cc300c9 bill-of-materials: regenerate with "go.etcd.io/bbolt"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 13:33:05 -07:00
fa57f7fbc7 vendor: use "go.etcd.io/bbolt" v1.3.1-etcd.7
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 12:39:59 -07:00
790cc3cdd6 Gopkg.lock: use "go.etcd.io/bbolt" v1.3.1-etcd.7
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 12:34:22 -07:00
8db439d693 *: use "go.etcd.io/bbolt"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 12:31:28 -07:00
5adbc231f2 mvcc/backend: use "go.etcd.io/bbolt"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 12:31:04 -07:00
1a282a72be Merge pull request #10045 from gyuho/go-1.11
*: use Go 1.11 for tests
2018-08-28 23:02:43 -07:00
839f202195 clientv3/integration: fix race condition from closing channel
Go 1.11 now marks len(channel) over being-closed channel
as racey operation, fix tests by receiving from channel first
and then check the length of channel.

```
WARNING: DATA RACE
Write at 0x00c000e872c0 by goroutine 198:
  runtime.closechan()
      /usr/local/go/src/runtime/chan.go:327 +0x0
  go.etcd.io/etcd/clientv3.(*lessor).closeRequireLeader()
      /Users/leegyuho/go/src/go.etcd.io/etcd/clientv3/lease.go:379 +0x748
  go.etcd.io/etcd/clientv3.(*lessor).recvKeepAliveLoop()
      /Users/leegyuho/go/src/go.etcd.io/etcd/clientv3/lease.go:455 +0x3a5

Previous read at 0x00c000e872c0 by goroutine 27:
  go.etcd.io/etcd/clientv3/integration.TestLeaseWithRequireLeader()
      /Users/leegyuho/go/src/go.etcd.io/etcd/clientv3/integration/lease_test.go:828 +0x810
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
```

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 22:27:11 -07:00
07fcc26799 *: fix gofmt warnings with Go 1.11
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 21:45:24 -07:00
8560221091 etcdserver: fix gofmt warnings with Go 1.11
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 21:45:12 -07:00
d24acedb5e travis.yml: bump up to Go 1.11
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 21:29:01 -07:00
a0cc409352 Makefile: bump up to Go 1.11
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 21:28:41 -07:00
e52b8611ea docs: update links to "go.etcd.io"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 20:54:35 -07:00
02b94fcc0d test: fix "license-bill-of-materials" command
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 20:53:28 -07:00
e8b940f268 Merge pull request #10044 from gyuho/rename-import-paths
*: rename import paths to "go.etcd.io/etcd"
2018-08-28 19:44:27 -07:00
8f3900ffe1 bill-of-materials.json: regenerate
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 18:43:18 -07:00
c0f6597c02 contrib/systemd: remove v2 service files
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 18:35:01 -07:00
0ef9ef3c74 *: rerun "gofmt"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 18:25:39 -07:00
379a1869c5 tests/Dockerfile: update import path to "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 18:07:41 -07:00
2774bb3212 Revert "*: regenerate proto"
This reverts commit 299178ceb9.
2018-08-28 17:56:14 -07:00
f2f7fc23f7 *: update github.com links
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:56 -07:00
299178ceb9 *: regenerate proto
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
f3af3d83c2 Makefile: update import paths to "go.etcd.io"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
198a931a0b MAINTAINERS: update import path to "go.etcd.io"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
a8a3efd27a scripts: update import paths in "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
c69c044026 Documentation: update import paths in "dl_build.md"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
7300bfdd1d build.ps1: update import paths to "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
d37f1521b7 *: update import paths to "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
2ac04381a2 clientv3: update Go import paths to "go.etcd.io"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
fced933294 auth: update Go import paths to "go.etcd.io"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
038fd844ac wal: update Go import paths to "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
d537b328cb mvcc: update import paths "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
1399bc69ce etcdserver: update import paths to "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
bb60f8ab1d raft: change import paths to "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:52 -07:00
47c04b959d build: upgrade local GOPATH
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:13:56 -07:00
591fdac832 travis.yml: update "go_import_path" to "go.etcd.io"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 16:59:45 -07:00
6c3a12f4c9 README: fix travis link
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-27 20:03:17 -07:00
548f715271 README: update repo URLs
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-27 19:19:36 -07:00
af85949b41 Merge pull request #10024 from gyuho/became-inactive
etcdserver/api/rafthttp: clarify "became inactive" warning
2018-08-24 22:12:16 -04:00
24ee22ab48 Merge pull request #10026 from gyuho/read-index
etcdserver: clarify read index wait timeout warnings
2018-08-24 22:11:58 -04:00
66c2665f66 Merge pull request #10037 from spzala/etcdctlrpc
etcdctl: fix small inconsistency with RPC ref
2018-08-24 18:51:55 -07:00
9ecc727dc8 Merge pull request #10036 from louis-paul/clientv3-error-handling-godoc
clientv3/doc: Update error handling godoc
2018-08-23 21:03:16 -04:00
a3c09ee8e9 clientv3/doc: Improve error handling comment 2018-08-23 19:47:27 +02:00
b4a670b571 etcdctl: fix small inconsistency with RPC
Several RPC references in README with memberlist only hyperlinked,
and since it's linked with line numbers it's not pointing to a
desired ref now. Remove the link to make it consistent and correct.
2018-08-23 10:48:32 -04:00
2ee4525cd2 clientv3/doc: Update error handling godoc
0c5bcd5d80 updated error handling for
`ErrEmptyKey` but missed the godoc.
2018-08-23 13:37:39 +02:00
a9fcc82e4d Merge pull request #10033 from zhenghc/master
Documentation: add Transwarp to production user
2018-08-23 07:17:06 -04:00
98d57b278a Documentation: add Transwarp to production user 2018-08-23 10:19:02 +08:00
bdc333359b Merge pull request #10031 from zhaohaidao/master
raft: fix typo in test
2018-08-22 12:39:14 -04:00
6ee880eb5b raft: fix typo in test 2018-08-22 23:48:47 +08:00
2921ab670f Merge pull request #10028 from spzala/geterror
etcdctl: fix get command error when no arg provided
2018-08-20 12:56:41 -07:00
7f001c1f00 etcdctl: fix get command error when no arg provided
The current error is incorrect. Providing a similar one to the del
command.
2018-08-19 19:54:18 -04:00
0a25f49d61 Merge pull request #10021 from jingyih/add_log_level_checking_in_logging_interceptor
etcdserver/api/v3rpc/interceptor: add log level checking
2018-08-17 18:08:01 -07:00
38711761a1 etcdserver: clarify read index wait timeout warnings
"read index" doesn't tell much about the root cause.
Most likely, the local follower node is having slow
network, thus timing out waiting to receive read
index response from leader.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-17 17:59:41 -07:00
156ff6461d etcdserver/api/rafthttp: clarify "became inactive" warning
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-17 17:45:53 -07:00
8d85259b56 etcdserver/api/v3rpc/interceptor: add log level checking
Check log level before generating and writing log info.
2018-08-17 16:12:05 -07:00
eb9c8d3c2f clientv3: return reason to user when server cancels watch
This change allows Watch users to retrieve the cancel reason when a
watch is canceled by the server. Additionally WatchResponses with
closeErr set now have Canceled set true which is in line with the
documentation for the Canceled field.
2018-08-16 23:11:02 -05:00
f0e6c10aba grpcproxy: return error to client during watch create
Now returns errors from checkPermissionForWatch() via the CancelReason
field. This allows clients to understand why the watch was canceled.
Additionally, this change protects a watch from starting and that
otherwise might hang indefinitely.
2018-08-16 23:10:02 -05:00
9431532730 raftexample: Fix publish snapshot error message
The error message is different to the condition it checks for. This PR
modifies the error message accordingly.

Closes #9717.

Signed-off-by: Kostas Christidis <kostas@christidis.io>
2018-07-26 11:14:15 -04:00
1812 changed files with 173059 additions and 203005 deletions

View File

@ -1,2 +1,2 @@
Please read https://github.com/coreos/etcd/blob/master/Documentation/reporting_bugs.md.
Please read https://github.com/etcd-io/etcd/blob/master/Documentation/reporting_bugs.md.

View File

@ -1,2 +1,2 @@
Please read https://github.com/coreos/etcd/blob/master/CONTRIBUTING.md#contribution-flow.
Please read https://github.com/etcd-io/etcd/blob/master/CONTRIBUTING.md#contribution-flow.

2
.github/SECURITY.md vendored Normal file
View File

@ -0,0 +1,2 @@
Please read https://github.com/etcd-io/etcd/blob/master/security/README.md.

2
.gitignore vendored
View File

@ -13,6 +13,8 @@
*.test
hack/tls-setup/certs
.idea
/contrib/raftexample/raftexample
/contrib/raftexample/raftexample-*
# TODO: use dep prune
# https://github.com/golang/dep/issues/120#issuecomment-306518546

View File

@ -1,13 +1,12 @@
language: go
go_import_path: github.com/coreos/etcd
go_import_path: go.etcd.io/etcd
sudo: required
services: docker
go:
- 1.10.3
- tip
- 1.12.9
notifications:
on_success: never
@ -23,50 +22,21 @@ env:
- TARGET=linux-amd64-unit
- TARGET=all-build
- TARGET=linux-amd64-grpcproxy
- TARGET=linux-amd64-coverage
- TARGET=linux-amd64-fmt-unit-go-tip
- TARGET=linux-386-unit
matrix:
fast_finish: true
allow_failures:
- go: 1.10.3
- go: 1.12.9
env: TARGET=linux-amd64-grpcproxy
- go: 1.10.3
env: TARGET=linux-amd64-coverage
- go: tip
env: TARGET=linux-amd64-fmt-unit-go-tip
- go: 1.10.3
env: TARGET=linux-386-unit
exclude:
- go: tip
env: TARGET=linux-amd64-fmt
- go: tip
env: TARGET=linux-amd64-integration-1-cpu
- go: tip
env: TARGET=linux-amd64-integration-2-cpu
- go: tip
env: TARGET=linux-amd64-integration-4-cpu
- go: tip
env: TARGET=linux-amd64-functional
- go: tip
env: TARGET=linux-amd64-unit
- go: tip
env: TARGET=all-build
- go: tip
env: TARGET=linux-amd64-grpcproxy
- go: tip
env: TARGET=linux-amd64-coverage
- go: 1.10.3
env: TARGET=linux-amd64-fmt-unit-go-tip
- go: tip
- go: 1.12.9
env: TARGET=linux-386-unit
before_install:
- if [[ $TRAVIS_GO_VERSION == 1.* ]]; then docker pull gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION}; fi
install:
- go get -t -d ./...
- go get -t -v -d ./...
script:
- echo "TRAVIS_GO_VERSION=${TRAVIS_GO_VERSION}"
@ -74,37 +44,37 @@ script:
case "${TARGET}" in
linux-amd64-fmt)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
--volume=`pwd`:/go/src/go.etcd.io/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 PASSES='fmt bom dep' ./test"
;;
linux-amd64-integration-1-cpu)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
--volume=`pwd`:/go/src/go.etcd.io/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 CPU=1 PASSES='integration' ./test"
;;
linux-amd64-integration-2-cpu)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
--volume=`pwd`:/go/src/go.etcd.io/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 CPU=2 PASSES='integration' ./test"
;;
linux-amd64-integration-4-cpu)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
--volume=`pwd`:/go/src/go.etcd.io/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 CPU=4 PASSES='integration' ./test"
;;
linux-amd64-functional)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
--volume=`pwd`:/go/src/go.etcd.io/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "./build && GOARCH=amd64 PASSES='functional' ./test"
;;
linux-amd64-unit)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
--volume=`pwd`:/go/src/go.etcd.io/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 PASSES='unit' ./test"
;;
all-build)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
--volume=`pwd`:/go/src/go.etcd.io/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 PASSES='build' ./test \
&& GOARCH=386 PASSES='build' ./test \
&& GO_BUILD_FLAGS='-v' GOOS=darwin GOARCH=amd64 ./build \
@ -116,15 +86,9 @@ script:
linux-amd64-grpcproxy)
sudo HOST_TMP_DIR=/tmp TEST_OPTS="PASSES='build grpcproxy'" make docker-test
;;
linux-amd64-coverage)
sudo HOST_TMP_DIR=/tmp make docker-test-coverage
;;
linux-amd64-fmt-unit-go-tip)
GOARCH=amd64 PASSES='fmt unit' ./test
;;
linux-386-unit)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
--volume=`pwd`:/go/src/go.etcd.io/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=386 PASSES='unit' ./test"
;;
esac

9
.words
View File

@ -8,7 +8,8 @@ MiB
ResourceExhausted
RPC
RPCs
parsedTarget
SRV
WithRequireLeader
InfoLevel
args
@ -94,12 +95,16 @@ jitter
WithBackoff
BackoffLinearWithJitter
jitter
WithDialer
WithMax
ServerStreams
BidiStreams
transientFailure
BackoffFunc
CallOptions
PermitWithoutStream
__lostleader
ErrConnClosing
unfreed
grpcAddr
clientURLs

View File

@ -1,10 +0,0 @@
## [v2.3.8](https://github.com/coreos/etcd/releases/tag/v2.3.8) (2017-02-17)
See [code changes](https://github.com/coreos/etcd/compare/v2.3.7...v2.3.8).
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).

View File

@ -1,203 +0,0 @@
## [v3.0.16](https://github.com/coreos/etcd/releases/tag/v3.0.16) (2016-11-13)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.15...v3.0.16) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Go
- Compile with [*Go 1.6.4*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.15](https://github.com/coreos/etcd/releases/tag/v3.0.15) (2016-11-11)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.14...v3.0.15) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Fixed
- Fix cancel watch request with wrong range end.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.14](https://github.com/coreos/etcd/releases/tag/v3.0.14) (2016-11-04)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.13...v3.0.14) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Added
- v3 `etcdctl migrate` command now supports `--no-ttl` flag to discard keys on transform.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.13](https://github.com/coreos/etcd/releases/tag/v3.0.13) (2016-10-24)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.12...v3.0.13) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.12](https://github.com/coreos/etcd/releases/tag/v3.0.12) (2016-10-07)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.11...v3.0.12) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.11](https://github.com/coreos/etcd/releases/tag/v3.0.11) (2016-10-07)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.10...v3.0.11) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Added
- Server returns previous key-value (optional)
- `clientv3.WithPrevKV` option
- v3 etcdctl `put,watch,del --prev-kv` flag
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.10](https://github.com/coreos/etcd/releases/tag/v3.0.10) (2016-09-23)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.9...v3.0.10) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.9](https://github.com/coreos/etcd/releases/tag/v3.0.9) (2016-09-15)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.8...v3.0.9) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Added
- Warn on domain names on listen URLs (v3.2 will reject domain names).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.8](https://github.com/coreos/etcd/releases/tag/v3.0.8) (2016-09-09)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.7...v3.0.8) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Other
- Allow only IP addresses in listen URLs (domain names are rejected).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.7](https://github.com/coreos/etcd/releases/tag/v3.0.7) (2016-08-31)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.6...v3.0.7) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Other
- SRV records only allow A records (RFC 2052).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.6](https://github.com/coreos/etcd/releases/tag/v3.0.6) (2016-08-19)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.5...v3.0.6) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.5](https://github.com/coreos/etcd/releases/tag/v3.0.5) (2016-08-19)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.4...v3.0.5) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Other
- SRV records (e.g., infra1.example.com) must match the discovery domain (i.e., example.com) if no custom certificate authority is given.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.4](https://github.com/coreos/etcd/releases/tag/v3.0.4) (2016-07-27)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.3...v3.0.4) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Added
- v2 `etcdctl ls` command now supports `--output=json`.
- Add /var/lib/etcd directory to etcd official Docker image.
### Other
- v2 auth can now use common name from TLS certificate when `--client-cert-auth` is enabled.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.3](https://github.com/coreos/etcd/releases/tag/v3.0.3) (2016-07-15)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.2...v3.0.3) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Other
- Revert Dockerfile to use `CMD`, instead of `ENTRYPOINT`, to support `etcdctl` run.
- Docker commands for v3.0.2 won't work without specifying executable binary paths.
- v3 etcdctl default endpoints are now `127.0.0.1:2379`.
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.2](https://github.com/coreos/etcd/releases/tag/v3.0.2) (2016-07-08)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.1...v3.0.2) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Other
- Dockerfile uses `ENTRYPOINT`, instead of `CMD`, to run etcd without binary path specified.
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.1](https://github.com/coreos/etcd/releases/tag/v3.0.1) (2016-07-01)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.0...v3.0.1) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
## [v3.0.0](https://github.com/coreos/etcd/releases/tag/v3.0.0) (2016-06-30)
See [code changes](https://github.com/coreos/etcd/compare/v2.3.0...v3.0.0) and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_0.md).**
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).

View File

@ -1,405 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.0](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.0.md).
## [v3.1.19](https://github.com/coreos/etcd/releases/tag/v3.1.19) (2018-07-24)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.18...v3.1.19) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Improved
- Improve [Raft Read Index timeout warning messages](https://github.com/coreos/etcd/pull/9897).
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-1) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_go_version`](https://github.com/coreos/etcd/pull/9957) Prometheus metric.
- Add [`etcd_server_slow_read_indexes_total`](https://github.com/coreos/etcd/pull/9897) Prometheus metric.
- Add [`etcd_server_quota_backend_bytes`](https://github.com/coreos/etcd/pull/9820) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
- Add [`etcd_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819) Prometheus metric.
- In addition to [`etcd_debugging_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819).
- Add [`etcd_mvcc_db_total_size_in_use_in_bytes`](https://github.com/coreos/etcd/pull/9256) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
### client v3
- Fix [lease keepalive interval updates when response queue is full](https://github.com/coreos/etcd/pull/9952).
- If `<-chan *clientv3LeaseKeepAliveResponse` from `clientv3.Lease.KeepAlive` was never consumed or channel is full, client was [sending keepalive request every 500ms](https://github.com/coreos/etcd/issues/9911) instead of expected rate of every "TTL / 3" duration.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.18](https://github.com/coreos/etcd/releases/tag/v3.1.18) (2018-06-15)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.17...v3.1.18) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-1) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_version`](https://github.com/coreos/etcd/pull/8960) Prometheus metric.
- To replace [Kubernetes `etcd-version-monitor`](https://github.com/coreos/etcd/issues/8948).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.17](https://github.com/coreos/etcd/releases/tag/v3.1.17) (2018-06-06)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.16...v3.1.17) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd server
- Fix [v3 snapshot recovery](https://github.com/coreos/etcd/issues/7628).
- A follower receives a leader snapshot to be persisted as a `[SNAPSHOT-INDEX].snap.db` file on disk.
- Now, server [ensures that the incoming snapshot be persisted on disk before loading it](https://github.com/coreos/etcd/pull/7876).
- Otherwise, index mismatch happens and triggers server-side panic (e.g. newer WAL entry with outdated snapshot index).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.16](https://github.com/coreos/etcd/releases/tag/v3.1.16) (2018-05-31)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.15...v3.1.16) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd server
- Fix [`mvcc` server panic from restore operation](https://github.com/coreos/etcd/pull/9775).
- Let's assume that a watcher had been requested with a future revision X and sent to node A that became network-partitioned thereafter. Meanwhile, cluster makes progress. Then when the partition gets removed, the leader sends a snapshot to node A. Previously if the snapshot's latest revision is still lower than the watch revision X, **etcd server panicked** during snapshot restore operation.
- Now, this server-side panic has been fixed.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.15](https://github.com/coreos/etcd/releases/tag/v3.1.15) (2018-05-09)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.14...v3.1.15) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd server
- Purge old [`*.snap.db` snapshot files](https://github.com/coreos/etcd/pull/7967).
- Previously, etcd did not respect `--max-snapshots` flag to purge old `*.snap.db` files.
- Now, etcd purges old `*.snap.db` files to keep maximum `--max-snapshots` number of files on disk.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.14](https://github.com/coreos/etcd/releases/tag/v3.1.14) (2018-04-24)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.13...v3.1.14) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-1) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_is_leader`](https://github.com/coreos/etcd/pull/9587) Prometheus metric.
### etcd server
- Add [`--initial-election-tick-advance`](https://github.com/coreos/etcd/pull/9591) flag to configure initial election tick fast-forward.
- By default, `--initial-election-tick-advance=true`, then local member fast-forwards election ticks to speed up "initial" leader election trigger.
- This benefits the case of larger election ticks. For instance, cross datacenter deployment may require longer election timeout of 10-second. If true, local node does not need wait up to 10-second. Instead, forwards its election ticks to 8-second, and have only 2-second left before leader election.
- Major assumptions are that: cluster has no active leader thus advancing ticks enables faster leader election. Or cluster already has an established leader, and rejoining follower is likely to receive heartbeats from the leader after tick advance and before election timeout.
- However, when network from leader to rejoining follower is congested, and the follower does not receive leader heartbeat within left election ticks, disruptive election has to happen thus affecting cluster availabilities.
- Now, this can be disabled by setting `--initial-election-tick-advance=false`.
- Disabling this would slow down initial bootstrap process for cross datacenter deployments. Make tradeoffs by configuring `--initial-election-tick-advance` at the cost of slow initial bootstrap.
- If single-node, it advances ticks regardless.
- Address [disruptive rejoining follower node](https://github.com/coreos/etcd/issues/9333).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.13](https://github.com/coreos/etcd/releases/tag/v3.1.13) (2018-03-29)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.12...v3.1.13) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Improved
- Adjust [election timeout on server restart](https://github.com/coreos/etcd/pull/9415) to reduce [disruptive rejoining servers](https://github.com/coreos/etcd/issues/9333).
- Previously, etcd fast-forwards election ticks on server start, with only one tick left for leader election. This is to speed up start phase, without having to wait until all election ticks elapse. Advancing election ticks is useful for cross datacenter deployments with larger election timeouts. However, it was affecting cluster availability if the last tick elapses before leader contacts the restarted node.
- Now, when etcd restarts, it adjusts election ticks with more than one tick left, thus more time for leader to prevent disruptive restart.
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-1) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add missing [`etcd_network_peer_sent_failures_total` count](https://github.com/coreos/etcd/pull/9437).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.12](https://github.com/coreos/etcd/releases/tag/v3.1.12) (2018-03-08)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.11...v3.1.12) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd server
- Fix [`mvcc` "unsynced" watcher restore operation](https://github.com/coreos/etcd/pull/9297).
- "unsynced" watcher is watcher that needs to be in sync with events that have happened.
- That is, "unsynced" watcher is the slow watcher that was requested on old revision.
- "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
- Which possibly causes [missing events from "unsynced" watchers](https://github.com/coreos/etcd/issues/9086).
- A node gets network partitioned with a watcher on a future revision, and falls behind receiving a leader snapshot after partition gets removed. When applying this snapshot, etcd watch storage moves current synced watchers to unsynced since sync watchers might have become stale during network partition. And reset synced watcher group to restart watcher routines. Previously, there was a bug when moving from synced watcher group to unsynced, thus client would miss events when the watcher was requested to the network-partitioned node.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.11](https://github.com/coreos/etcd/releases/tag/v3.1.11) (2017-11-28)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.10...v3.1.11) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd server
- [#8411](https://github.com/coreos/etcd/issues/8411),[#8806](https://github.com/coreos/etcd/pull/8806) backport "mvcc: sending events after restore"
- [#8009](https://github.com/coreos/etcd/issues/8009),[#8902](https://github.com/coreos/etcd/pull/8902) backport coreos/bbolt v1.3.1-coreos.5
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.1.10](https://github.com/coreos/etcd/releases/tag/v3.1.10) (2017-07-14)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.9...v3.1.10) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Added
- Tag docker images with minor versions.
- e.g. `docker pull quay.io/coreos/etcd:v3.1` to fetch latest v3.1 versions.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
- Fix panic on `net/http.CloseNotify`
## [v3.1.9](https://github.com/coreos/etcd/releases/tag/v3.1.9) (2017-06-09)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.8...v3.1.9) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd server
- Allow v2 snapshot over 512MB.
### Go
- Compile with [*Go 1.7.6*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.8](https://github.com/coreos/etcd/releases/tag/v3.1.8) (2017-05-19)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.7...v3.1.8) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.7](https://github.com/coreos/etcd/releases/tag/v3.1.7) (2017-04-28)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.6...v3.1.7) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.6](https://github.com/coreos/etcd/releases/tag/v3.1.6) (2017-04-19)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.5...v3.1.6) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd server
- Fill in Auth API response header.
- Remove auth check in Status API.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.5](https://github.com/coreos/etcd/releases/tag/v3.1.5) (2017-03-27)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.4...v3.1.5) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd server
- Fix raft memory leak issue.
- Fix Windows file path issues.
### Other
- Add `/etc/nsswitch.conf` file to alpine-based Docker image.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.4](https://github.com/coreos/etcd/releases/tag/v3.1.4) (2017-03-22)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.3...v3.1.4) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.3](https://github.com/coreos/etcd/releases/tag/v3.1.3) (2017-03-10)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.2...v3.1.3) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd gateway
- Fix `etcd gateway` schema handling in DNS discovery.
- Fix sd_notify behaviors in `gateway`, `grpc-proxy`.
### gRPC Proxy
- Fix sd_notify behaviors in `gateway`, `grpc-proxy`.
### Other
- Use machine default host when advertise URLs are default values(`localhost:2379,2380`) AND if listen URL is `0.0.0.0`.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.2](https://github.com/coreos/etcd/releases/tag/v3.1.2) (2017-02-24)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.1...v3.1.2) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### etcd gateway
- Fix `etcd gateway` with multiple endpoints.
### Other
- Use IPv4 default host, by default (when IPv4 and IPv6 are available).
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.1](https://github.com/coreos/etcd/releases/tag/v3.1.1) (2017-02-17)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.0...v3.1.1) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
## [v3.1.0](https://github.com/coreos/etcd/releases/tag/v3.1.0) (2017-01-20)
See [code changes](https://github.com/coreos/etcd/compare/v3.0.0...v3.1.0) and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_1.md).**
### Improved
- Faster linearizable reads (implements Raft [read-index](https://github.com/coreos/etcd/pull/6212)).
- v3 authentication API is now stable.
### Breaking Changes
- Deprecated following gRPC metrics in favor of [go-grpc-prometheus](https://github.com/grpc-ecosystem/go-grpc-prometheus).
- `etcd_grpc_requests_total`
- `etcd_grpc_requests_failed_total`
- `etcd_grpc_active_streams`
- `etcd_grpc_unary_requests_duration_seconds`
### Dependency
- Upgrade [`github.com/ugorji/go/codec`](https://github.com/ugorji/go) to [**`ugorji/go@9c7f9b7`**](https://github.com/ugorji/go/commit/9c7f9b7a2bc3a520f7c7b30b34b7f85f47fe27b6), and [regenerate v2 `client`](https://github.com/coreos/etcd/pull/6945).
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- SRV records (e.g., infra1.example.com) must match the discovery domain (i.e., example.com) if no custom certificate authority is given.
- `TLSConfig.ServerName` is ignored with user-provided certificates for backwards compatibility; to be deprecated.
- For example, `etcd --discovery-srv=example.com` will only authenticate peers/clients when the provided certs have root domain `example.com` as an entry in Subject Alternative Name (SAN) field.
### etcd server
- Automatic leadership transfer when leader steps down.
- etcd flags
- `--strict-reconfig-check` flag is set by default.
- Add `--log-output` flag.
- Add `--metrics` flag.
- etcd uses default route IP if advertise URL is not given.
- Cluster rejects removing members if quorum will be lost.
- Discovery now has upper limit for waiting on retries.
- Warn on binding listeners through domain names; to be deprecated.
- v3.0 and v3.1 with `--auto-compaction-retention=10` run periodic compaction on v3 key-value store for every 10-hour.
- Compactor only supports periodic compaction.
- Compactor records latest revisions every 5-minute, until it reaches the first compaction period (e.g. 10-hour).
- In order to retain key-value history of last compaction period, it uses the last revision that was fetched before compaction period, from the revision records that were collected every 5-minute.
- When `--auto-compaction-retention=10`, compactor uses revision 100 for compact revision where revision 100 is the latest revision fetched from 10 hours ago.
- If compaction succeeds or requested revision has already been compacted, it resets period timer and starts over with new historical revision records (e.g. restart revision collect and compact for the next 10-hour period).
- If compaction fails, it retries in 5 minutes.
### client v3
- Add `SetEndpoints` method; update endpoints at runtime.
- Add `Sync` method; auto-update endpoints at runtime.
- Add `Lease TimeToLive` API; fetch lease information.
- replace Config.Logger field with global logger.
- Get API responses are sorted in ascending order by default.
### etcdctl v3
- Add `lease timetolive` command.
- Add `--print-value-only` flag to get command.
- Add `--dest-prefix` flag to make-mirror command.
- `get` command responses are sorted in ascending order by default.
### gRPC Proxy
- Experimental gRPC proxy feature.
### Other
- `recipes` now conform to sessions defined in `clientv3/concurrency`.
- ACI has symlinks to `/usr/local/bin/etcd*`.
### Go
- Compile with [*Go 1.7.4*](https://golang.org/doc/devel/release.html#go1.7).

View File

@ -1,665 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.1](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.1.md).
## [v3.2.24](https://github.com/coreos/etcd/releases/tag/v3.2.24) (2018-07-24)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.23...v3.2.24) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Improved
- Improve [Raft Read Index timeout warning messages](https://github.com/coreos/etcd/pull/9897).
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-2) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_go_version`](https://github.com/coreos/etcd/pull/9957) Prometheus metric.
- Add [`etcd_server_heartbeat_send_failures_total`](https://github.com/coreos/etcd/pull/9942) Prometheus metric.
- Add [`etcd_server_slow_apply_total`](https://github.com/coreos/etcd/pull/9942) Prometheus metric.
- Add [`etcd_disk_backend_defrag_duration_seconds`](https://github.com/coreos/etcd/pull/9942) Prometheus metric.
- Add [`etcd_mvcc_hash_duration_seconds`](https://github.com/coreos/etcd/pull/9942) Prometheus metric.
- Add [`etcd_server_slow_read_indexes_total`](https://github.com/coreos/etcd/pull/9897) Prometheus metric.
- Add [`etcd_server_quota_backend_bytes`](https://github.com/coreos/etcd/pull/9820) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
- Add [`etcd_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819) Prometheus metric.
- In addition to [`etcd_debugging_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819).
- Add [`etcd_mvcc_db_total_size_in_use_in_bytes`](https://github.com/coreos/etcd/pull/9256) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_server_quota_backend_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
### gRPC Proxy
- Add [flags for specifying TLS for connecting to proxy](https://github.com/coreos/etcd/pull/9894):
- Add `grpc-proxy start --cert-file`, `grpc-proxy start --key-file` and `grpc-proxy start --trusted-ca-file` flags.
- Add [`grpc-proxy start --metrics-addr` flag for specifying a separate metrics listen address](https://github.com/coreos/etcd/pull/9894).
### client v3
- Fix [lease keepalive interval updates when response queue is full](https://github.com/coreos/etcd/pull/9952).
- If `<-chan *clientv3LeaseKeepAliveResponse` from `clientv3.Lease.KeepAlive` was never consumed or channel is full, client was [sending keepalive request every 500ms](https://github.com/coreos/etcd/issues/9911) instead of expected rate of every "TTL / 3" duration.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.23](https://github.com/coreos/etcd/releases/tag/v3.2.23) (2018-06-15)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.22...v3.2.23) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Improved
- Improve [slow request apply warning log](https://github.com/coreos/etcd/pull/9288).
- e.g. `read-only range request "key:\"/a\" range_end:\"/b\" " with result "range_response_count:3 size:96" took too long (97.966µs) to execute`.
- Redact [request value field](https://github.com/coreos/etcd/pull/9822).
- Provide [response size](https://github.com/coreos/etcd/pull/9826).
- Add [backoff on watch retries on transient errors](https://github.com/coreos/etcd/pull/9840).
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-2) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_version`](https://github.com/coreos/etcd/pull/8960) Prometheus metric.
- To replace [Kubernetes `etcd-version-monitor`](https://github.com/coreos/etcd/issues/8948).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.22](https://github.com/coreos/etcd/releases/tag/v3.2.22) (2018-06-06)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.21...v3.2.22) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Security, Authentication
- Support TLS cipher suite whitelisting.
- To block [weak cipher suites](https://github.com/coreos/etcd/issues/8320).
- TLS handshake fails when client hello is requested with invalid cipher suites.
- Add [`etcd --cipher-suites`](https://github.com/coreos/etcd/pull/9801) flag.
- If empty, Go auto-populates the list.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.21](https://github.com/coreos/etcd/releases/tag/v3.2.21) (2018-05-31)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.20...v3.2.21) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Fix [auth storage panic when simple token provider is disabled](https://github.com/coreos/etcd/pull/8695).
- Fix [`mvcc` server panic from restore operation](https://github.com/coreos/etcd/pull/9775).
- Let's assume that a watcher had been requested with a future revision X and sent to node A that became network-partitioned thereafter. Meanwhile, cluster makes progress. Then when the partition gets removed, the leader sends a snapshot to node A. Previously if the snapshot's latest revision is still lower than the watch revision X, **etcd server panicked** during snapshot restore operation.
- Now, this server-side panic has been fixed.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.20](https://github.com/coreos/etcd/releases/tag/v3.2.20) (2018-05-09)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.19...v3.2.20) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Purge old [`*.snap.db` snapshot files](https://github.com/coreos/etcd/pull/7967).
- Previously, etcd did not respect `--max-snapshots` flag to purge old `*.snap.db` files.
- Now, etcd purges old `*.snap.db` files to keep maximum `--max-snapshots` number of files on disk.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.19](https://github.com/coreos/etcd/releases/tag/v3.2.19) (2018-04-24)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.18...v3.2.19) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-2) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Fix [`etcd_debugging_server_lease_expired_total`](https://github.com/coreos/etcd/pull/9557) Prometheus metric.
- Fix [race conditions in v2 server stat collecting](https://github.com/coreos/etcd/pull/9562).
- Add [`etcd_server_is_leader`](https://github.com/coreos/etcd/pull/9587) Prometheus metric.
### Security, Authentication
- Fix [TLS reload](https://github.com/coreos/etcd/pull/9570) when [certificate SAN field only includes IP addresses but no domain names](https://github.com/coreos/etcd/issues/9541).
- In Go, server calls `(*tls.Config).GetCertificate` for TLS reload if and only if server's `(*tls.Config).Certificates` field is not empty, or `(*tls.ClientHelloInfo).ServerName` is not empty with a valid SNI from the client. Previously, etcd always populates `(*tls.Config).Certificates` on the initial client TLS handshake, as non-empty. Thus, client was always expected to supply a matching SNI in order to pass the TLS verification and to trigger `(*tls.Config).GetCertificate` to reload TLS assets.
- However, a certificate whose SAN field does [not include any domain names but only IP addresses](https://github.com/coreos/etcd/issues/9541) would request `*tls.ClientHelloInfo` with an empty `ServerName` field, thus failing to trigger the TLS reload on initial TLS handshake; this becomes a problem when expired certificates need to be replaced online.
- Now, `(*tls.Config).Certificates` is created empty on initial TLS client handshake, first to trigger `(*tls.Config).GetCertificate`, and then to populate rest of the certificates on every new TLS connection, even when client SNI is empty (e.g. cert only includes IPs).
### etcd server
- Add [`etcd --initial-election-tick-advance`](https://github.com/coreos/etcd/pull/9591) flag to configure initial election tick fast-forward.
- By default, `etcd --initial-election-tick-advance=true`, then local member fast-forwards election ticks to speed up "initial" leader election trigger.
- This benefits the case of larger election ticks. For instance, cross datacenter deployment may require longer election timeout of 10-second. If true, local node does not need wait up to 10-second. Instead, forwards its election ticks to 8-second, and have only 2-second left before leader election.
- Major assumptions are that: cluster has no active leader thus advancing ticks enables faster leader election. Or cluster already has an established leader, and rejoining follower is likely to receive heartbeats from the leader after tick advance and before election timeout.
- However, when network from leader to rejoining follower is congested, and the follower does not receive leader heartbeat within left election ticks, disruptive election has to happen thus affecting cluster availabilities.
- Now, this can be disabled by setting `--initial-election-tick-advance=false`.
- Disabling this would slow down initial bootstrap process for cross datacenter deployments. Make tradeoffs by configuring `--initial-election-tick-advance` at the cost of slow initial bootstrap.
- If single-node, it advances ticks regardless.
- Address [disruptive rejoining follower node](https://github.com/coreos/etcd/issues/9333).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.18](https://github.com/coreos/etcd/releases/tag/v3.2.18) (2018-03-29)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.17...v3.2.18) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Improved
- Adjust [election timeout on server restart](https://github.com/coreos/etcd/pull/9415) to reduce [disruptive rejoining servers](https://github.com/coreos/etcd/issues/9333).
- Previously, etcd fast-forwards election ticks on server start, with only one tick left for leader election. This is to speed up start phase, without having to wait until all election ticks elapse. Advancing election ticks is useful for cross datacenter deployments with larger election timeouts. However, it was affecting cluster availability if the last tick elapses before leader contacts the restarted node.
- Now, when etcd restarts, it adjusts election ticks with more than one tick left, thus more time for leader to prevent disruptive restart.
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-2) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add missing [`etcd_network_peer_sent_failures_total` count](https://github.com/coreos/etcd/pull/9437).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.17](https://github.com/coreos/etcd/releases/tag/v3.2.17) (2018-03-08)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.16...v3.2.17) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Fix [server panic on invalid Election Proclaim/Resign HTTP(S) requests](https://github.com/coreos/etcd/pull/9379).
- Previously, wrong-formatted HTTP requests to Election API could trigger panic in etcd server.
- e.g. `curl -L http://localhost:2379/v3/election/proclaim -X POST -d '{"value":""}'`, `curl -L http://localhost:2379/v3/election/resign -X POST -d '{"value":""}'`.
- Prevent [overflow by large `TTL` values for `Lease` `Grant`](https://github.com/coreos/etcd/pull/9399).
- `TTL` parameter to `Grant` request is unit of second.
- Leases with too large `TTL` values exceeding `math.MaxInt64` [expire in unexpected ways](https://github.com/coreos/etcd/issues/9374).
- Server now returns `rpctypes.ErrLeaseTTLTooLarge` to client, when the requested `TTL` is larger than *9,000,000,000 seconds* (which is >285 years).
- Again, etcd `Lease` is meant for short-periodic keepalives or sessions, in the range of seconds or minutes. Not for hours or days!
- Enable etcd server [`raft.Config.CheckQuorum` when starting with `ForceNewCluster`](https://github.com/coreos/etcd/pull/9347).
### Proxy v2
- Fix [v2 proxy leaky HTTP requests](https://github.com/coreos/etcd/pull/9336).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.16](https://github.com/coreos/etcd/releases/tag/v3.2.16) (2018-02-12)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.15...v3.2.16) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Fix [`mvcc` "unsynced" watcher restore operation](https://github.com/coreos/etcd/pull/9297).
- "unsynced" watcher is watcher that needs to be in sync with events that have happened.
- That is, "unsynced" watcher is the slow watcher that was requested on old revision.
- "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
- Which possibly causes [missing events from "unsynced" watchers](https://github.com/coreos/etcd/issues/9086).
- A node gets network partitioned with a watcher on a future revision, and falls behind receiving a leader snapshot after partition gets removed. When applying this snapshot, etcd watch storage moves current synced watchers to unsynced since sync watchers might have become stale during network partition. And reset synced watcher group to restart watcher routines. Previously, there was a bug when moving from synced watcher group to unsynced, thus client would miss events when the watcher was requested to the network-partitioned node.
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.15](https://github.com/coreos/etcd/releases/tag/v3.2.15) (2018-01-22)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.14...v3.2.15) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Prevent [server panic from member update/add](https://github.com/coreos/etcd/pull/9174) with [wrong scheme URLs](https://github.com/coreos/etcd/issues/9173).
- Log [user context cancel errors on stream APIs in debug level with TLS](https://github.com/coreos/etcd/pull/9178).
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.14](https://github.com/coreos/etcd/releases/tag/v3.2.14) (2018-01-11)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.13...v3.2.14) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Improved
- Log [user context cancel errors on stream APIs in debug level](https://github.com/coreos/etcd/pull/9105).
### etcd server
- Fix [`mvcc/backend.defragdb` nil-pointer dereference on create bucket failure](https://github.com/coreos/etcd/pull/9119).
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.13](https://github.com/coreos/etcd/releases/tag/v3.2.13) (2018-01-02)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.12...v3.2.13) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Remove [verbose error messages on stream cancel and gRPC info-level logs](https://github.com/coreos/etcd/pull/9080) in server-side.
- Fix [gRPC server panic on `GracefulStop` TLS-enabled server](https://github.com/coreos/etcd/pull/8987).
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.12](https://github.com/coreos/etcd/releases/tag/v3.2.12) (2017-12-20)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.11...v3.2.12) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Dependency
- Upgrade [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases/tag) from [**`v1.7.4`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.4) to [**`v1.7.5`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.5).
- Upgrade [`github.com/grpc-ecosystem/grpc-gateway`](https://github.com/grpc-ecosystem/grpc-gateway/releases) from [**`v1.3`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.3) to [**`v1.3.0`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.3.0).
### etcd server
- Fix [error message of `Revision` compactor](https://github.com/coreos/etcd/pull/8999) in server-side.
### client v3
- Add [`MaxCallSendMsgSize` and `MaxCallRecvMsgSize`](https://github.com/coreos/etcd/pull/9047) fields to [`clientv3.Config`](https://godoc.org/github.com/coreos/etcd/clientv3#Config).
- Fix [exceeded response size limit error in client-side](https://github.com/coreos/etcd/issues/9043).
- Address [kubernetes#51099](https://github.com/kubernetes/kubernetes/issues/51099).
- In previous versions(v3.2.10, v3.2.11), client response size was limited to only 4 MiB.
- `MaxCallSendMsgSize` default value is 2 MiB, if not configured.
- `MaxCallRecvMsgSize` default value is `math.MaxInt32`, if not configured.
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.11](https://github.com/coreos/etcd/releases/tag/v3.2.11) (2017-12-05)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.10...v3.2.11) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Dependency
- Upgrade [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases/tag) from [**`v1.7.3`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.3) to [**`v1.7.4`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.4).
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- Log [more details on TLS handshake failures](https://github.com/coreos/etcd/pull/8952/files).
### client v3
- Fix racey grpc-go's server handler transport `WriteStatus` call to prevent [TLS-enabled etcd server crash](https://github.com/coreos/etcd/issues/8904).
- Add [gRPC RPC failure warnings](https://github.com/coreos/etcd/pull/8939) to help debug such issues in the future.
### Documentation
- Remove `--listen-metrics-urls` flag in monitoring document (non-released in `v3.2.x`, planned for `v3.3.x`).
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.10](https://github.com/coreos/etcd/releases/tag/v3.2.10) (2017-11-16)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.9...v3.2.10) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Dependency
- Upgrade [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases/tag) from [**`v1.2.1`**](https://github.com/grpc/grpc-go/releases/tag/v1.2.1) to [**`v1.7.3`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.3).
- Upgrade [`github.com/grpc-ecosystem/grpc-gateway`](https://github.com/grpc-ecosystem/grpc-gateway/releases) from [**`v1.2.0`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.2.0) to [**`v1.3`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.3).
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- Revert [discovery SRV auth `ServerName` with `*.{ROOT_DOMAIN}`](https://github.com/coreos/etcd/pull/8651) to support non-wildcard subject alternative names in the certs (see [issue #8445](https://github.com/coreos/etcd/issues/8445) for more contexts).
- For instance, `etcd --discovery-srv=etcd.local` will only authenticate peers/clients when the provided certs have root domain `etcd.local` (**not `*.etcd.local`**) as an entry in Subject Alternative Name (SAN) field.
### etcd server
- Replace backend key-value database `boltdb/bolt` with [`coreos/bbolt`](https://github.com/coreos/bbolt/releases) to address [backend database size issue](https://github.com/coreos/etcd/issues/8009).
### client v3
- Rewrite balancer to handle [network partitions](https://github.com/coreos/etcd/issues/8711).
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.9](https://github.com/coreos/etcd/releases/tag/v3.2.9) (2017-10-06)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.8...v3.2.9) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- Update `golang.org/x/crypto/bcrypt` (see [golang/crypto@6c586e1](https://github.com/golang/crypto/commit/6c586e17d90a7d08bbbc4069984180dce3b04117)).
- Fix discovery SRV bootstrapping to [authenticate `ServerName` with `*.{ROOT_DOMAIN}`](https://github.com/coreos/etcd/pull/8651), in order to support sub-domain wildcard matching (see [issue #8445](https://github.com/coreos/etcd/issues/8445) for more contexts).
- For instance, `etcd --discovery-srv=etcd.local` will only authenticate peers/clients when the provided certs have root domain `*.etcd.local` as an entry in Subject Alternative Name (SAN) field.
### Go
- Compile with [*Go 1.8.4*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.8](https://github.com/coreos/etcd/releases/tag/v3.2.8) (2017-09-29)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.7...v3.2.8) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### client v2
- Fix v2 client failover to next endpoint on mutable operation.
### gRPC Proxy
- Handle [`KeysOnly` flag](https://github.com/coreos/etcd/pull/8552).
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.7](https://github.com/coreos/etcd/releases/tag/v3.2.7) (2017-09-01)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.6...v3.2.7) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Security, Authentication
- Fix [server-side auth so concurrent auth operations do not return old revision error](https://github.com/coreos/etcd/pull/8306).
### client v3
- Fix [`concurrency/stm` Put with serializable snapshot](https://github.com/coreos/etcd/pull/8439).
- Use store revision from first fetch to resolve write conflicts instead of modified revision.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.6](https://github.com/coreos/etcd/releases/tag/v3.2.6) (2017-08-21)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.5...v3.2.6) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Fix watch restore from snapshot.
- Fix multiple URLs for `--listen-peer-urls` flag.
- Add `--enable-pprof` flag to etcd configuration file format.
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-2) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Fix `etcd_debugging_mvcc_keys_total` inconsistency.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.5](https://github.com/coreos/etcd/releases/tag/v3.2.5) (2017-08-04)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.4...v3.2.5) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcdctl v3
- Return non-zero exit code on unhealthy `endpoint health`.
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- [Server supports reverse-lookup on wildcard DNS `SAN`](https://github.com/coreos/etcd/pull/8281). For instance, if peer cert contains only DNS names (no IP addresses) in Subject Alternative Name (SAN) field, server first reverse-lookups the remote IP address to get a list of names mapping to that address (e.g. `nslookup IPADDR`). Then accepts the connection if those names have a matching name with peer cert's DNS names (either by exact or wildcard match). If none is matched, server forward-lookups each DNS entry in peer cert (e.g. look up `example.default.svc` when the entry is `*.example.default.svc`), and accepts connection only when the host's resolved addresses have the matching IP address with the peer's remote IP address. For example, peer B's CSR (with `cfssl`) SAN field is `["*.example.default.svc", "*.example.default.svc.cluster.local"]` when peer B's remote IP address is `10.138.0.2`. When peer B tries to join the cluster, peer A reverse-lookup the IP `10.138.0.2` to get the list of host names. And either exact or wildcard match the host names with peer B's cert DNS names in Subject Alternative Name (SAN) field. If none of reverse/forward lookups worked, it returns an error `"tls: "10.138.0.2" does not match any of DNSNames ["*.example.default.svc","*.example.default.svc.cluster.local"]`. See [issue#8268](https://github.com/coreos/etcd/issues/8268) for more detail.
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-2) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Fix unreachable `/metrics` endpoint when `--enable-v2=false`.
### gRPC Proxy
- Handle [`PrevKv` flag](https://github.com/coreos/etcd/pull/8366).
### Other
- Add container registry `gcr.io/etcd-development/etcd`.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.4](https://github.com/coreos/etcd/releases/tag/v3.2.4) (2017-07-19)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.3...v3.2.4) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Do not block on active client stream when stopping server
### gRPC proxy
- Fix gRPC proxy Snapshot RPC error handling
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.3](https://github.com/coreos/etcd/releases/tag/v3.2.3) (2017-07-14)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.2...v3.2.3) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### client v3
- Let clients establish unlimited streams
### Other
- Tag docker images with minor versions
- e.g. `docker pull quay.io/coreos/etcd:v3.2` to fetch latest v3.2 versions
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.2](https://github.com/coreos/etcd/releases/tag/v3.2.2) (2017-07-07)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.1...v3.2.2) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Improved
- Rate-limit lease revoke on expiration.
- Extend leases on promote to avoid queueing effect on lease expiration.
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- [Server accepts connections if IP matches, without checking DNS entries](https://github.com/coreos/etcd/pull/8223). For instance, if peer cert contains IP addresses and DNS names in Subject Alternative Name (SAN) field, and the remote IP address matches one of those IP addresses, server just accepts connection without further checking the DNS names. For example, peer B's CSR (with `cfssl`) SAN field is `["invalid.domain", "10.138.0.2"]` when peer B's remote IP address is `10.138.0.2` and `invalid.domain` is a invalid host. When peer B tries to join the cluster, peer A successfully authenticates B, since Subject Alternative Name (SAN) field has a valid matching IP address. See [issue#8206](https://github.com/coreos/etcd/issues/8206) for more detail.
### etcd server
- Accept connection with matched IP SAN but no DNS match.
- Don't check DNS entries in certs if there's a matching IP.
### gRPC gateway
- Use user-provided listen address to connect to gRPC gateway.
- `net.Listener` rewrites IPv4 0.0.0.0 to IPv6 [::], breaking IPv6 disabled hosts.
- Only v3.2.0, v3.2.1 are affected.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.1](https://github.com/coreos/etcd/releases/tag/v3.2.1) (2017-06-23)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.0...v3.2.1) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### etcd server
- Fix backend database in-memory index corruption issue on restore (only 3.2.0 is affected).
### gRPC gateway
- Fix Txn marshaling.
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-2) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Fix backend database size debugging metrics.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
## [v3.2.0](https://github.com/coreos/etcd/releases/tag/v3.2.0) (2017-06-09)
See [code changes](https://github.com/coreos/etcd/compare/v3.1.0...v3.2.0) and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.2 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md).**
### Improved
- Improve backend read concurrency.
### Breaking Changes
- Increased [`--snapshot-count` default value from 10,000 to 100,000](https://github.com/coreos/etcd/pull/7160).
- Higher snapshot count means it holds Raft entries in memory for longer before discarding old entries.
- It is a trade-off between less frequent snapshotting and [higher memory usage](https://github.com/kubernetes/kubernetes/issues/60589#issuecomment-371977156).
- User lower `--snapshot-count` value for lower memory usage.
- User higher `--snapshot-count` value for better availabilities of slow followers (less frequent snapshots from leader).
- `clientv3.Lease.TimeToLive` returns `LeaseTimeToLiveResponse.TTL == -1` on lease not found.
- `clientv3.NewFromConfigFile` is moved to `clientv3/yaml.NewConfig`.
- `embed.Etcd.Peers` field is now `[]*peerListener`.
- Rejects domains names for `--listen-peer-urls` and `--listen-client-urls` (3.1 only prints out warnings), since [domain name is invalid for network interface binding](https://github.com/coreos/etcd/issues/6336).
### Dependency
- Upgrade [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases) from [**`v1.0.4`**](https://github.com/grpc/grpc-go/releases/tag/v1.0.4) to [**`v1.2.1`**](https://github.com/grpc/grpc-go/releases/tag/v1.2.1).
- Upgrade [`github.com/grpc-ecosystem/grpc-gateway`](https://github.com/grpc-ecosystem/grpc-gateway/releases) to [**`v1.2.0`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.2.0).
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-2) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_disk_backend_snapshot_duration_seconds`](https://github.com/coreos/etcd/pull/7892)
- Add `etcd_debugging_server_lease_expired_total` metrics.
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- [TLS certificates get reloaded on every client connection](https://github.com/coreos/etcd/pull/7829). This is useful when replacing expiry certs without stopping etcd servers; it can be done by overwriting old certs with new ones. Refreshing certs for every connection should not have too much overhead, but can be improved in the future, with caching layer. Example tests can be found [here](https://github.com/coreos/etcd/blob/b041ce5d514a4b4aaeefbffb008f0c7570a18986/integration/v3_grpc_test.go#L1601-L1757).
- [Server denies incoming peer certs with wrong IP `SAN`](https://github.com/coreos/etcd/pull/7687). For instance, if peer cert contains any IP addresses in Subject Alternative Name (SAN) field, server authenticates a peer only when the remote IP address matches one of those IP addresses. This is to prevent unauthorized endpoints from joining the cluster. For example, peer B's CSR (with `cfssl`) SAN field is `["*.example.default.svc", "*.example.default.svc.cluster.local", "10.138.0.27"]` when peer B's actual IP address is `10.138.0.2`, not `10.138.0.27`. When peer B tries to join the cluster, peer A will reject B with the error `x509: certificate is valid for 10.138.0.27, not 10.138.0.2`, because B's remote IP address does not match the one in Subject Alternative Name (SAN) field.
- [Server resolves TLS `DNSNames` when checking `SAN`](https://github.com/coreos/etcd/pull/7767). For instance, if peer cert contains only DNS names (no IP addresses) in Subject Alternative Name (SAN) field, server authenticates a peer only when forward-lookups (`dig b.com`) on those DNS names have matching IP with the remote IP address. For example, peer B's CSR (with `cfssl`) SAN field is `["b.com"]` when peer B's remote IP address is `10.138.0.2`. When peer B tries to join the cluster, peer A looks up the incoming host `b.com` to get the list of IP addresses (e.g. `dig b.com`). And rejects B if the list does not contain the IP `10.138.0.2`, with the error `tls: 10.138.0.2 does not match any of DNSNames ["b.com"]`.
- Auth support JWT token.
### etcd server
- RPCs
- Add Election, Lock service.
- Native client `etcdserver/api/v3client`
- client "embedded" in the server.
- Logging, monitoring
- Server warns large snapshot operations.
- Add `etcd --enable-v2` flag to enable v2 API server.
- `etcd --enable-v2=true` by default.
- Add `etcd --auth-token` flag.
- v3.2 compactor runs [every hour](https://github.com/coreos/etcd/pull/7875).
- Compactor only supports periodic compaction.
- Compactor continues to record latest revisions every 5-minute.
- For every hour, it uses the last revision that was fetched before compaction period, from the revision records that were collected every 5-minute.
- That is, for every hour, compactor discards historical data created before compaction period.
- The retention window of compaction period moves to next hour.
- For instance, when hourly writes are 100 and `--auto-compaction-retention=10`, v3.1 compacts revision 1000, 2000, and 3000 for every 10-hour, while v3.2 compacts revision 1000, 1100, and 1200 for every 1-hour.
- If compaction succeeds or requested revision has already been compacted, it resets period timer and removes used compacted revision from historical revision records (e.g. start next revision collect and compaction from previously collected revisions).
- If compaction fails, it retries in 5 minutes.
- Allow snapshot over 512MB.
### client v3
- STM prefetching.
- Add namespace feature.
- Add `ErrOldCluster` with server version checking.
- Translate `WithPrefix()` into `WithFromKey()` for empty key.
### etcdctl v3
- Add `check perf` command.
- Add `etcdctl --from-key` flag to role grant-permission command.
- `lock` command takes an optional command to execute.
### gRPC Proxy
- Proxy endpoint discovery.
- Namespaces.
- Coalesce lease requests.
### etcd gateway
- Support [DNS SRV priority](https://github.com/coreos/etcd/pull/7882) for [smart proxy routing](https://github.com/coreos/etcd/issues/4378).
### Other
- v3 client
- concurrency package's elections updated to match RPC interfaces.
- let client dial endpoints not in the balancer.
- Release
- Annotate acbuild with supports-systemd-notify.
- Add `nsswitch.conf` to Docker container image.
- Add ppc64le, arm64(experimental) builds.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).

View File

@ -1,521 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.2](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.2.md).
## [v3.3.9](https://github.com/coreos/etcd/releases/tag/v3.3.9) (2018-07-24)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.8...v3.3.9) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### Improved
- Improve [Raft Read Index timeout warning messages](https://github.com/coreos/etcd/pull/9897).
### Security, Authentication
- Compile with [*Go 1.10.3*](https://golang.org/doc/devel/release.html#go1.10) to support [crypto/x509 "Name Constraints"](https://github.com/coreos/etcd/issues/9912).
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-3) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_go_version`](https://github.com/coreos/etcd/pull/9957) Prometheus metric.
- Add [`etcd_server_heartbeat_send_failures_total`](https://github.com/coreos/etcd/pull/9940) Prometheus metric.
- Add [`etcd_server_slow_apply_total`](https://github.com/coreos/etcd/pull/9940) Prometheus metric.
- Add [`etcd_disk_backend_defrag_duration_seconds`](https://github.com/coreos/etcd/pull/9940) Prometheus metric.
- Add [`etcd_mvcc_hash_duration_seconds`](https://github.com/coreos/etcd/pull/9940) Prometheus metric.
- Add [`etcd_mvcc_hash_rev_duration_seconds`](https://github.com/coreos/etcd/pull/9940) Prometheus metric.
- Add [`etcd_server_slow_read_indexes_total`](https://github.com/coreos/etcd/pull/9897) Prometheus metric.
- Add [`etcd_server_quota_backend_bytes`](https://github.com/coreos/etcd/pull/9820) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
- Add [`etcd_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819) Prometheus metric.
- In addition to [`etcd_debugging_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819).
- Add [`etcd_mvcc_db_total_size_in_use_in_bytes`](https://github.com/coreos/etcd/pull/9256) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
### client v3
- Fix [lease keepalive interval updates when response queue is full](https://github.com/coreos/etcd/pull/9952).
- If `<-chan *clientv3LeaseKeepAliveResponse` from `clientv3.Lease.KeepAlive` was never consumed or channel is full, client was [sending keepalive request every 500ms](https://github.com/coreos/etcd/issues/9911) instead of expected rate of every "TTL / 3" duration.
### Go
- Compile with [*Go 1.10.3*](https://golang.org/doc/devel/release.html#go1.10).
## [v3.3.8](https://github.com/coreos/etcd/releases/tag/v3.3.8) (2018-06-15)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.7...v3.3.8) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### Improved
- Improve [slow request apply warning log](https://github.com/coreos/etcd/pull/9288).
- e.g. `read-only range request "key:\"/a\" range_end:\"/b\" " with result "range_response_count:3 size:96" took too long (97.966µs) to execute`.
- Redact [request value field](https://github.com/coreos/etcd/pull/9822).
- Provide [response size](https://github.com/coreos/etcd/pull/9826).
- Add [backoff on watch retries on transient errors](https://github.com/coreos/etcd/pull/9840).
### Go
- Compile with [*Go 1.9.7*](https://golang.org/doc/devel/release.html#go1.9).
## [v3.3.7](https://github.com/coreos/etcd/releases/tag/v3.3.7) (2018-06-06)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.6...v3.3.7) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### Security, Authentication
- Support TLS cipher suite whitelisting.
- To block [weak cipher suites](https://github.com/coreos/etcd/issues/8320).
- TLS handshake fails when client hello is requested with invalid cipher suites.
- Add [`etcd --cipher-suites`](https://github.com/coreos/etcd/pull/9801) flag.
- If empty, Go auto-populates the list.
### etcdctl v3
- Fix [`etcdctl move-leader` command for TLS-enabled endpoints](https://github.com/coreos/etcd/pull/9807).
### Go
- Compile with [*Go 1.9.6*](https://golang.org/doc/devel/release.html#go1.9).
## [v3.3.6](https://github.com/coreos/etcd/releases/tag/v3.3.6) (2018-05-31)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.5...v3.3.6) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### etcd server
- Allow [empty auth token](https://github.com/coreos/etcd/pull/9369).
- Previously, when auth token is an empty string, it returns [`failed to initialize the etcd server: auth: invalid auth options` error](https://github.com/coreos/etcd/issues/9349).
- Fix [auth storage panic on server lease revoke routine with JWT token](https://github.com/coreos/etcd/issues/9695).
- Fix [`mvcc` server panic from restore operation](https://github.com/coreos/etcd/pull/9775).
- Let's assume that a watcher had been requested with a future revision X and sent to node A that became network-partitioned thereafter. Meanwhile, cluster makes progress. Then when the partition gets removed, the leader sends a snapshot to node A. Previously if the snapshot's latest revision is still lower than the watch revision X, **etcd server panicked** during snapshot restore operation.
- Now, this server-side panic has been fixed.
### Go
- Compile with [*Go 1.9.6*](https://golang.org/doc/devel/release.html#go1.9).
## [v3.3.5](https://github.com/coreos/etcd/releases/tag/v3.3.5) (2018-05-09)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.4...v3.3.5) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### etcdctl v3
- Fix [`etcdctl watch [key] [range_end] -- [exec-command…]`](https://github.com/coreos/etcd/pull/9688) parsing.
- Previously, `ETCDCTL_API=3 ./bin/etcdctl watch foo -- echo watch event received` panicked.
### Go
- Compile with [*Go 1.9.6*](https://golang.org/doc/devel/release.html#go1.9).
## [v3.3.4](https://github.com/coreos/etcd/releases/tag/v3.3.4) (2018-04-24)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.3...v3.3.4) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-3) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_is_leader`](https://github.com/coreos/etcd/pull/9587) Prometheus metric.
- Fix [`etcd_debugging_server_lease_expired_total`](https://github.com/coreos/etcd/pull/9557) Prometheus metric.
- Fix [race conditions in v2 server stat collecting](https://github.com/coreos/etcd/pull/9562).
### Security, Authentication
- Fix [TLS reload](https://github.com/coreos/etcd/pull/9570) when [certificate SAN field only includes IP addresses but no domain names](https://github.com/coreos/etcd/issues/9541).
- In Go, server calls `(*tls.Config).GetCertificate` for TLS reload if and only if server's `(*tls.Config).Certificates` field is not empty, or `(*tls.ClientHelloInfo).ServerName` is not empty with a valid SNI from the client. Previously, etcd always populates `(*tls.Config).Certificates` on the initial client TLS handshake, as non-empty. Thus, client was always expected to supply a matching SNI in order to pass the TLS verification and to trigger `(*tls.Config).GetCertificate` to reload TLS assets.
- However, a certificate whose SAN field does [not include any domain names but only IP addresses](https://github.com/coreos/etcd/issues/9541) would request `*tls.ClientHelloInfo` with an empty `ServerName` field, thus failing to trigger the TLS reload on initial TLS handshake; this becomes a problem when expired certificates need to be replaced online.
- Now, `(*tls.Config).Certificates` is created empty on initial TLS client handshake, first to trigger `(*tls.Config).GetCertificate`, and then to populate rest of the certificates on every new TLS connection, even when client SNI is empty (e.g. cert only includes IPs).
### etcd server
- Add [`etcd --initial-election-tick-advance`](https://github.com/coreos/etcd/pull/9591) flag to configure initial election tick fast-forward.
- By default, `etcd --initial-election-tick-advance=true`, then local member fast-forwards election ticks to speed up "initial" leader election trigger.
- This benefits the case of larger election ticks. For instance, cross datacenter deployment may require longer election timeout of 10-second. If true, local node does not need wait up to 10-second. Instead, forwards its election ticks to 8-second, and have only 2-second left before leader election.
- Major assumptions are that: cluster has no active leader thus advancing ticks enables faster leader election. Or cluster already has an established leader, and rejoining follower is likely to receive heartbeats from the leader after tick advance and before election timeout.
- However, when network from leader to rejoining follower is congested, and the follower does not receive leader heartbeat within left election ticks, disruptive election has to happen thus affecting cluster availabilities.
- Now, this can be disabled by setting `--initial-election-tick-advance=false`.
- Disabling this would slow down initial bootstrap process for cross datacenter deployments. Make tradeoffs by configuring `etcd --initial-election-tick-advance` at the cost of slow initial bootstrap.
- If single-node, it advances ticks regardless.
- Address [disruptive rejoining follower node](https://github.com/coreos/etcd/issues/9333).
### Package `embed`
- Add [`embed.Config.InitialElectionTickAdvance`](https://github.com/coreos/etcd/pull/9591) to enable/disable initial election tick fast-forward.
- `embed.NewConfig()` would return `*embed.Config` with `InitialElectionTickAdvance` as true by default.
### Go
- Compile with [*Go 1.9.5*](https://golang.org/doc/devel/release.html#go1.9).
## [v3.3.3](https://github.com/coreos/etcd/releases/tag/v3.3.3) (2018-03-29)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.2...v3.3.3) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### Improved
- Adjust [election timeout on server restart](https://github.com/coreos/etcd/pull/9415) to reduce [disruptive rejoining servers](https://github.com/coreos/etcd/issues/9333).
- Previously, etcd fast-forwards election ticks on server start, with only one tick left for leader election. This is to speed up start phase, without having to wait until all election ticks elapse. Advancing election ticks is useful for cross datacenter deployments with larger election timeouts. However, it was affecting cluster availability if the last tick elapses before leader contacts the restarted node.
- Now, when etcd restarts, it adjusts election ticks with more than one tick left, thus more time for leader to prevent disruptive restart.
- Adjust [periodic compaction retention window](https://github.com/coreos/etcd/pull/9485).
- e.g. `etcd --auto-compaction-mode=revision --auto-compaction-retention=1000` automatically `Compact` on `"latest revision" - 1000` every 5-minute (when latest revision is 30000, compact on revision 29000).
- e.g. Previously, `etcd --auto-compaction-mode=periodic --auto-compaction-retention=72h` automatically `Compact` with 72-hour retention windown for every 7.2-hour. **Now, `Compact` happens, for every 1-hour but still with 72-hour retention window.**
- e.g. Previously, `etcd --auto-compaction-mode=periodic --auto-compaction-retention=30m` automatically `Compact` with 30-minute retention windown for every 3-minute. **Now, `Compact` happens, for every 30-minute but still with 30-minute retention window.**
- Periodic compactor keeps recording latest revisions for every compaction period when given period is less than 1-hour, or for every 1-hour when given compaction period is greater than 1-hour (e.g. 1-hour when `etcd --auto-compaction-mode=periodic --auto-compaction-retention=24h`).
- For every compaction period or 1-hour, compactor uses the last revision that was fetched before compaction period, to discard historical data.
- The retention window of compaction period moves for every given compaction period or hour.
- For instance, when hourly writes are 100 and `etcd --auto-compaction-mode=periodic --auto-compaction-retention=24h`, `v3.2.x`, `v3.3.0`, `v3.3.1`, and `v3.3.2` compact revision 2400, 2640, and 2880 for every 2.4-hour, while `v3.3.3` *or later* compacts revision 2400, 2500, 2600 for every 1-hour.
- Futhermore, when `etcd --auto-compaction-mode=periodic --auto-compaction-retention=30m` and writes per minute are about 1000, `v3.3.0`, `v3.3.1`, and `v3.3.2` compact revision 30000, 33000, and 36000, for every 3-minute, while `v3.3.3` *or later* compacts revision 30000, 60000, and 90000, for every 30-minute.
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-3) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add missing [`etcd_network_peer_sent_failures_total` count](https://github.com/coreos/etcd/pull/9437).
### Go
- Compile with [*Go 1.9.5*](https://golang.org/doc/devel/release.html#go1.9).
## [v3.3.2](https://github.com/coreos/etcd/releases/tag/v3.3.2) (2018-03-08)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.1...v3.3.2) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### etcd server
- Fix [server panic on invalid Election Proclaim/Resign HTTP(S) requests](https://github.com/coreos/etcd/pull/9379).
- Previously, wrong-formatted HTTP requests to Election API could trigger panic in etcd server.
- e.g. `curl -L http://localhost:2379/v3/election/proclaim -X POST -d '{"value":""}'`, `curl -L http://localhost:2379/v3/election/resign -X POST -d '{"value":""}'`.
- Fix [revision-based compaction retention parsing](https://github.com/coreos/etcd/pull/9339).
- Previously, `etcd --auto-compaction-mode revision --auto-compaction-retention 1` was [translated to revision retention 3600000000000](https://github.com/coreos/etcd/issues/9337).
- Now, `etcd --auto-compaction-mode revision --auto-compaction-retention 1` is correctly parsed as revision retention 1.
- Prevent [overflow by large `TTL` values for `Lease` `Grant`](https://github.com/coreos/etcd/pull/9399).
- `TTL` parameter to `Grant` request is unit of second.
- Leases with too large `TTL` values exceeding `math.MaxInt64` [expire in unexpected ways](https://github.com/coreos/etcd/issues/9374).
- Server now returns `rpctypes.ErrLeaseTTLTooLarge` to client, when the requested `TTL` is larger than *9,000,000,000 seconds* (which is >285 years).
- Again, etcd `Lease` is meant for short-periodic keepalives or sessions, in the range of seconds or minutes. Not for hours or days!
- Enable etcd server [`raft.Config.CheckQuorum` when starting with `ForceNewCluster`](https://github.com/coreos/etcd/pull/9347).
### Proxy v2
- Fix [v2 proxy leaky HTTP requests](https://github.com/coreos/etcd/pull/9336).
### Go
- Compile with [*Go 1.9.4*](https://golang.org/doc/devel/release.html#go1.9).
## [v3.3.1](https://github.com/coreos/etcd/releases/tag/v3.3.1) (2018-02-12)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.0...v3.3.1) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
### Improved
- Add [warnings on requests taking too long](https://github.com/coreos/etcd/pull/9288).
- e.g. `etcdserver: read-only range request "key:\"\\000\" range_end:\"\\000\" " took too long [3.389041388s] to execute`
### etcd server
- Fix [`mvcc` "unsynced" watcher restore operation](https://github.com/coreos/etcd/pull/9281).
- "unsynced" watcher is watcher that needs to be in sync with events that have happened.
- That is, "unsynced" watcher is the slow watcher that was requested on old revision.
- "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
- Which possibly causes [missing events from "unsynced" watchers](https://github.com/coreos/etcd/issues/9086).
- A node gets network partitioned with a watcher on a future revision, and falls behind receiving a leader snapshot after partition gets removed. When applying this snapshot, etcd watch storage moves current synced watchers to unsynced since sync watchers might have become stale during network partition. And reset synced watcher group to restart watcher routines. Previously, there was a bug when moving from synced watcher group to unsynced, thus client would miss events when the watcher was requested to the network-partitioned node.
### Go
- Compile with [*Go 1.9.4*](https://golang.org/doc/devel/release.html#go1.9).
## [v3.3.0](https://github.com/coreos/etcd/releases/tag/v3.3.0) (2018-02-01)
See [code changes](https://github.com/coreos/etcd/compare/v3.2.0...v3.3.0) and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.3 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_3.md).**
- [v3.3.0-rc.4](https://github.com/coreos/etcd/releases/tag/v3.3.0-rc.4) (2018-01-22), see [code changes](https://github.com/coreos/etcd/compare/v3.3.0-rc.3...v3.3.0-rc.4).
- [v3.3.0-rc.3](https://github.com/coreos/etcd/releases/tag/v3.3.0-rc.3) (2018-01-17), see [code changes](https://github.com/coreos/etcd/compare/v3.3.0-rc.2...v3.3.0-rc.3).
- [v3.3.0-rc.2](https://github.com/coreos/etcd/releases/tag/v3.3.0-rc.2) (2018-01-11), see [code changes](https://github.com/coreos/etcd/compare/v3.3.0-rc.1...v3.3.0-rc.2).
- [v3.3.0-rc.1](https://github.com/coreos/etcd/releases/tag/v3.3.0-rc.1) (2018-01-02), see [code changes](https://github.com/coreos/etcd/compare/v3.3.0-rc.0...v3.3.0-rc.1).
- [v3.3.0-rc.0](https://github.com/coreos/etcd/releases/tag/v3.3.0-rc.0) (2017-12-20), see [code changes](https://github.com/coreos/etcd/compare/v3.2.0...v3.3.0-rc.0).
### Improved
- Use [`coreos/bbolt`](https://github.com/coreos/bbolt/releases) to replace [`boltdb/bolt`](https://github.com/boltdb/bolt#project-status).
- Fix [etcd database size grows until `mvcc: database space exceeded`](https://github.com/coreos/etcd/issues/8009).
- [Support database size larger than 8GiB](https://github.com/coreos/etcd/pull/7525) (8GiB is now a suggested maximum size for normal environments)
- [Reduce memory allocation](https://github.com/coreos/etcd/pull/8428) on [Range operations](https://github.com/coreos/etcd/pull/8475).
- [Rate limit](https://github.com/coreos/etcd/pull/8099) and [randomize](https://github.com/coreos/etcd/pull/8101) lease revoke on restart or leader elections.
- Prevent [spikes in Raft proposal rate](https://github.com/coreos/etcd/issues/8096).
- Support `clientv3` balancer failover under [network faults/partitions](https://github.com/coreos/etcd/issues/8711).
- Better warning on [mismatched `etcd --initial-cluster`](https://github.com/coreos/etcd/pull/8083) flag.
- etcd compares `etcd --initial-advertise-peer-urls` against corresponding `etcd --initial-cluster` URLs with forward-lookup.
- If resolved IP addresses of `etcd --initial-advertise-peer-urls` and `etcd --initial-cluster` do not match (e.g. [due to DNS error](https://github.com/coreos/etcd/pull/9210)), etcd will exit with errors.
- v3.2 error: `etcd --initial-cluster must include s1=https://s1.test:2380 given --initial-advertise-peer-urls=https://s1.test:2380`.
- v3.3 error: `failed to resolve https://s1.test:2380 to match --initial-cluster=s1=https://s1.test:2380 (failed to resolve "https://s1.test:2380" (error ...))`.
### Breaking Changes
- Require [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases) [**`v1.7.4`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.4) or [**`v1.7.5`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.5).
- Deprecate [`metadata.Incoming/OutgoingContext`](https://github.com/coreos/etcd/pull/7896).
- Deprecate `grpclog.Logger`, upgrade to [`grpclog.LoggerV2`](https://github.com/coreos/etcd/pull/8533).
- Deprecate [`grpc.ErrClientConnTimeout`](https://github.com/coreos/etcd/pull/8505) errors in `clientv3`.
- Use [`MaxRecvMsgSize` and `MaxSendMsgSize`](https://github.com/coreos/etcd/pull/8437) to limit message size, in etcd server.
- Translate [gRPC status error in v3 client `Snapshot` API](https://github.com/coreos/etcd/pull/9038).
- v3 `etcdctl` [`lease timetolive LEASE_ID`](https://github.com/coreos/etcd/issues/9028) on expired lease now prints [`"lease LEASE_ID already expired"`](https://github.com/coreos/etcd/pull/9047).
- <=3.2 prints `"lease LEASE_ID granted with TTL(0s), remaining(-1s)"`.
- Replace [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) endpoint `/v3alpha` with [`/v3beta`](https://github.com/coreos/etcd/pull/8880).
- To deprecate [`/v3alpha`](https://github.com/coreos/etcd/issues/8125) in v3.4.
- In v3.3, `curl -L http://localhost:2379/v3alpha/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` still works as a fallback to `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'`, but `curl -L http://localhost:2379/v3alpha/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` won't work in v3.4. Use `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- Change `etcd --auto-compaction-retention` flag to [accept string values](https://github.com/coreos/etcd/pull/8563) with [finer granularity](https://github.com/coreos/etcd/issues/8503).
- Now that `etcd --auto-compaction-retention` accepts string values, etcd configuration YAML file `auto-compaction-retention` field must be changed to `string` type.
- Previously, `--config-file etcd.config.yaml` can have `auto-compaction-retention: 24` field, now must be `auto-compaction-retention: "24"` or `auto-compaction-retention: "24h"`.
- If configured as `etcd --auto-compaction-mode periodic --auto-compaction-retention "24h"`, the time duration value for `etcd --auto-compaction-retention` flag must be valid for [`time.ParseDuration`](https://golang.org/pkg/time/#ParseDuration) function in Go.
### Dependency
- Upgrade [`boltdb/bolt`](https://github.com/boltdb/bolt#project-status) from [**`v1.3.0`**](https://github.com/boltdb/bolt/releases/tag/v1.3.0) to [`coreos/bbolt`](https://github.com/coreos/bbolt/releases) [**`v1.3.1-coreos.6`**](https://github.com/coreos/bbolt/releases/tag/v1.3.1-coreos.6).
- Upgrade [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases) from [**`v1.2.1`**](https://github.com/grpc/grpc-go/releases/tag/v1.2.1) to [**`v1.7.5`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.5).
- Upgrade [`github.com/ugorji/go/codec`](https://github.com/ugorji/go) to [**`v1.1`**](https://github.com/ugorji/go/releases/tag/v1.1), and [regenerate v2 `client`](https://github.com/coreos/etcd/pull/8721).
- Upgrade [`github.com/ugorji/go/codec`](https://github.com/ugorji/go) to [**`ugorji/go@54210f4e0`**](https://github.com/ugorji/go/commit/54210f4e076c57f351166f0ed60e67d3fca57a36), and [regenerate v2 `client`](https://github.com/coreos/etcd/pull/8574).
- Upgrade [`github.com/grpc-ecosystem/grpc-gateway`](https://github.com/grpc-ecosystem/grpc-gateway/releases) from [**`v1.2.2`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.2.2) to [**`v1.3.0`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.3.0).
- Upgrade [`golang.org/x/crypto/bcrypt`](https://github.com/golang/crypto) to [**`golang/crypto@6c586e17d`**](https://github.com/golang/crypto/commit/6c586e17d90a7d08bbbc4069984180dce3b04117).
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#v3-3) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd --listen-metrics-urls`](https://github.com/coreos/etcd/pull/8242) flag for additional `/metrics` and `/health` endpoints.
- Useful for [bypassing critical APIs when monitoring etcd](https://github.com/coreos/etcd/issues/8060).
- Add [`etcd_server_version`](https://github.com/coreos/etcd/pull/8960) Prometheus metric.
- To replace [Kubernetes `etcd-version-monitor`](https://github.com/coreos/etcd/issues/8948).
- Add [`etcd_debugging_mvcc_db_compaction_keys_total`](https://github.com/coreos/etcd/pull/8280) Prometheus metric.
- Add [`etcd_debugging_server_lease_expired_total`](https://github.com/coreos/etcd/pull/8064) Prometheus metric.
- To improve [lease revoke monitoring](https://github.com/coreos/etcd/issues/8050).
- Document [Prometheus 2.0 rules](https://github.com/coreos/etcd/pull/8879).
- Initialize gRPC server [metrics with zero values](https://github.com/coreos/etcd/pull/8878).
- Fix [range/put/delete operation metrics](https://github.com/coreos/etcd/pull/8054) with transaction.
- `etcd_debugging_mvcc_range_total`
- `etcd_debugging_mvcc_put_total`
- `etcd_debugging_mvcc_delete_total`
- `etcd_debugging_mvcc_txn_total`
- Fix [`etcd_debugging_mvcc_keys_total`](https://github.com/coreos/etcd/pull/8390) on restore.
- Fix [`etcd_debugging_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/8120) on restore.
- Also change to [`prometheus.NewGaugeFunc`](https://github.com/coreos/etcd/pull/8150).
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- Add [CRL based connection rejection](https://github.com/coreos/etcd/pull/8124) to manage [revoked certs](https://github.com/coreos/etcd/issues/4034).
- Document [TLS authentication changes](https://github.com/coreos/etcd/pull/8895).
- [Server accepts connections if IP matches, without checking DNS entries](https://github.com/coreos/etcd/pull/8223). For instance, if peer cert contains IP addresses and DNS names in Subject Alternative Name (SAN) field, and the remote IP address matches one of those IP addresses, server just accepts connection without further checking the DNS names.
- [Server supports reverse-lookup on wildcard DNS `SAN`](https://github.com/coreos/etcd/pull/8281). For instance, if peer cert contains only DNS names (no IP addresses) in Subject Alternative Name (SAN) field, server first reverse-lookups the remote IP address to get a list of names mapping to that address (e.g. `nslookup IPADDR`). Then accepts the connection if those names have a matching name with peer cert's DNS names (either by exact or wildcard match). If none is matched, server forward-lookups each DNS entry in peer cert (e.g. look up `example.default.svc` when the entry is `*.example.default.svc`), and accepts connection only when the host's resolved addresses have the matching IP address with the peer's remote IP address.
- Add [`etcd --peer-cert-allowed-cn`](https://github.com/coreos/etcd/pull/8616) flag.
- To support [CommonName(CN) based auth](https://github.com/coreos/etcd/issues/8262) for inter peer connection.
- [Swap priority](https://github.com/coreos/etcd/pull/8594) of cert CommonName(CN) and username + password.
- To address ["username and password specified in the request should take priority over CN in the cert"](https://github.com/coreos/etcd/issues/8584).
- Protect [lease revoke with auth](https://github.com/coreos/etcd/pull/8031).
- Provide user's role on [auth permission error](https://github.com/coreos/etcd/pull/8164).
- Fix [auth store panic with disabled token](https://github.com/coreos/etcd/pull/8695).
### etcd server
- Add [`etcd --experimental-initial-corrupt-check`](https://github.com/coreos/etcd/pull/8554) flag to [check cluster database hashes before serving client/peer traffic](https://github.com/coreos/etcd/issues/8313).
- `etcd --experimental-initial-corrupt-check=false` by default.
- v3.4 will enable `--initial-corrupt-check=true` by default.
- Add [`etcd --experimental-corrupt-check-time`](https://github.com/coreos/etcd/pull/8420) flag to [raise corrupt alarm monitoring](https://github.com/coreos/etcd/issues/7125).
- `etcd --experimental-corrupt-check-time=0s` disabled by default.
- Add [`etcd --experimental-enable-v2v3`](https://github.com/coreos/etcd/pull/8407) flag to [emulate v2 API with v3](https://github.com/coreos/etcd/issues/6925).
- `etcd --experimental-enable-v2v3=false` by default.
- Add [`etcd --max-txn-ops`](https://github.com/coreos/etcd/pull/7976) flag to [configure maximum number operations in transaction](https://github.com/coreos/etcd/issues/7826).
- Add [`etcd --max-request-bytes`](https://github.com/coreos/etcd/pull/7968) flag to [configure maximum client request size](https://github.com/coreos/etcd/issues/7923).
- If not configured, it defaults to 1.5 MiB.
- Add [`etcd --client-crl-file`, `--peer-crl-file`](https://github.com/coreos/etcd/pull/8124) flags for [Certificate revocation list](https://github.com/coreos/etcd/issues/4034).
- Add [`etcd --peer-cert-allowed-cn`](https://github.com/coreos/etcd/pull/8616) flag to support [CN-based auth for inter-peer connection](https://github.com/coreos/etcd/issues/8262).
- Add [`etcd --listen-metrics-urls`](https://github.com/coreos/etcd/pull/8242) flag for additional `/metrics` and `/health` endpoints.
- Support [additional (non) TLS `/metrics` endpoints for a TLS-enabled cluster](https://github.com/coreos/etcd/pull/8282).
- e.g. `etcd --listen-metrics-urls=https://localhost:2378,http://localhost:9379` to serve `/metrics` and `/health` on secure port 2378 and insecure port 9379.
- Useful for [bypassing critical APIs when monitoring etcd](https://github.com/coreos/etcd/issues/8060).
- Add [`etcd --auto-compaction-mode`](https://github.com/coreos/etcd/pull/8123) flag to [support revision-based compaction](https://github.com/coreos/etcd/issues/8098).
- Change `etcd --auto-compaction-retention` flag to [accept string values](https://github.com/coreos/etcd/pull/8563) with [finer granularity](https://github.com/coreos/etcd/issues/8503).
- Now that `etcd --auto-compaction-retention` accepts string values, etcd configuration YAML file `auto-compaction-retention` field must be changed to `string` type.
- Previously, `etcd --config-file etcd.config.yaml` can have `auto-compaction-retention: 24` field, now must be `auto-compaction-retention: "24"` or `auto-compaction-retention: "24h"`.
- If configured as `--auto-compaction-mode periodic --auto-compaction-retention "24h"`, the time duration value for `etcd --auto-compaction-retention` flag must be valid for [`time.ParseDuration`](https://golang.org/pkg/time/#ParseDuration) function in Go.
- e.g. `etcd --auto-compaction-mode=revision --auto-compaction-retention=1000` automatically `Compact` on `"latest revision" - 1000` every 5-minute (when latest revision is 30000, compact on revision 29000).
- e.g. `etcd --auto-compaction-mode=periodic --auto-compaction-retention=72h` automatically `Compact` with 72-hour retention windown, for every 7.2-hour.
- e.g. `etcd --auto-compaction-mode=periodic --auto-compaction-retention=30m` automatically `Compact` with 30-minute retention windown, for every 3-minute.
- Periodic compactor continues to record latest revisions for every 1/10 of given compaction period (e.g. 1-hour when `etcd --auto-compaction-mode=periodic --auto-compaction-retention=10h`).
- For every 1/10 of given compaction period, compactor uses the last revision that was fetched before compaction period, to discard historical data.
- The retention window of compaction period moves for every 1/10 of given compaction period.
- For instance, when hourly writes are 100 and `--auto-compaction-retention=10`, v3.1 compacts revision 1000, 2000, and 3000 for every 10-hour, while v3.2.x, v3.3.0, v3.3.1, and v3.3.2 compact revision 1000, 1100, and 1200 for every 1-hour. Futhermore, when writes per minute are 1000, v3.3.0, v3.3.1, and v3.3.2 with `--auto-compaction-mode=periodic --auto-compaction-retention=30m` compact revision 30000, 33000, and 36000, for every 3-minute with more finer granularity.
- Whether compaction succeeds or not, this process repeats for every 1/10 of given compaction period. If compaction succeeds, it just removes compacted revision from historical revision records.
- Add [`etcd --grpc-keepalive-min-time`, `etcd --grpc-keepalive-interval`, `etcd --grpc-keepalive-timeout`](https://github.com/coreos/etcd/pull/8535) flags to configure server-side keepalive policies.
- Serve [`/health` endpoint as unhealthy](https://github.com/coreos/etcd/pull/8272) when [alarm (e.g. `NOSPACE`) is raised or there's no leader](https://github.com/coreos/etcd/issues/8207).
- Define [`etcdhttp.Health`](https://godoc.org/github.com/coreos/etcd/etcdserver/api/etcdhttp#Health) struct with JSON encoder.
- Note that `"health"` field is [`string` type, not `bool`](https://github.com/coreos/etcd/pull/9143).
- e.g. `{"health":"false"}`, `{"health":"true"}`
- [Remove `"errors"` field](https://github.com/coreos/etcd/pull/9162) since `v3.3.0-rc.3` (did exist only in `v3.3.0-rc.0`, `v3.3.0-rc.1`, `v3.3.0-rc.2`).
- Move [logging setup to embed package](https://github.com/coreos/etcd/pull/8810)
- Disable gRPC server info-level logs by default (can be enabled with `etcd --debug` flag).
- Use [monotonic time in Go 1.9](https://github.com/coreos/etcd/pull/8507) for `lease` package.
- Warn on [empty hosts in advertise URLs](https://github.com/coreos/etcd/pull/8384).
- Address [advertise client URLs accepts empty hosts](https://github.com/coreos/etcd/issues/8379).
- etcd v3.4 will exit on this error.
- e.g. `etcd --advertise-client-urls=http://:2379`.
- Warn on [shadowed environment variables](https://github.com/coreos/etcd/pull/8385).
- Address [error on shadowed environment variables](https://github.com/coreos/etcd/issues/8380).
- etcd v3.4 will exit on this error.
### API
- Support [ranges in transaction comparisons](https://github.com/coreos/etcd/pull/8025) for [disconnected linearized reads](https://github.com/coreos/etcd/issues/7924).
- Add [nested transactions](https://github.com/coreos/etcd/pull/8102) to extend [proxy use cases](https://github.com/coreos/etcd/issues/7857).
- Add [lease comparison target in transaction](https://github.com/coreos/etcd/pull/8324).
- Add [lease list](https://github.com/coreos/etcd/pull/8358).
- Add [hash by revision](https://github.com/coreos/etcd/pull/8263) for [better corruption checking against boltdb](https://github.com/coreos/etcd/issues/8016).
### client v3
- Add [health balancer](https://github.com/coreos/etcd/pull/8545) to fix [watch API hangs](https://github.com/coreos/etcd/issues/7247), improve [endpoint switch under network faults](https://github.com/coreos/etcd/issues/7941).
- [Refactor balancer](https://github.com/coreos/etcd/pull/8840) and add [client-side keepalive pings](https://github.com/coreos/etcd/pull/8199) to handle [network partitions](https://github.com/coreos/etcd/issues/8711).
- Add [`MaxCallSendMsgSize` and `MaxCallRecvMsgSize`](https://github.com/coreos/etcd/pull/9047) fields to [`clientv3.Config`](https://godoc.org/github.com/coreos/etcd/clientv3#Config).
- Fix [exceeded response size limit error in client-side](https://github.com/coreos/etcd/issues/9043).
- Address [kubernetes#51099](https://github.com/kubernetes/kubernetes/issues/51099).
- In previous versions(v3.2.10, v3.2.11), client response size was limited to only 4 MiB.
- `MaxCallSendMsgSize` default value is 2 MiB, if not configured.
- `MaxCallRecvMsgSize` default value is `math.MaxInt32`, if not configured.
- Accept [`Compare_LEASE`](https://github.com/coreos/etcd/pull/8324) in [`clientv3.Compare`](https://godoc.org/github.com/coreos/etcd/clientv3#Compare).
- Add [`LeaseValue` helper](https://github.com/coreos/etcd/pull/8488) to `Cmp` `LeaseID` values in `Txn`.
- Add [`MoveLeader`](https://github.com/coreos/etcd/pull/8153) to `Maintenance`.
- Add [`HashKV`](https://github.com/coreos/etcd/pull/8351) to `Maintenance`.
- Add [`Leases`](https://github.com/coreos/etcd/pull/8358) to `Lease`.
- Add [`clientv3/ordering`](https://github.com/coreos/etcd/pull/8092) for enforce [ordering in serialized requests](https://github.com/coreos/etcd/issues/7623).
- Fix ["put at-most-once" violation](https://github.com/coreos/etcd/pull/8335).
- Fix [`WatchResponse.Canceled`](https://github.com/coreos/etcd/pull/8283) on [compacted watch request](https://github.com/coreos/etcd/issues/8231).
- Fix [`concurrency/stm` `Put` with serializable snapshot](https://github.com/coreos/etcd/pull/8439).
- Use store revision from first fetch to resolve write conflicts instead of modified revision.
### etcdctl v3
- Add [`etcdctl --discovery-srv`](https://github.com/coreos/etcd/pull/8462) flag.
- Add [`etcdctl --keepalive-time`, `--keepalive-timeout`](https://github.com/coreos/etcd/pull/8663) flags.
- Add [`etcdctl lease list`](https://github.com/coreos/etcd/pull/8358) command.
- Add [`etcdctl lease keep-alive --once`](https://github.com/coreos/etcd/pull/8775) flag.
- Make [`lease timetolive LEASE_ID`](https://github.com/coreos/etcd/issues/9028) on expired lease print [`lease LEASE_ID already expired`](https://github.com/coreos/etcd/pull/9047).
- <=3.2 prints `lease LEASE_ID granted with TTL(0s), remaining(-1s)`.
- Add [`etcdctl snapshot restore --wal-dir`](https://github.com/coreos/etcd/pull/9124) flag.
- Add [`etcdctl defrag --data-dir`](https://github.com/coreos/etcd/pull/8367) flag.
- Add [`etcdctl move-leader`](https://github.com/coreos/etcd/pull/8153) command.
- Add [`etcdctl endpoint hashkv`](https://github.com/coreos/etcd/pull/8351) command.
- Add [`etcdctl endpoint --cluster`](https://github.com/coreos/etcd/pull/8143) flag, equivalent to [v2 `etcdctl cluster-health`](https://github.com/coreos/etcd/issues/8117).
- Make `etcdctl endpoint health` command terminate with [non-zero exit code on unhealthy status](https://github.com/coreos/etcd/pull/8342).
- Add [`etcdctl lock --ttl`](https://github.com/coreos/etcd/pull/8370) flag.
- Support [`etcdctl watch [key] [range_end] -- [exec-command…]`](https://github.com/coreos/etcd/pull/8919), equivalent to [v2 `etcdctl exec-watch`](https://github.com/coreos/etcd/issues/8814).
- Make `etcdctl watch -- [exec-command]` set environmental variables [`ETCD_WATCH_REVISION`, `ETCD_WATCH_EVENT_TYPE`, `ETCD_WATCH_KEY`, `ETCD_WATCH_VALUE`](https://github.com/coreos/etcd/pull/9142) for each event.
- Support [`etcdctl watch` with environmental variables `ETCDCTL_WATCH_KEY` and `ETCDCTL_WATCH_RANGE_END`](https://github.com/coreos/etcd/pull/9142).
- Enable [`clientv3.WithRequireLeader(context.Context)` for `watch`](https://github.com/coreos/etcd/pull/8672) command.
- Print [`"del"` instead of `"delete"`](https://github.com/coreos/etcd/pull/8297) in `txn` interactive mode.
- Print [`ETCD_INITIAL_ADVERTISE_PEER_URLS` in `member add`](https://github.com/coreos/etcd/pull/8332).
### etcdctl v3
- Handle [empty key permission](https://github.com/coreos/etcd/pull/8514) in `etcdctl`.
### etcdctl v2
- Add [`etcdctl backup --with-v3`](https://github.com/coreos/etcd/pull/8479) flag.
### gRPC Proxy
- Add [`grpc-proxy start --experimental-leasing-prefix`](https://github.com/coreos/etcd/pull/8341) flag.
- For disconnected linearized reads.
- Based on [V system leasing](https://github.com/coreos/etcd/issues/6065).
- See ["Disconnected consistent reads with etcd" blog post](https://coreos.com/blog/coreos-labs-disconnected-consistent-reads-with-etcd).
- Add [`grpc-proxy start --experimental-serializable-ordering`](https://github.com/coreos/etcd/pull/8315) flag.
- To ensure serializable reads have monotonically increasing store revisions across endpoints.
- Add [`grpc-proxy start --metrics-addr`](https://github.com/coreos/etcd/pull/8242) flag for an additional `/metrics` endpoint.
- Set `--metrics-addr=http://[HOST]:9379` to serve `/metrics` in insecure port 9379.
- Serve [`/health` endpoint in grpc-proxy](https://github.com/coreos/etcd/pull/8322).
- Add [`grpc-proxy start --debug`](https://github.com/coreos/etcd/pull/8994) flag.
- Add [`grpc-proxy start --max-send-bytes`](https://github.com/coreos/etcd/pull/9250) flag to [configure maximum client request size](https://github.com/coreos/etcd/issues/7923).
- Add [`grpc-proxy start --max-recv-bytes`](https://github.com/coreos/etcd/pull/9250) flag to [configure maximum client request size](https://github.com/coreos/etcd/issues/7923).
- Fix [Snapshot API error handling](https://github.com/coreos/etcd/commit/dbd16d52fbf81e5fd806d21ff5e9148d5bf203ab).
- Fix [KV API `PrevKv` flag handling](https://github.com/coreos/etcd/pull/8366).
- Fix [KV API `KeysOnly` flag handling](https://github.com/coreos/etcd/pull/8552).
### gRPC gateway
- Replace [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) endpoint `/v3alpha` with [`/v3beta`](https://github.com/coreos/etcd/pull/8880).
- To deprecate [`/v3alpha`](https://github.com/coreos/etcd/issues/8125) in v3.4.
- In v3.3, `curl -L http://localhost:2379/v3alpha/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` still works as a fallback to `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'`, but `curl -L http://localhost:2379/v3alpha/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` won't work in v3.4. Use `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- Support ["authorization" token](https://github.com/coreos/etcd/pull/7999).
- Support [websocket for bi-directional streams](https://github.com/coreos/etcd/pull/8257).
- Fix [`Watch` API with gRPC gateway](https://github.com/coreos/etcd/issues/8237).
- Upgrade gRPC gateway to [v1.3.0](https://github.com/coreos/etcd/issues/8838).
### etcd server
- Fix [backend database in-memory index corruption](https://github.com/coreos/etcd/pull/8127) issue on restore (only 3.2.0 is affected).
- Fix [watch restore from snapshot](https://github.com/coreos/etcd/pull/8427).
- Fix [`mvcc/backend.defragdb` nil-pointer dereference on create bucket failure](https://github.com/coreos/etcd/pull/9119).
- Fix [server crash](https://github.com/coreos/etcd/pull/8010) on [invalid transaction request from gRPC gateway](https://github.com/coreos/etcd/issues/7889).
- Prevent [server panic from member update/add](https://github.com/coreos/etcd/pull/9174) with [wrong scheme URLs](https://github.com/coreos/etcd/issues/9173).
- Make [peer dial timeout longer](https://github.com/coreos/etcd/pull/8599).
- See [coreos/etcd-operator#1300](https://github.com/coreos/etcd-operator/issues/1300) for more detail.
- Make server [wait up to request time-out](https://github.com/coreos/etcd/pull/8267) with [pending RPCs](https://github.com/coreos/etcd/issues/8224).
- Fix [`grpc.Server` panic on `GracefulStop`](https://github.com/coreos/etcd/pull/8987) with [TLS-enabled server](https://github.com/coreos/etcd/issues/8916).
- Fix ["multiple peer URLs cannot start" issue](https://github.com/coreos/etcd/issues/8383).
- Fix server-side auth so [concurrent auth operations do not return old revision error](https://github.com/coreos/etcd/pull/8442).
- Handle [WAL renaming failure on Windows](https://github.com/coreos/etcd/pull/8286).
- Upgrade [`coreos/go-systemd`](https://github.com/coreos/go-systemd/releases) to `v15` (see https://github.com/coreos/go-systemd/releases/tag/v15).
- [Put back `/v2/machines`](https://github.com/coreos/etcd/pull/8062) endpoint for python-etcd wrapper.
### client v2
- [Fail-over v2 client](https://github.com/coreos/etcd/pull/8519) to next endpoint on [oneshot failure](https://github.com/coreos/etcd/issues/8515).
### Package `raft`
- Add [non-voting member](https://github.com/coreos/etcd/pull/8751).
- To implement [Raft thesis 4.2.1 Catching up new servers](https://github.com/coreos/etcd/issues/8568).
- `Learner` node does not vote or promote itself.
### Other
- Support previous two minor versions (see our [new release policy](https://github.com/coreos/etcd/pull/8805)).
- `v3.3.x` is the last release cycle that supports `ACI`.
- [AppC was officially suspended](https://github.com/appc/spec#-disclaimer-), as of late 2016.
- [`acbuild`](https://github.com/containers/build#this-project-is-currently-unmaintained) is not maintained anymore.
- `*.aci` files won't be available from etcd v3.4 release.
- Add container registry [`gcr.io/etcd-development/etcd`](https://gcr.io/etcd-development/etcd).
- [quay.io/coreos/etcd](https://quay.io/coreos/etcd) is still supported as secondary.
### Go
- Require [*Go 1.9+*](https://github.com/coreos/etcd/issues/6174).
- Compile with [*Go 1.9.3*](https://golang.org/doc/devel/release.html#go1.9).
- Deprecate [`golang.org/x/net/context`](https://github.com/coreos/etcd/pull/8511).

View File

@ -1,424 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.3](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.3.md).
## v3.4.0 (TBD 2018-09)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.0...v3.4.0) and [v3.4 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_4.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.4 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_4.md).**
### Improved
- Rewrite [client balancer](https://github.com/coreos/etcd/pull/9860) with [new gRPC balancer interface](https://github.com/coreos/etcd/issues/9106).
- Add [backoff on watch retries on transient errors](https://github.com/coreos/etcd/pull/9840).
- Add [jitter to watch progress notify](https://github.com/coreos/etcd/pull/9278) to prevent [spikes in `etcd_network_client_grpc_sent_bytes_total`](https://github.com/coreos/etcd/issues/9246).
- Improve [slow request apply warning log](https://github.com/coreos/etcd/pull/9288).
- e.g. `read-only range request "key:\"/a\" range_end:\"/b\" " with result "range_response_count:3 size:96" took too long (97.966µs) to execute`.
- Redact [request value field](https://github.com/coreos/etcd/pull/9822).
- Provide [response size](https://github.com/coreos/etcd/pull/9826).
- Improve [TLS setup error logging](https://github.com/coreos/etcd/pull/9518) to help debug [TLS-enabled cluster configuring issues](https://github.com/coreos/etcd/issues/9400).
- Improve [long-running concurrent read transactions under light write workloads](https://github.com/coreos/etcd/pull/9296).
- Previously, periodic commit on pending writes blocks incoming read transactions, even if there is no pending write.
- Now, periodic commit operation does not block concurrent read transactions, thus improves long-running read transaction performance.
- Improve [Raft Read Index timeout warning messages](https://github.com/coreos/etcd/pull/9897).
- Adjust [election timeout on server restart](https://github.com/coreos/etcd/pull/9415) to reduce [disruptive rejoining servers](https://github.com/coreos/etcd/issues/9333).
- Previously, etcd fast-forwards election ticks on server start, with only one tick left for leader election. This is to speed up start phase, without having to wait until all election ticks elapse. Advancing election ticks is useful for cross datacenter deployments with larger election timeouts. However, it was affecting cluster availability if the last tick elapses before leader contacts the restarted node.
- Now, when etcd restarts, it adjusts election ticks with more than one tick left, thus more time for leader to prevent disruptive restart.
- Add [Raft Pre-Vote feature](https://github.com/coreos/etcd/pull/9352) to reduce [disruptive rejoining servers](https://github.com/coreos/etcd/issues/9333).
- For instance, a flaky(or rejoining) member may drop in and out, and start campaign. This member will end up with a higher term, and ignore all incoming messages with lower term. In this case, a new leader eventually need to get elected, thus disruptive to cluster availability. Raft implements Pre-Vote phase to prevent this kind of disruptions. If enabled, Raft runs an additional phase of election to check if pre-candidate can get enough votes to win an election.
- Adjust [periodic compaction retention window](https://github.com/coreos/etcd/pull/9485).
- e.g. `etcd --auto-compaction-mode=revision --auto-compaction-retention=1000` automatically `Compact` on `"latest revision" - 1000` every 5-minute (when latest revision is 30000, compact on revision 29000).
- e.g. Previously, `etcd --auto-compaction-mode=periodic --auto-compaction-retention=24h` automatically `Compact` with 24-hour retention windown for every 2.4-hour. Now, `Compact` happens for every 1-hour.
- e.g. Previously, `etcd --auto-compaction-mode=periodic --auto-compaction-retention=30m` automatically `Compact` with 30-minute retention windown for every 3-minute. Now, `Compact` happens for every 30-minute.
- Periodic compactor keeps recording latest revisions for every compaction period when given period is less than 1-hour, or for every 1-hour when given compaction period is greater than 1-hour (e.g. 1-hour when `etcd --auto-compaction-mode=periodic --auto-compaction-retention=24h`).
- For every compaction period or 1-hour, compactor uses the last revision that was fetched before compaction period, to discard historical data.
- The retention window of compaction period moves for every given compaction period or hour.
- For instance, when hourly writes are 100 and `etcd --auto-compaction-mode=periodic --auto-compaction-retention=24h`, `v3.2.x`, `v3.3.0`, `v3.3.1`, and `v3.3.2` compact revision 2400, 2640, and 2880 for every 2.4-hour, while `v3.3.3` *or later* compacts revision 2400, 2500, 2600 for every 1-hour.
- Futhermore, when `etcd --auto-compaction-mode=periodic --auto-compaction-retention=30m` and writes per minute are about 1000, `v3.3.0`, `v3.3.1`, and `v3.3.2` compact revision 30000, 33000, and 36000, for every 3-minute, while `v3.3.3` *or later* compacts revision 30000, 60000, and 90000, for every 30-minute.
- Improve [lease expire/revoke operation performance](https://github.com/coreos/etcd/pull/9418), address [lease scalability issue](https://github.com/coreos/etcd/issues/9496).
- Make [Lease `Lookup` non-blocking with concurrent `Grant`/`Revoke`](https://github.com/coreos/etcd/pull/9229).
- Make etcd server return `raft.ErrProposalDropped` on internal Raft proposal drop in [v3 applier](https://github.com/coreos/etcd/pull/9549) and [v2 applier](https://github.com/coreos/etcd/pull/9558).
- e.g. a node is removed from cluster, or [`raftpb.MsgProp` arrives at current leader while there is an ongoing leadership transfer](https://github.com/coreos/etcd/issues/8975).
- Add [`snapshot`](https://github.com/coreos/etcd/pull/9118) package for easier snapshot workflow (see [`godoc.org/github.com/etcd/clientv3/snapshot`](https://godoc.org/github.com/coreos/etcd/clientv3/snapshot) for more).
- Improve [functional tester](https://github.com/coreos/etcd/tree/master/functional) coverage: [proxy layer to run network fault tests in CI](https://github.com/coreos/etcd/pull/9081), [TLS is enabled both for server and client](https://github.com/coreos/etcd/pull/9534), [liveness mode](https://github.com/coreos/etcd/issues/9230), [shuffle test sequence](https://github.com/coreos/etcd/issues/9381), [membership reconfiguration failure cases](https://github.com/coreos/etcd/pull/9564), [disastrous quorum loss and snapshot recover from a seed member](https://github.com/coreos/etcd/pull/9565), [embedded etcd](https://github.com/coreos/etcd/pull/9572).
- Improve [index compaction blocking](https://github.com/coreos/etcd/pull/9511) by using a copy on write clone to avoid holding the lock for the traversal of the entire index.
- Update [JWT methods](https://github.com/coreos/etcd/pull/9883) to allow for use of any supported signature method/algorithm.
- Add [Lease checkpointing](https://github.com/coreos/etcd/pull/9924) to persist remaining TTLs to the consensus log periodically so that long lived leases progress toward expiry in the presence of leader elections and server restarts.
### Breaking Changes
- Make [`ETCDCTL_API=3 etcdctl` default](https://github.com/coreos/etcd/issues/9600).
- Now, `etcdctl set foo bar` must be `ETCDCTL_API=2 etcdctl set foo bar`.
- Now, `ETCDCTL_API=3 etcdctl put foo bar` could be just `etcdctl put foo bar`.
- **Remove `etcd --ca-file` flag**, instead [use `etcd --trusted-ca-file`](https://github.com/coreos/etcd/pull/9470) (`etcd --ca-file` flag has been marked deprecated since v2.1).
- **Remove `etcd --peer-ca-file` flag**, instead [use `etcd --peer-trusted-ca-file`](https://github.com/coreos/etcd/pull/9470) (`etcd --peer-ca-file` flag has been marked deprecated since v2.1).
- **Remove `pkg/transport.TLSInfo.CAFile` field**, instead [use `pkg/transport.TLSInfo.TrustedCAFile`](https://github.com/coreos/etcd/pull/9470) (`CAFile` field has been marked deprecated since v2.1).
- Deprecate `latest` [release container](https://console.cloud.google.com/gcr/images/etcd-development/GLOBAL/etcd) tag.
- **`docker pull gcr.io/etcd-development/etcd:latest` would not be up-to-date**.
- Deprecate [minor](https://semver.org/) version [release container](https://console.cloud.google.com/gcr/images/etcd-development/GLOBAL/etcd) tags.
- `docker pull gcr.io/etcd-development/etcd:v3.3` would still work.
- **`docker pull gcr.io/etcd-development/etcd:v3.4` would not work**.
- Use **`docker pull gcr.io/etcd-development/etcd:v3.4.x`** instead, with the exact patch version.
- Drop [ACIs from official release](https://github.com/coreos/etcd/pull/9059).
- [AppC was officially suspended](https://github.com/appc/spec#-disclaimer-), as of late 2016.
- [`acbuild`](https://github.com/containers/build#this-project-is-currently-unmaintained) is not maintained anymore.
- `*.aci` files are not available from `v3.4` release.
- Exit on [empty hosts in advertise URLs](https://github.com/coreos/etcd/pull/8786).
- Address [advertise client URLs accepts empty hosts](https://github.com/coreos/etcd/issues/8379).
- e.g. exit with error on `--advertise-client-urls=http://:2379`.
- e.g. exit with error on `--initial-advertise-peer-urls=http://:2380`.
- Exit on [shadowed environment variables](https://github.com/coreos/etcd/pull/9382).
- Address [error on shadowed environment variables](https://github.com/coreos/etcd/issues/8380).
- e.g. exit with error on `ETCD_NAME=abc etcd --name=def`.
- e.g. exit with error on `ETCD_INITIAL_CLUSTER_TOKEN=abc etcd --initial-cluster-token=def`.
- e.g. exit with error on `ETCDCTL_ENDPOINTS=abc.com ETCDCTL_API=3 etcdctl endpoint health --endpoints=def.com`.
- Change [`etcdserverpb.AuthRoleRevokePermissionRequest/key,range_end` fields type from `string` to `bytes`](https://github.com/coreos/etcd/pull/9433).
- Rename [`etcd_debugging_mvcc_db_total_size_in_bytes` Prometheus metric to `etcd_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819).
- Rename `etcdserver.ServerConfig.SnapCount` field to `etcdserver.ServerConfig.SnapshotCount`, to be consistent with the flag name `etcd --snapshot-count`.
- Rename `embed.Config.SnapCount` field to [`embed.Config.SnapshotCount`](https://github.com/coreos/etcd/pull/9745), to be consistent with the flag name `etcd --snapshot-count`.
- Change [`embed.Config.CorsInfo` in `*cors.CORSInfo` type to `embed.Config.CORS` in `map[string]struct{}` type](https://github.com/coreos/etcd/pull/9490).
- Remove [`embed.Config.SetupLogging`](https://github.com/coreos/etcd/pull/9572).
- Now logger is set up automatically based on [`embed.Config.Logger`, `embed.Config.LogOutputs`, `embed.Config.Debug` fields](https://github.com/coreos/etcd/pull/9572).
- Rename [`etcd --log-output` to `etcd --log-outputs`](https://github.com/coreos/etcd/pull/9624) to support multiple log outputs.
- **`etcd --log-output`** will be deprecated in v3.5.
- Rename [**`embed.Config.LogOutput`** to **`embed.Config.LogOutputs`**](https://github.com/coreos/etcd/pull/9624) to support multiple log outputs.
- Change [**`embed.Config.LogOutputs`** type from `string` to `[]string`](https://github.com/coreos/etcd/pull/9579) to support multiple log outputs.
- Now that `etcd --log-outputs` accepts multiple writers, etcd configuration YAML file `log-outputs` field must be changed to `[]string` type.
- Previously, `etcd --config-file etcd.config.yaml` can have `log-outputs: default` field, now must be `log-outputs: [default]`.
- Change v3 `etcdctl snapshot` exit codes with [`snapshot` package](https://github.com/coreos/etcd/pull/9118/commits/df689f4280e1cce4b9d61300be13ca604d41670a).
- Exit on error with exit code 1 (no more exit code 5 or 6 on `snapshot save/restore` commands).
- Migrate dependency management tool from `glide` to [`golang/dep`](https://github.com/coreos/etcd/pull/9155).
- <= 3.3 puts `vendor` directory under `cmd/vendor` directory to [prevent conflicting transitive dependencies](https://github.com/coreos/etcd/issues/4913).
- 3.4 moves `cmd/vendor` directory to `vendor` at repository root.
- Remove recursive symlinks in `cmd` directory.
- Now `go get/install/build` on `etcd` packages (e.g. `clientv3`, `tools/benchmark`) enforce builds with etcd `vendor` directory.
- Replace [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) endpoint `/v3beta` with [`/v3`](https://github.com/coreos/etcd/pull/9298).
- Deprecated [`/v3alpha`](https://github.com/coreos/etcd/pull/9298).
- To deprecate [`/v3beta`](https://github.com/coreos/etcd/issues/9189) in v3.5.
- In v3.4, `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` still works as a fallback to `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'`, but `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` won't work in v3.5. Use `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- Change [`wal` package function signatures](https://github.com/coreos/etcd/pull/9572) to support [structured logger and logging to file](https://github.com/coreos/etcd/issues/9438) in server-side.
- Previously, `Open(dirpath string, snap walpb.Snapshot) (*WAL, error)`, now `Open(lg *zap.Logger, dirpath string, snap walpb.Snapshot) (*WAL, error)`.
- Previously, `OpenForRead(dirpath string, snap walpb.Snapshot) (*WAL, error)`, now `OpenForRead(lg *zap.Logger, dirpath string, snap walpb.Snapshot) (*WAL, error)`.
- Previously, `Repair(dirpath string) bool`, now `Repair(lg *zap.Logger, dirpath string) bool`.
- Previously, `Create(dirpath string, metadata []byte) (*WAL, error)`, now `Create(lg *zap.Logger, dirpath string, metadata []byte) (*WAL, error)`.
- Remove [`pkg/cors` package](https://github.com/coreos/etcd/pull/9490).
- Change [`etcd --experimental-enable-v2v3`](TODO) flag to `etcd --enable-v2v3`; v2 storage emulation is now stable.
- Move internal packages to `etcdserver`.
- `"github.com/coreos/etcd/alarm"` to `"github.com/coreos/etcd/etcdserver/api/v3alarm"`.
- `"github.com/coreos/etcd/compactor"` to `"github.com/coreos/etcd/etcdserver/api/v3compactor"`.
- `"github.com/coreos/etcd/discovery"` to `"github.com/coreos/etcd/etcdserver/api/v2discovery"`.
- `"github.com/coreos/etcd/etcdserver/auth"` to `"github.com/coreos/etcd/etcdserver/api/v2auth"`.
- `"github.com/coreos/etcd/etcdserver/membership"` to `"github.com/coreos/etcd/etcdserver/api/membership"`.
- `"github.com/coreos/etcd/etcdserver/stats"` to `"github.com/coreos/etcd/etcdserver/api/v2stats"`.
- `"github.com/coreos/etcd/error"` to `"github.com/coreos/etcd/etcdserver/api/v2error"`.
- `"github.com/coreos/etcd/rafthttp"` to `"github.com/coreos/etcd/etcdserver/api/rafthttp"`.
- `"github.com/coreos/etcd/snap"` to `"github.com/coreos/etcd/etcdserver/api/snap"`.
- `"github.com/coreos/etcd/store"` to `"github.com/coreos/etcd/etcdserver/api/v2store"`.
- Change [snapshot file permissions](https://github.com/coreos/etcd/pull/9977): On Linux, the snapshot file changes from readable by all (mode 0644) to readable by the user only (mode 0600).
### Dependency
- Upgrade [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases) from [**`v1.7.5`**](https://github.com/grpc/grpc-go/releases/tag/v1.7.5) to [**`v1.13.0`**](https://github.com/grpc/grpc-go/releases/tag/v1.13.0).
- Upgrade [`github.com/golang/protobuf`](https://github.com/golang/protobuf/releases) from [**`golang/protobuf@1e59b77b5`**](https://github.com/golang/protobuf/commit/1e59b77b52bf8e4b449a57e6f79f21226d571845) to [**`v1.1.0`**](https://github.com/golang/protobuf/releases/tag/v1.1.0).
- Upgrade [`golang.org/x/crypto`](https://github.com/golang/crypto) from [**`crypto@9419663f5`**](https://github.com/golang/crypto/commit/9419663f5a44be8b34ca85f08abc5fe1be11f8a3) to [**`crypto@8ac0e0d97`**](https://github.com/golang/crypto/commit/8ac0e0d97ce45cd83d1d7243c060cb8461dda5e9).
- Upgrade [`golang.org/x/net`](https://github.com/golang/net) from [**`net@66aacef3d`**](https://github.com/golang/net/commit/66aacef3dd8a676686c7ae3716979581e8b03c47) to [**`net@db08ff08e`**](https://github.com/golang/net/commit/db08ff08e8622530d9ed3a0e8ac279f6d4c02196).
- Upgrade [`golang.org/x/sys`](https://github.com/golang/sys) from [**`sys@ebfc5b463`**](https://github.com/golang/sys/commit/ebfc5b4631820b793c9010c87fd8fef0f39eb082) to [**`sys@56ede360e`**](https://github.com/golang/sys/commit/56ede360ec1c541828fb88741b3f1049406d28f5).
- Upgrade [`golang.org/x/text`](https://github.com/golang/text) from [**`text@b19bf474d`**](https://github.com/golang/text/commit/b19bf474d317b857955b12035d2c5acb57ce8b01) to [**`text@f21a4dfb5`**](https://github.com/golang/text/commit/f21a4dfb5e38f5895301dc265a8def02365cc3d0).
- Upgrade [`golang.org/x/time`](https://github.com/golang/time) from [**`time@c06e80d93`**](https://github.com/golang/time/commit/c06e80d9300e4443158a03817b8a8cb37d230320) to [**`time@fbb02b229`**](https://github.com/golang/time/commit/fbb02b2291d28baffd63558aa44b4b56f178d650).
- Upgrade [`github.com/golang/protobuf`](https://github.com/golang/protobuf/releases) from [**`golang/protobuf@1e59b77b5`**](https://github.com/golang/protobuf/commit/1e59b77b52bf8e4b449a57e6f79f21226d571845) to [**`v1.1.0`**](https://github.com/golang/protobuf/releases/tag/v1.1.0).
- Upgrade [`gopkg.in/yaml.v2`](https://github.com/go-yaml/yaml/releases) from [**`yaml@cd8b52f82`**](https://github.com/go-yaml/yaml/commit/cd8b52f8269e0feb286dfeef29f8fe4d5b397e0b) to [**`yaml@5420a8b67`**](https://github.com/go-yaml/yaml/commit/5420a8b6744d3b0345ab293f6fcba19c978f1183).
- Upgrade [`github.com/dgrijalva/jwt-go`](https://github.com/dgrijalva/jwt-go/releases) from [**`v3.0.0`**](https://github.com/dgrijalva/jwt-go/releases/tag/v3.0.0) to [**`v3.2.0`**](https://github.com/dgrijalva/jwt-go/releases/tag/v3.2.0).
- Upgrade [`github.com/ugorji/go/codec`](https://github.com/ugorji/go/releases) to [**`v1.1.1`**](https://github.com/ugorji/go/releases/tag/v1.1.1), and [regenerate v2 `client`](https://github.com/coreos/etcd/pull/9494).
- Upgrade [`github.com/soheilhy/cmux`](https://github.com/soheilhy/cmux/releases) from [**`v0.1.3`**](https://github.com/soheilhy/cmux/releases/tag/v0.1.3) to [**`v0.1.4`**](https://github.com/soheilhy/cmux/releases/tag/v0.1.4).
- Upgrade [`github.com/google/btree`](https://github.com/google/btree/releases) from [**`google/btree@925471ac9`**](https://github.com/google/btree/commit/925471ac9e2131377a91e1595defec898166fe49) to [**`google/btree@e89373fe6`**](https://github.com/google/btree/commit/e89373fe6b4a7413d7acd6da1725b83ef713e6e4).
- Upgrade [`github.com/spf13/cobra`](https://github.com/spf13/cobra/releases) from [**`spf13/cobra@1c44ec8d3`**](https://github.com/spf13/cobra/commit/1c44ec8d3f1552cac48999f9306da23c4d8a288b) to [**`v0.0.3`**](https://github.com/spf13/cobra/releases/tag/v0.0.3).
- Upgrade [`github.com/spf13/pflag`](https://github.com/spf13/pflag/releases) from [**`v1.0.0`**](https://github.com/spf13/pflag/releases/tag/v1.0.0) to [**`spf13/pflag@1ce0cc6db`**](https://github.com/spf13/pflag/commit/1ce0cc6db4029d97571db82f85092fccedb572ce).
- Upgrade [`github.com/coreos/go-systemd`](https://github.com/coreos/go-systemd/releases) from [**`v15`**](https://github.com/coreos/go-systemd/releases/tag/v15) to [**`v17`**](https://github.com/coreos/go-systemd/releases/tag/v17).
- Upgrade [`github.com/prometheus/client_golang`](https://github.com/prometheus/client_golang/releases) from [**``prometheus/client_golang@5cec1d042``**](https://github.com/prometheus/client_golang/commit/5cec1d0429b02e4323e042eb04dafdb079ddf568) to [**`v0.8.0`**](https://github.com/prometheus/client_golang/releases/tag/v0.8.0).
- Upgrade [`github.com/grpc-ecosystem/go-grpc-prometheus`](https://github.com/grpc-ecosystem/go-grpc-prometheus/releases) from [**``grpc-ecosystem/go-grpc-prometheus@0dafe0d49``**](https://github.com/grpc-ecosystem/go-grpc-prometheus/commit/0dafe0d496ea71181bf2dd039e7e3f44b6bd11a7) to [**`v1.2.0`**](https://github.com/grpc-ecosystem/go-grpc-prometheus/releases/tag/v1.2.0).
- Upgrade [`github.com/grpc-ecosystem/grpc-gateway`](https://github.com/grpc-ecosystem/grpc-gateway/releases) from [**`v1.3.1`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.3.1) to [**`v1.4.1`**](https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.4.1).
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#latest) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_network_active_peers`](https://github.com/coreos/etcd/pull/9762) Prometheus metric.
- Let's say `"7339c4e5e833c029"` server `/metrics` returns `etcd_network_active_peers{Local="7339c4e5e833c029",Remote="729934363faa4a24"} 1` and `etcd_network_active_peers{Local="7339c4e5e833c029",Remote="b548c2511513015"} 1`. This indicates that the local node `"7339c4e5e833c029"` currently has two active remote peers `"729934363faa4a24"` and `"b548c2511513015"` in a 3-node cluster. If the node `"b548c2511513015"` is down, the local node `"7339c4e5e833c029"` will show `etcd_network_active_peers{Local="7339c4e5e833c029",Remote="729934363faa4a24"} 1` and `etcd_network_active_peers{Local="7339c4e5e833c029",Remote="b548c2511513015"} 0`.
- Add [`etcd_network_disconnected_peers_total`](https://github.com/coreos/etcd/pull/9762) Prometheus metric.
- If a remote peer `"b548c2511513015"` is down, the local node `"7339c4e5e833c029"` server `/metrics` would return `etcd_network_disconnected_peers_total{Local="7339c4e5e833c029",Remote="b548c2511513015"} 1`, while active peer metrics will show `etcd_network_active_peers{Local="7339c4e5e833c029",Remote="729934363faa4a24"} 1` and `etcd_network_active_peers{Local="7339c4e5e833c029",Remote="b548c2511513015"} 0`.
- Add [`etcd_network_server_stream_failures_total`](https://github.com/coreos/etcd/pull/9760) Prometheus metric.
- e.g. `etcd_network_server_stream_failures_total{API="lease-keepalive",Type="receive"} 1`
- e.g. `etcd_network_server_stream_failures_total{API="watch",Type="receive"} 1`
- Increase [`etcd_network_peer_round_trip_time_seconds`](https://github.com/coreos/etcd/pull/9762) Prometheus metric histogram upper-bound.
- Previously, highest bucket only collects requests taking 0.8192 seconds or more.
- Now, highest buckets collect 0.8192 seconds, 1.6384 seconds, and 3.2768 seconds or more.
- Add [`etcd_server_is_leader`](https://github.com/coreos/etcd/pull/9587) Prometheus metric.
- Add [`etcd_server_version`](https://github.com/coreos/etcd/pull/8960) Prometheus metric.
- To replace [Kubernetes `etcd-version-monitor`](https://github.com/coreos/etcd/issues/8948).
- Add [`etcd_server_go_version`](https://github.com/coreos/etcd/pull/9957) Prometheus metric.
- Add [`etcd_server_heartbeat_send_failures_total`](https://github.com/coreos/etcd/pull/9761) Prometheus metric.
- Add [`etcd_server_slow_apply_total`](https://github.com/coreos/etcd/pull/9761) Prometheus metric.
- Add [`etcd_server_slow_read_indexes_total`](https://github.com/coreos/etcd/pull/9897) Prometheus metric.
- Add [`etcd_server_quota_backend_bytes`](https://github.com/coreos/etcd/pull/9820) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
- Add [`etcd_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819) Prometheus metric.
- Renamed from [`etcd_debugging_mvcc_db_total_size_in_bytes`](https://github.com/coreos/etcd/pull/9819).
- Add [`etcd_mvcc_db_total_size_in_use_in_bytes`](https://github.com/coreos/etcd/pull/9256) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
- Add [`etcd_snap_fsync_duration_seconds`](https://github.com/coreos/etcd/pull/9762) Prometheus metric.
- Add [`etcd_disk_backend_defrag_duration_seconds`](https://github.com/coreos/etcd/pull/9761) Prometheus metric.
- Add [`etcd_mvcc_hash_duration_seconds`](https://github.com/coreos/etcd/pull/9761) Prometheus metric.
- Add [`etcd_mvcc_hash_rev_duration_seconds`](https://github.com/coreos/etcd/pull/9761) Prometheus metric.
- Add [`etcd_debugging_disk_backend_commit_rebalance_duration_seconds`](https://github.com/coreos/etcd/pull/9834) Prometheus metric.
- Add [`etcd_debugging_disk_backend_commit_spill_duration_seconds`](https://github.com/coreos/etcd/pull/9834) Prometheus metric.
- Add [`etcd_debugging_disk_backend_commit_write_duration_seconds`](https://github.com/coreos/etcd/pull/9834) Prometheus metric.
- Add [`etcd_debugging_lease_granted_total`](https://github.com/coreos/etcd/pull/9778) Prometheus metric.
- Add [`etcd_debugging_lease_revoked_total`](https://github.com/coreos/etcd/pull/9778) Prometheus metric.
- Add [`etcd_debugging_lease_renewed_total`](https://github.com/coreos/etcd/pull/9778) Prometheus metric.
- Add [`etcd_debugging_lease_ttl_total`](https://github.com/coreos/etcd/pull/9778) Prometheus metric.
- Increase [`etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds`](https://github.com/coreos/etcd/pull/9762) Prometheus metric histogram upper-bound.
- Previously, highest bucket only collects requests taking 1.024 seconds or more.
- Now, highest buckets collect 1.024 seconds, 2.048 seconds, and 4.096 seconds or more.
- Fix missing [`etcd_network_peer_sent_failures_total`](https://github.com/coreos/etcd/pull/9437) Prometheus metric count.
- Fix [`etcd_debugging_server_lease_expired_total`](https://github.com/coreos/etcd/pull/9557) Prometheus metric.
- Fix [race conditions in v2 server stat collecting](https://github.com/coreos/etcd/pull/9562).
### Security, Authentication
See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md) for more details.
- Support TLS cipher suite whitelisting.
- To block [weak cipher suites](https://github.com/coreos/etcd/issues/8320).
- TLS handshake fails when client hello is requested with invalid cipher suites.
- Add [`etcd --cipher-suites`](https://github.com/coreos/etcd/pull/9801) flag.
- If empty, Go auto-populates the list.
- Add [`etcd --host-whitelist`](https://github.com/coreos/etcd/pull/9372) flag, [`etcdserver.Config.HostWhitelist`](https://github.com/coreos/etcd/pull/9372), and [`embed.Config.HostWhitelist`](https://github.com/coreos/etcd/pull/9372), to prevent ["DNS Rebinding"](https://en.wikipedia.org/wiki/DNS_rebinding) attack.
- Any website can simply create an authorized DNS name, and direct DNS to `"localhost"` (or any other address). Then, all HTTP endpoints of etcd server listening on `"localhost"` becomes accessible, thus vulnerable to [DNS rebinding attacks (CVE-2018-5702)](https://bugs.chromium.org/p/project-zero/issues/detail?id=1447#c2).
- Client origin enforce policy works as follow:
- If client connection is secure via HTTPS, allow any hostnames..
- If client connection is not secure and `"HostWhitelist"` is not empty, only allow HTTP requests whose Host field is listed in whitelist.
- By default, `"HostWhitelist"` is `"*"`, which means insecure server allows all client HTTP requests.
- Note that the client origin policy is enforced whether authentication is enabled or not, for tighter controls.
- When specifying hostnames, loopback addresses are not added automatically. To allow loopback interfaces, add them to whitelist manually (e.g. `"localhost"`, `"127.0.0.1"`, etc.).
- e.g. `etcd --host-whitelist example.com`, then the server will reject all HTTP requests whose Host field is not `example.com` (also rejects requests to `"localhost"`).
- Support [`etcd --cors`](https://github.com/coreos/etcd/pull/9490) in v3 HTTP requests (gRPC gateway).
- Support [`ttl` field for `etcd` Authentication JWT token](https://github.com/coreos/etcd/pull/8302).
- e.g. `etcd --auth-token jwt,pub-key=<pub key path>,priv-key=<priv key path>,sign-method=<sign method>,ttl=5m`.
- Allow empty token provider in [`etcdserver.ServerConfig.AuthToken`](https://github.com/coreos/etcd/pull/9369).
- Fix [TLS reload](https://github.com/coreos/etcd/pull/9570) when [certificate SAN field only includes IP addresses but no domain names](https://github.com/coreos/etcd/issues/9541).
- In Go, server calls `(*tls.Config).GetCertificate` for TLS reload if and only if server's `(*tls.Config).Certificates` field is not empty, or `(*tls.ClientHelloInfo).ServerName` is not empty with a valid SNI from the client. Previously, etcd always populates `(*tls.Config).Certificates` on the initial client TLS handshake, as non-empty. Thus, client was always expected to supply a matching SNI in order to pass the TLS verification and to trigger `(*tls.Config).GetCertificate` to reload TLS assets.
- However, a certificate whose SAN field does [not include any domain names but only IP addresses](https://github.com/coreos/etcd/issues/9541) would request `*tls.ClientHelloInfo` with an empty `ServerName` field, thus failing to trigger the TLS reload on initial TLS handshake; this becomes a problem when expired certificates need to be replaced online.
- Now, `(*tls.Config).Certificates` is created empty on initial TLS client handshake, first to trigger `(*tls.Config).GetCertificate`, and then to populate rest of the certificates on every new TLS connection, even when client SNI is empty (e.g. cert only includes IPs).
### etcd server
- Add [`etcd --initial-election-tick-advance`](https://github.com/coreos/etcd/pull/9591) flag to configure initial election tick fast-forward.
- By default, `etcd --initial-election-tick-advance=true`, then local member fast-forwards election ticks to speed up "initial" leader election trigger.
- This benefits the case of larger election ticks. For instance, cross datacenter deployment may require longer election timeout of 10-second. If true, local node does not need wait up to 10-second. Instead, forwards its election ticks to 8-second, and have only 2-second left before leader election.
- Major assumptions are that: cluster has no active leader thus advancing ticks enables faster leader election. Or cluster already has an established leader, and rejoining follower is likely to receive heartbeats from the leader after tick advance and before election timeout.
- However, when network from leader to rejoining follower is congested, and the follower does not receive leader heartbeat within left election ticks, disruptive election has to happen thus affecting cluster availabilities.
- Now, this can be disabled by setting `etcd --initial-election-tick-advance=false`.
- Disabling this would slow down initial bootstrap process for cross datacenter deployments. Make tradeoffs by configuring `etcd --initial-election-tick-advance` at the cost of slow initial bootstrap.
- If single-node, it advances ticks regardless.
- Address [disruptive rejoining follower node](https://github.com/coreos/etcd/issues/9333).
- Add [`etcd --pre-vote`](https://github.com/coreos/etcd/pull/9352) flag to enable to run an additional Raft election phase.
- For instance, a flaky(or rejoining) member may drop in and out, and start campaign. This member will end up with a higher term, and ignore all incoming messages with lower term. In this case, a new leader eventually need to get elected, thus disruptive to cluster availability. Raft implements Pre-Vote phase to prevent this kind of disruptions. If enabled, Raft runs an additional phase of election to check if pre-candidate can get enough votes to win an election.
- `etcd --pre-vote=false` by default.
- v3.5 will enable `etcd --pre-vote=true` by default.
- [`etcd --initial-corrupt-check`](TODO) flag is now stable (`etcd --experimental-initial-corrupt-check`haisbeen deprecated).
- `etcd --initial-corrupt-check=true` by default, to check cluster database hashes before serving client/peer traffic.
- [`etcd --corrupt-check-time`](TODO) flag is now stable (`etcd --experimental-corrupt-check-time`haisbeen deprecated).
- `etcd --corrupt-check-time=12h` by default, to check cluster database hashes for every 12-hour.
- [`etcd --enable-v2v3`](TODO) flag is now stable.
- `etcd --experimental-enable-v2v3` has been deprecated.
- Added [more v2v3 integration tests](https://github.com/coreos/etcd/pull/9634).
- `etcd --enable-v2=true --enable-v2v3=''` by default, to enable v2 API server that is backed by **v2 store**.
- `etcd --enable-v2=true --enable-v2v3=/aaa` to enable v2 API server that is backed by **v3 storage**.
- `etcd --enable-v2=false --enable-v2v3=''` to disable v2 API server.
- `etcd --enable-v2=false --enable-v2v3=/aaa` to disable v2 API server. TODO: error?
- Automatically [create parent directory if it does not exist](https://github.com/coreos/etcd/pull/9626) (fix [issue#9609](https://github.com/coreos/etcd/issues/9609)).
- v4.0 will configure `etcd --enable-v2=true --enable-v2v3=/aaa` to enable v2 API server that is backed by **v3 storage**.
- Add [`etcd --discovery-srv-name`](https://github.com/coreos/etcd/pull/8690) flag to support custom DNS SRV name with discovery.
- If not given, etcd queries `_etcd-server-ssl._tcp.[YOUR_HOST]` and `_etcd-server._tcp.[YOUR_HOST]`.
- If `etcd --discovery-srv-name="foo"`, then query `_etcd-server-ssl-foo._tcp.[YOUR_HOST]` and `_etcd-server-foo._tcp.[YOUR_HOST]`.
- Useful for operating multiple etcd clusters under the same domain.
- Support TLS cipher suite whitelisting.
- To block [weak cipher suites](https://github.com/coreos/etcd/issues/8320).
- TLS handshake fails when client hello is requested with invalid cipher suites.
- Add [`etcd --cipher-suites`](https://github.com/coreos/etcd/pull/9801) flag.
- If empty, Go auto-populates the list.
- Support [`etcd --cors`](https://github.com/coreos/etcd/pull/9490) in v3 HTTP requests (gRPC gateway).
- Rename [`etcd --log-output` to `etcd --log-outputs`](https://github.com/coreos/etcd/pull/9624) to support multiple log outputs.
- **`etcd --log-output` will be deprecated in v3.5**.
- Add [`etcd --logger`](https://github.com/coreos/etcd/pull/9572) flag to support [structured logger and multiple log outputs](https://github.com/coreos/etcd/issues/9438) in server-side.
- **`etcd --logger=capnslog` will be deprecated in v3.5**.
- Main motivation is to promote automated etcd monitoring, rather than looking back server logs when it starts breaking. Future development will make etcd log as few as possible, and make etcd easier to monitor with metrics and alerts.
- `etcd --logger=capnslog --log-outputs=default` is the default setting and same as previous etcd server logging format.
- `etcd --logger=zap --log-outputs=default` is not supported when `etcd --logger=zap`.
- Instead, use `etcd --logger=zap --log-outputs=stderr`.
- Or, use `etcd --logger=zap --log-outputs=systemd/journal` to send logs to the local systemd journal.
- Previously, if etcd parent process ID (PPID) is 1 (e.g. run with systemd), `etcd --logger=capnslog --log-outputs=default` redirects server logs to local systemd journal. And if write to journald fails, it writes to `os.Stderr` as a fallback.
- However, even with PPID 1, it can fail to dial systemd journal (e.g. run embedded etcd with Docker container). Then, [every single log write will fail](https://github.com/coreos/etcd/pull/9729) and fall back to `os.Stderr`, which is inefficient.
- To avoid this problem, systemd journal logging must be configured manually.
- `etcd --logger=zap --log-outputs=stderr` will log server operations in [JSON-encoded format](https://godoc.org/go.uber.org/zap#NewProductionEncoderConfig) and writes logs to `os.Stderr`. Use this to override journald log redirects.
- `etcd --logger=zap --log-outputs=stdout` will log server operations in [JSON-encoded format](https://godoc.org/go.uber.org/zap#NewProductionEncoderConfig) and writes logs to `os.Stdout` Use this to override journald log redirects.
- `etcd --logger=zap --log-outputs=a.log` will log server operations in [JSON-encoded format](https://godoc.org/go.uber.org/zap#NewProductionEncoderConfig) and writes logs to the specified file `a.log`.
- `etcd --logger=zap --log-outputs=a.log,b.log,c.log,stdout` [writes server logs to multiple files `a.log`, `b.log` and `c.log` at the same time](https://github.com/coreos/etcd/pull/9579) and outputs to `os.Stderr`, in [JSON-encoded format](https://godoc.org/go.uber.org/zap#NewProductionEncoderConfig).
- `etcd --logger=zap --log-outputs=/dev/null` will discard all server logs.
- Fix [`mvcc` "unsynced" watcher restore operation](https://github.com/coreos/etcd/pull/9281).
- "unsynced" watcher is watcher that needs to be in sync with events that have happened.
- That is, "unsynced" watcher is the slow watcher that was requested on old revision.
- "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
- Which possibly causes [missing events from "unsynced" watchers](https://github.com/coreos/etcd/issues/9086).
- A node gets network partitioned with a watcher on a future revision, and falls behind receiving a leader snapshot after partition gets removed. When applying this snapshot, etcd watch storage moves current synced watchers to unsynced since sync watchers might have become stale during network partition. And reset synced watcher group to restart watcher routines. Previously, there was a bug when moving from synced watcher group to unsynced, thus client would miss events when the watcher was requested to the network-partitioned node.
- Fix [`mvcc` server panic from restore operation](https://github.com/coreos/etcd/pull/9775).
- Let's assume that a watcher had been requested with a future revision X and sent to node A that became network-partitioned thereafter. Meanwhile, cluster makes progress. Then when the partition gets removed, the leader sends a snapshot to node A. Previously if the snapshot's latest revision is still lower than the watch revision X, **etcd server panicked** during snapshot restore operation.
- Now, this server-side panic has been fixed.
- Fix [server panic on invalid Election Proclaim/Resign HTTP(S) requests](https://github.com/coreos/etcd/pull/9379).
- Previously, wrong-formatted HTTP requests to Election API could trigger panic in etcd server.
- e.g. `curl -L http://localhost:2379/v3/election/proclaim -X POST -d '{"value":""}'`, `curl -L http://localhost:2379/v3/election/resign -X POST -d '{"value":""}'`.
- Fix [revision-based compaction retention parsing](https://github.com/coreos/etcd/pull/9339).
- Previously, `etcd --auto-compaction-mode revision --auto-compaction-retention 1` was [translated to revision retention 3600000000000](https://github.com/coreos/etcd/issues/9337).
- Now, `etcd --auto-compaction-mode revision --auto-compaction-retention 1` is correctly parsed as revision retention 1.
- Prevent [overflow by large `TTL` values for `Lease` `Grant`](https://github.com/coreos/etcd/pull/9399).
- `TTL` parameter to `Grant` request is unit of second.
- Leases with too large `TTL` values exceeding `math.MaxInt64` [expire in unexpected ways](https://github.com/coreos/etcd/issues/9374).
- Server now returns `rpctypes.ErrLeaseTTLTooLarge` to client, when the requested `TTL` is larger than *9,000,000,000 seconds* (which is >285 years).
- Again, etcd `Lease` is meant for short-periodic keepalives or sessions, in the range of seconds or minutes. Not for hours or days!
- Enable etcd server [`raft.Config.CheckQuorum` when starting with `ForceNewCluster`](https://github.com/coreos/etcd/pull/9347).
- Allow [non-WAL files in `etcd --wal-dir` directory](https://github.com/coreos/etcd/pull/9743).
- Previously, existing files such as [`lost+found`](https://github.com/coreos/etcd/issues/7287) in WAL directory prevent etcd server boot.
- Now, WAL directory that contains only `lost+found` or a file that's not suffixed with `.wal` is considered non-initialized.
### API
- Add [`snapshot`](https://github.com/coreos/etcd/pull/9118) package for snapshot restore/save operations (see [`godoc.org/github.com/etcd/clientv3/snapshot`](https://godoc.org/github.com/coreos/etcd/clientv3/snapshot) for more).
- Add [`watch_id` field to `etcdserverpb.WatchCreateRequest`](https://github.com/coreos/etcd/pull/9065) to allow user-provided watch ID to `mvcc`.
- Corresponding `watch_id` is returned via `etcdserverpb.WatchResponse`, if any.
- Add [`fragment` field to `etcdserverpb.WatchCreateRequest`](https://github.com/coreos/etcd/pull/9291) to request etcd server to [split watch events](https://github.com/coreos/etcd/issues/9294) when the total size of events exceeds `etcd --max-request-bytes` flag value plus gRPC-overhead 512 bytes.
- The default server-side request bytes limit is `embed.DefaultMaxRequestBytes` which is 1.5 MiB plus gRPC-overhead 512 bytes.
- If watch response events exceed this server-side request limit and watch request is created with `fragment` field `true`, the server will split watch events into a set of chunks, each of which is a subset of watch events below server-side request limit.
- Useful when client-side has limited bandwidths.
- For example, watch response contains 10 events, where each event is 1 MiB. And server `etcd --max-request-bytes` flag value is 1 MiB. Then, server will send 10 separate fragmented events to the client.
- For example, watch response contains 5 events, where each event is 2 MiB. And server `etcd --max-request-bytes` flag value is 1 MiB and `clientv3.Config.MaxCallRecvMsgSize` is 1 MiB. Then, server will try to send 5 separate fragmented events to the client, and the client will error with `"code = ResourceExhausted desc = grpc: received message larger than max (...)"`.
- Client must implement fragmented watch event merge (which `clientv3` does in etcd v3.4).
- Add [`raftAppliedIndex` field to `etcdserverpb.StatusResponse`](https://github.com/coreos/etcd/pull/9176) for current Raft applied index.
- Add [`errors` field to `etcdserverpb.StatusResponse`](https://github.com/coreos/etcd/pull/9206) for server-side error.
- e.g. `"etcdserver: no leader", "NOSPACE", "CORRUPT"`
- Add [`dbSizeInUse` field to `etcdserverpb.StatusResponse`](https://github.com/coreos/etcd/pull/9256) for actual DB size after compaction.
- Add [`WatchRequest.WatchProgressRequest`](https://github.com/coreos/etcd/pull/9869).
- To manually trigger broadcasting watch progress event (empty watch response with latest header) to all associated watch streams.
- Think of it as `WithProgressNotify` that can be triggered manually.
Note: **v3.5 will deprecate `etcd --log-package-levels` flag for `capnslog`**; `etcd --logger=zap --log-outputs=stderr` will the default. **v3.5 will deprecate `[CLIENT-URL]/config/local/log` endpoint.**
### Package `embed`
- Add [`embed.Config.CipherSuites`](https://github.com/coreos/etcd/pull/9801) to specify a list of supported cipher suites for TLS handshake between client/server and peers.
- If empty, Go auto-populates the list.
- Both `embed.Config.ClientTLSInfo.CipherSuites` and `embed.Config.CipherSuites` cannot be non-empty at the same time.
- If not empty, specify either `embed.Config.ClientTLSInfo.CipherSuites` or `embed.Config.CipherSuites`.
- Add [`embed.Config.InitialElectionTickAdvance`](https://github.com/coreos/etcd/pull/9591) to enable/disable initial election tick fast-forward.
- `embed.NewConfig()` would return `*embed.Config` with `InitialElectionTickAdvance` as true by default.
- Define [`embed.CompactorModePeriodic`](https://godoc.org/github.com/coreos/etcd/embed#pkg-variables) for `compactor.ModePeriodic`.
- Define [`embed.CompactorModeRevision`](https://godoc.org/github.com/coreos/etcd/embed#pkg-variables) for `compactor.ModeRevision`.
- Change [`embed.Config.CorsInfo` in `*cors.CORSInfo` type to `embed.Config.CORS` in `map[string]struct{}` type](https://github.com/coreos/etcd/pull/9490).
- Remove [`embed.Config.SetupLogging`](https://github.com/coreos/etcd/pull/9572).
- Now logger is set up automatically based on [`embed.Config.Logger`, `embed.Config.LogOutputs`, `embed.Config.Debug` fields](https://github.com/coreos/etcd/pull/9572).
- Add [`embed.Config.Logger`](https://github.com/coreos/etcd/pull/9518) to support [structured logger `zap`](https://github.com/uber-go/zap) in server-side.
- Rename `embed.Config.SnapCount` field to [`embed.Config.SnapshotCount`](https://github.com/coreos/etcd/pull/9745), to be consistent with the flag name `etcd --snapshot-count`.
- Rename [**`embed.Config.LogOutput`** to **`embed.Config.LogOutputs`**](https://github.com/coreos/etcd/pull/9624) to support multiple log outputs.
- Change [**`embed.Config.LogOutputs`** type from `string` to `[]string`](https://github.com/coreos/etcd/pull/9579) to support multiple log outputs.
### Package `integration`
- Add [`CLUSTER_DEBUG` to enable test cluster logging](https://github.com/coreos/etcd/pull/9678).
- Deprecated `capnslog` in integration tests.
### client v3
- Add [`WithFragment` `OpOption`](https://github.com/coreos/etcd/pull/9291) to support [watch events fragmentation](https://github.com/coreos/etcd/issues/9294) when the total size of events exceeds `etcd --max-request-bytes` flag value plus gRPC-overhead 512 bytes.
- Watch fragmentation is disabled by default.
- The default server-side request bytes limit is `embed.DefaultMaxRequestBytes` which is 1.5 MiB plus gRPC-overhead 512 bytes.
- If watch response events exceed this server-side request limit and watch request is created with `fragment` field `true`, the server will split watch events into a set of chunks, each of which is a subset of watch events below server-side request limit.
- Useful when client-side has limited bandwidths.
- For example, watch response contains 10 events, where each event is 1 MiB. And server `etcd --max-request-bytes` flag value is 1 MiB. Then, server will send 10 separate fragmented events to the client.
- For example, watch response contains 5 events, where each event is 2 MiB. And server `etcd --max-request-bytes` flag value is 1 MiB and `clientv3.Config.MaxCallRecvMsgSize` is 1 MiB. Then, server will try to send 5 separate fragmented events to the client, and the client will error with `"code = ResourceExhausted desc = grpc: received message larger than max (...)"`.
- Add [`Watcher.RequestProgress` method](https://github.com/coreos/etcd/pull/9869).
- To manually trigger broadcasting watch progress event (empty watch response with latest header) to all associated watch streams.
- Think of it as `WithProgressNotify` that can be triggered manually.
- Fix [lease keepalive interval updates when response queue is full](https://github.com/coreos/etcd/pull/9952).
- If `<-chan *clientv3LeaseKeepAliveResponse` from `clientv3.Lease.KeepAlive` was never consumed or channel is full, client was [sending keepalive request every 500ms](https://github.com/coreos/etcd/issues/9911) instead of expected rate of every "TTL / 3" duration.
- Change [snapshot file permissions](https://github.com/coreos/etcd/pull/9977): On Linux, the snapshot file changes from readable by all (mode 0644) to readable by the user only (mode 0600).
### etcdctl v3
- Make [`ETCDCTL_API=3 etcdctl` default](https://github.com/coreos/etcd/issues/9600).
- Now, `etcdctl set foo bar` must be `ETCDCTL_API=2 etcdctl set foo bar`.
- Now, `ETCDCTL_API=3 etcdctl put foo bar` could be just `etcdctl put foo bar`.
- Add [`etcdctl --password`](https://github.com/coreos/etcd/pull/9730) flag.
- To support [`:` character in user name](https://github.com/coreos/etcd/issues/9691).
- e.g. `etcdctl --user user --password password get foo`
- Add [`etcdctl user add --new-user-password`](https://github.com/coreos/etcd/pull/9730) flag.
- Add [`etcdctl check datascale`](https://github.com/coreos/etcd/pull/9185) command.
- Add [`etcdctl check datascale --auto-compact, --auto-defrag`](https://github.com/coreos/etcd/pull/9351) flags.
- Add [`etcdctl check perf --auto-compact, --auto-defrag`](https://github.com/coreos/etcd/pull/9330) flags.
- Add [`etcdctl defrag --cluster`](https://github.com/coreos/etcd/pull/9390) flag.
- Add ["raft applied index" field to `endpoint status`](https://github.com/coreos/etcd/pull/9176).
- Add ["errors" field to `endpoint status`](https://github.com/coreos/etcd/pull/9206).
- Add [`etcdctl endpoint health --write-out` support](https://github.com/coreos/etcd/pull/9540).
- Previously, [`etcdctl endpoint health --write-out json` did not work](https://github.com/coreos/etcd/issues/9532).
- Fix [`etcdctl watch [key] [range_end] -- [exec-command…]`](https://github.com/coreos/etcd/pull/9688) parsing.
- Previously, `ETCDCTL_API=3 etcdctl watch foo -- echo watch event received` panicked.
- Fix [`etcdctl move-leader` command for TLS-enabled endpoints](https://github.com/coreos/etcd/pull/9807).
- Add [`progress` command to `etcdctl watch --interactive`](https://github.com/coreos/etcd/pull/9869).
- To manually trigger broadcasting watch progress event (empty watch response with latest header) to all associated watch streams.
- Think of it as `WithProgressNotify` that can be triggered manually.
### gRPC proxy
- Fix [etcd server panic from restore operation](https://github.com/coreos/etcd/pull/9775).
- Let's assume that a watcher had been requested with a future revision X and sent to node A that became network-partitioned thereafter. Meanwhile, cluster makes progress. Then when the partition gets removed, the leader sends a snapshot to node A. Previously if the snapshot's latest revision is still lower than the watch revision X, **etcd server panicked** during snapshot restore operation.
- Especially, gRPC proxy was affected, since it detects a leader loss with a key `"proxy-namespace__lostleader"` and a watch revision `"int64(math.MaxInt64 - 2)"`.
- Now, this server-side panic has been fixed.
### gRPC gateway
- Replace [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) endpoint `/v3beta` with [`/v3`](https://github.com/coreos/etcd/pull/9298).
- Deprecated [`/v3alpha`](https://github.com/coreos/etcd/pull/9298).
- To deprecate [`/v3beta`](https://github.com/coreos/etcd/issues/9189) in v3.5.
- In v3.4, `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` still works as a fallback to `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'`, but `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` won't work in v3.5. Use `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- Add API endpoints [`/{v3beta,v3}/lease/leases, /{v3beta,v3}/lease/revoke, /{v3beta,v3}/lease/timetolive`](https://github.com/coreos/etcd/pull/9450).
- To deprecate [`/{v3beta,v3}/kv/lease/leases, /{v3beta,v3}/kv/lease/revoke, /{v3beta,v3}/kv/lease/timetolive`](https://github.com/coreos/etcd/issues/9430) in v3.5.
- Support [`etcd --cors`](https://github.com/coreos/etcd/pull/9490) in v3 HTTP requests (gRPC gateway).
### Package `raft`
- Fix [deadlock during PreVote migration process](https://github.com/coreos/etcd/pull/8525).
- Add [`raft.ErrProposalDropped`](https://github.com/coreos/etcd/pull/9067).
- Now [`(r *raft) Step` returns `raft.ErrProposalDropped`](https://github.com/coreos/etcd/pull/9137) if a proposal has been ignored.
- e.g. a node is removed from cluster, or [`raftpb.MsgProp` arrives at current leader while there is an ongoing leadership transfer](https://github.com/coreos/etcd/issues/8975).
- Improve [Raft `becomeLeader` and `stepLeader`](https://github.com/coreos/etcd/pull/9073) by keeping track of latest `pb.EntryConfChange` index.
- Previously record `pendingConf` boolean field scanning the entire tail of the log, which can delay hearbeat send.
- Fix [missing learner nodes on `(n *node) ApplyConfChange`](https://github.com/coreos/etcd/pull/9116).
### Tooling
- Add [`etcd-dump-logs --entry-type`](https://github.com/coreos/etcd/pull/9628) flag to support WAL log filtering by entry type.
- Add [`etcd-dump-logs --stream-decoder`](https://github.com/coreos/etcd/pull/9790) flag to support custom decoder.
### Go
- Require *Go 1.10+*.
- Compile with [*Go 1.10.3*](https://golang.org/doc/devel/release.html#go1.10).

View File

@ -1,43 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.4](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.4.md).
## v3.5.0 (TBD 2018-12)
See [code changes](https://github.com/coreos/etcd/compare/v3.4.0...v3.5.0) and [v3.5 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_5.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.5 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_5.md).**
### Breaking Changes
- [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) only supports [`/v3`](TODO) endpoint.
- Deprecated [`/v3beta`](https://github.com/coreos/etcd/pull/9298).
- `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` does work in v3.5. Use `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- **`etcd --log-output` flag has been deprecated.** Use **`etcd --log-outputs`** instead.
- **`etcd --logger=zap --log-outputs=stderr`** is now the default.
- **`etcd --logger=capnslog` flag has been deprecated.**
- **`etcd --logger=zap --log-outputs=default` flag value is not supported.**.
- Instead, use `etcd --logger=zap --log-outputs=stderr`.
- Or, use `etcd --logger=zap --log-outputs=systemd/journal` to send logs to the local systemd journal.
- Previously, if etcd parent process ID (PPID) is 1 (e.g. run with systemd), `etcd --logger=capnslog --log-outputs=default` redirects server logs to local systemd journal. And if write to journald fails, it writes to `os.Stderr` as a fallback.
- However, even with PPID 1, it can fail to dial systemd journal (e.g. run embedded etcd with Docker container). Then, [every single log write will fail](https://github.com/coreos/etcd/pull/9729) and fall back to `os.Stderr`, which is inefficient.
- To avoid this problem, systemd journal logging must be configured manually.
- **`etcd --log-outputs=stderr`** is now the default.
- **`etcd --log-package-levels` flag for `capnslog` has been deprecated.** Now, **`etcd --logger=zap --log-outputs=stderr`** is the default.
- **`[CLIENT-URL]/config/local/log` endpoint has been deprecated, as is `etcd --log-package-levels` flag.**
- `curl http://127.0.0.1:2379/config/local/log -XPUT -d '{"Level":"DEBUG"}'` won't work.
- Please use `etcd --logger=zap --log-outputs=stderr` instead.
- Deprecated `etcd_debugging_mvcc_db_total_size_in_bytes` Prometheus metric. Instead, use `etcd_mvcc_db_total_size_in_bytes`.
### Metrics, Monitoring
See [List of metrics](https://etcd.readthedocs.io/en/latest/operate.html#latest) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Deprecated `etcd_debugging_mvcc_db_total_size_in_bytes` Prometheus metric. Instead, use `etcd_mvcc_db_total_size_in_bytes`.
### gRPC gateway
- [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) only supports [`/v3`](TODO) endpoint.
- Deprecated [`/v3beta`](https://github.com/coreos/etcd/pull/9298).
- `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` does work in v3.5. Use `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.

View File

@ -1,28 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.x](https://github.com/coreos/etcd/blob/master/CHANGELOG-3.x.md).
Planned breaking changes.
## v4.0.0 (TBD)
See [code changes](https://github.com/coreos/etcd/compare/v3.3.0...v4.0.0) and [v4.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_4_0.md) for any breaking changes. **Again, before running upgrades from any previous release, please make sure to read change logs below and [v4.0 upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_4_0.md).**
### Breaking Changes
- [Secure etcd by default](https://github.com/coreos/etcd/issues/9475)?
- Change `/health` endpoint output.
- Previously, `{"health":"true"}`.
- Now, `{"health":true}`.
- Breaks [Kubernetes `kubectl get componentstatuses` command](https://github.com/kubernetes/kubernetes/issues/58240).
- Deprecate [`etcd --proxy*`](TODO) flags; **no more v2 proxy**.
- Deprecate [v2 storage backend](https://github.com/coreos/etcd/issues/9232); **no more v2 store**.
- v2 API is still supported via [v2 emulation](TODO).
- Deprecate [`etcdctl backup`](TODO) command.
- `clientv3.Client.KeepAlive(ctx context.Context, id LeaseID) (<-chan *LeaseKeepAliveResponse, error)` is now [`clientv4.Client.KeepAlive(ctx context.Context, id LeaseID) <-chan *LeaseKeepAliveResponse`](TODO).
- Similar to `Watch`, [`KeepAlive` does not return errors](https://github.com/coreos/etcd/issues/7488).
- If there's an unknown server error, kill all open channels and create a new stream on the next `KeepAlive` call.
- Rename `github.com/coreos/client` to `github.com/coreos/clientv2`.

View File

@ -14,7 +14,7 @@ etcd is Apache 2.0 licensed and accepts contributions via GitHub pull requests.
## Reporting bugs and creating issues
Reporting bugs is one of the best ways to contribute. However, a good bug report has some very specific qualities, so please read over our short document on [reporting bugs](https://github.com/coreos/etcd/blob/master/Documentation/reporting_bugs.md) before submitting a bug report. This document might contain links to known issues, another good reason to take a look there before reporting a bug.
Reporting bugs is one of the best ways to contribute. However, a good bug report has some very specific qualities, so please read over our short document on [reporting bugs](https://github.com/etcd-io/etcd/blob/master/Documentation/reporting_bugs.md) before submitting a bug report. This document might contain links to known issues, another good reason to take a look there before reporting a bug.
## Contribution flow
@ -24,7 +24,7 @@ This is a rough outline of what a contributor's workflow looks like:
- Make commits of logical units.
- Make sure commit messages are in the proper format (see below).
- Push changes in a topic branch to a personal fork of the repository.
- Submit a pull request to coreos/etcd.
- Submit a pull request to etcd-io/etcd.
- The PR must receive a LGTM from two maintainers found in the MAINTAINERS file.
Thanks for contributing!
@ -42,9 +42,13 @@ questions: what changed and why. The subject line should feature the what and
the body of the commit should describe the why.
```
scripts: add the test-cluster command
etcdserver: add grpc interceptor to log info on incoming requests
this uses tmux to setup a test cluster that can easily be killed and started for debugging.
To improve debuggability of etcd v3. Added a grpc interceptor to log
info on incoming requests to etcd server. The log output includes
remote client info, request content (with value field redacted), request
handling latency, response size, etc. Uses zap logger if available,
otherwise uses capnslog.
Fixes #38
```
@ -52,11 +56,38 @@ Fixes #38
The format can be described more formally as follows:
```
<subsystem>: <what changed>
<package>: <what changed>
<BLANK LINE>
<why this change was made>
<BLANK LINE>
<footer>
```
The first line is the subject and should be no longer than 70 characters, the second line is always blank, and other lines should be wrapped at 80 characters. This allows the message to be easier to read on GitHub as well as in various git tools.
The first line is the subject and should be no longer than 70 characters, the second
line is always blank, and other lines should be wrapped at 80 characters. This allows
the message to be easier to read on GitHub as well as in various git tools.
### Pull request across multiple files and packages
If multiple files in a package are changed in a pull request for example:
```
etcdserver/config.go
etcdserver/corrupt.go
```
At the end of the review process if multiple commits exist for a single package they
should be squashed/rebased into a single commit before being merged.
```
etcdserver: <what changed>
[..]
```
If a pull request spans many packages these commits should be squashed/rebased into a single
commit using message with a more generic `*:` prefix.
```
*: <what changed>
[..]
```

View File

@ -1,4 +1,4 @@
FROM alpine:latest
FROM k8s.gcr.io/debian-base:v1.0.0
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/

View File

@ -1,4 +1,4 @@
FROM aarch64/ubuntu:16.04
FROM k8s.gcr.io/debian-base-arm64:v1.0.0
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/

View File

@ -1,4 +1,4 @@
FROM ppc64le/alpine:latest
FROM k8s.gcr.io/debian-base-ppc64le:v1.0.0
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/

View File

@ -1,3 +1,7 @@
---
title: Benchmarking etcd v2.1.0
---
## Physical machines
GCE n1-highcpu-2 machine type

View File

@ -1,4 +1,6 @@
# Benchmarking etcd v2.2.0
---
title: Benchmarking etcd v2.2.0
---
## Physical Machines

View File

@ -1,4 +1,8 @@
## Physical machines
---
title: Benchmarking etcd v2.2.0-rc
---
## Physical machine
GCE n1-highcpu-2 machine type

View File

@ -1,3 +1,7 @@
---
title: Benchmarking etcd v2.2.0-rc-memory
---
## Physical machine
GCE n1-standard-2 machine type

View File

@ -1,3 +1,7 @@
---
title: Benchmarking etcd v3
---
## Physical machines
GCE n1-highcpu-2 machine type

View File

@ -1,4 +1,6 @@
# Watch Memory Usage Benchmark
---
title: Watch Memory Usage Benchmark
---
*NOTE*: The watch features are under active development, and their memory usage may change as that development progresses. We do not expect it to significantly increase beyond the figures stated below.

View File

@ -1,4 +1,6 @@
# Storage Memory Usage Benchmark
---
title: Storage Memory Usage Benchmark
---
<!---todo: link storage to storage design doc-->
Two components of etcd storage consume physical memory. The etcd process allocates an *in-memory index* to speed key lookup. The process's *page cache*, managed by the operating system, stores recently-accessed data from disk for quick re-use.

View File

@ -1,4 +1,6 @@
# Branch management
---
title: Branch management
---
## Guide

View File

@ -1,4 +1,6 @@
# Demo
---
title: Demo
---
This series of examples shows the basic procedures for working with an etcd cluster.
@ -269,7 +271,10 @@ etcdctl --endpoints=$ENDPOINTS endpoint health
<img src="https://storage.googleapis.com/etcd/demo/11_etcdctl_snapshot_2016051001.gif" alt="11_etcdctl_snapshot_2016051001"/>
Snapshot can only be requested from one etcd node, so `--endpoints` flag should contain only one endpoint.
```
ENDPOINTS=$HOST_1:2379
etcdctl --endpoints=$ENDPOINTS snapshot save my.db
Snapshot saved at my.db

View File

@ -1,5 +1,6 @@
## Why gRPC gateway
---
title: Why gRPC gateway
---
etcd v3 uses [gRPC][grpc] for its messaging protocol. The etcd project includes a gRPC-based [Go client][go-client] and a command line utility, [etcdctl][etcdctl], for communicating with an etcd cluster through gRPC. For languages with no gRPC support, etcd provides a JSON [gRPC gateway][grpc-gateway]. This gateway serves a RESTful proxy that translates HTTP/JSON requests into gRPC messages.
@ -18,6 +19,8 @@ gRPC gateway endpoint has changed since etcd v3.3:
- etcd v3.5 or later uses only `[CLIENT-URL]/v3/*`.
- **`[CLIENT-URL]/v3beta/*` is deprecated**.
gRPC-gateway does not support authentication using TLS Common Name.
### Put and get keys
Use the `/v3/kv/range` and `/v3/kv/put` services to read and write keys:
@ -48,7 +51,7 @@ curl -L http://localhost:2379/v3/kv/range \
Use the `/v3/watch` service to watch keys:
```bash
curl http://localhost:2379/v3/watch \
curl -N http://localhost:2379/v3/watch \
-X POST -d '{"create_request": {"key":"Zm9v"} }' &
# {"result":{"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"1","raft_term":"2"},"created":true}}
@ -62,12 +65,21 @@ curl -L http://localhost:2379/v3/kv/put \
Issue a transaction with `/v3/kv/txn`:
```bash
# target CREATE
curl -L http://localhost:2379/v3/kv/txn \
-X POST \
-d '{"compare":[{"target":"CREATE","key":"Zm9v","createRevision":"2"}],"success":[{"requestPut":{"key":"Zm9v","value":"YmFy"}}]}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"3","raft_term":"2"},"succeeded":true,"responses":[{"response_put":{"header":{"revision":"3"}}}]}
```
```bash
# target VERSION
curl -L http://localhost:2379/v3/kv/txn \
-X POST \
-d '{"compare":[{"version":"4","result":"EQUAL","target":"VERSION","key":"Zm9v"}],"success":[{"requestRange":{"key":"Zm9v"}}]}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"6","raft_term":"3"},"succeeded":true,"responses":[{"response_range":{"header":{"revision":"6"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"6","version":"4","value":"YmF6"}],"count":"1"}}]}
```
### Authentication
Set up authentication with the `/v3/auth` service:
@ -118,7 +130,7 @@ Generated [Swagger][swagger] API definitions can be found at [rpc.swagger.json][
[api-ref]: ./api_reference_v3.md
[go-client]: https://github.com/coreos/etcd/tree/master/clientv3
[etcdctl]: https://github.com/coreos/etcd/tree/master/etcdctl
[grpc]: http://www.grpc.io/
[grpc]: https://www.grpc.io/
[grpc-gateway]: https://github.com/grpc-ecosystem/grpc-gateway
[json-mapping]: https://developers.google.com/protocol-buffers/docs/proto3#json
[swagger]: http://swagger.io/

View File

@ -11,14 +11,14 @@ This is a generated documentation. Please read the proto files for more.
| AuthEnable | AuthEnableRequest | AuthEnableResponse | AuthEnable enables authentication. |
| AuthDisable | AuthDisableRequest | AuthDisableResponse | AuthDisable disables authentication. |
| Authenticate | AuthenticateRequest | AuthenticateResponse | Authenticate processes an authenticate request. |
| UserAdd | AuthUserAddRequest | AuthUserAddResponse | UserAdd adds a new user. |
| UserAdd | AuthUserAddRequest | AuthUserAddResponse | UserAdd adds a new user. User name cannot be empty. |
| UserGet | AuthUserGetRequest | AuthUserGetResponse | UserGet gets detailed user information. |
| UserList | AuthUserListRequest | AuthUserListResponse | UserList gets a list of all users. |
| UserDelete | AuthUserDeleteRequest | AuthUserDeleteResponse | UserDelete deletes a specified user. |
| UserChangePassword | AuthUserChangePasswordRequest | AuthUserChangePasswordResponse | UserChangePassword changes the password of a specified user. |
| UserGrantRole | AuthUserGrantRoleRequest | AuthUserGrantRoleResponse | UserGrant grants a role to a specified user. |
| UserRevokeRole | AuthUserRevokeRoleRequest | AuthUserRevokeRoleResponse | UserRevokeRole revokes a role of specified user. |
| RoleAdd | AuthRoleAddRequest | AuthRoleAddResponse | RoleAdd adds a new role. |
| RoleAdd | AuthRoleAddRequest | AuthRoleAddResponse | RoleAdd adds a new role. Role name cannot be empty. |
| RoleGet | AuthRoleGetRequest | AuthRoleGetResponse | RoleGet gets detailed role information. |
| RoleList | AuthRoleListRequest | AuthRoleListResponse | RoleList gets lists of all roles. |
| RoleDelete | AuthRoleDeleteRequest | AuthRoleDeleteResponse | RoleDelete deletes a specified role. |
@ -35,6 +35,7 @@ This is a generated documentation. Please read the proto files for more.
| MemberRemove | MemberRemoveRequest | MemberRemoveResponse | MemberRemove removes an existing member from the cluster. |
| MemberUpdate | MemberUpdateRequest | MemberUpdateResponse | MemberUpdate updates the member configuration. |
| MemberList | MemberListRequest | MemberListResponse | MemberList lists all the members in the cluster. |
| MemberPromote | MemberPromoteRequest | MemberPromoteResponse | MemberPromote promotes a member from raft learner (non-voting) to raft voting member. |
@ -245,6 +246,7 @@ Empty field.
| ----- | ----------- | ---- |
| name | | string |
| password | | string |
| options | | authpb.UserAddOptions |
@ -607,6 +609,7 @@ Empty field.
| name | name is the human-readable name of the member. If the member is not started, the name will be an empty string. | string |
| peerURLs | peerURLs is the list of URLs the member exposes to the cluster for communication. | (slice of) string |
| clientURLs | clientURLs is the list of URLs the member exposes to clients for communication. If the member is not started, clientURLs will be empty. | (slice of) string |
| isLearner | isLearner indicates if the member is raft learner. | bool |
@ -615,6 +618,7 @@ Empty field.
| Field | Description | Type |
| ----- | ----------- | ---- |
| peerURLs | peerURLs is the list of URLs the added member will use to communicate with the cluster. | (slice of) string |
| isLearner | isLearner indicates if the added member is raft learner. | bool |
@ -643,6 +647,23 @@ Empty field.
##### message `MemberPromoteRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| ID | ID is the member ID of the member to promote. | uint64 |
##### message `MemberPromoteResponse` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | ResponseHeader |
| members | members is a list of all members after promoting the member. | (slice of) Member |
##### message `MemberRemoveRequest` (etcdserver/etcdserverpb/rpc.proto)
| Field | Description | Type |
@ -817,6 +838,7 @@ Empty field.
| raftAppliedIndex | raftAppliedIndex is the current raft applied index of the responding member. | uint64 |
| errors | errors contains alarm/health information and status. | (slice of) string |
| dbSizeInUse | dbSizeInUse is the size of the backend database logically in use, in bytes, of the responding member. | int64 |
| isLearner | isLearner indicates if the member is raft learner. | bool |
@ -980,6 +1002,15 @@ User is a single entry in the bucket authUsers
| name | | bytes |
| password | | bytes |
| roles | | (slice of) string |
| options | | UserAddOptions |
##### message `UserAddOptions` (auth/authpb/auth.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| no_password | | bool |

View File

@ -34,7 +34,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthenticateResponse"
}
@ -61,7 +61,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthDisableResponse"
}
@ -88,7 +88,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthEnableResponse"
}
@ -101,7 +101,7 @@
"tags": [
"Auth"
],
"summary": "RoleAdd adds a new role.",
"summary": "RoleAdd adds a new role. Role name cannot be empty.",
"operationId": "RoleAdd",
"parameters": [
{
@ -115,7 +115,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthRoleAddResponse"
}
@ -142,7 +142,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthRoleDeleteResponse"
}
@ -169,7 +169,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthRoleGetResponse"
}
@ -196,7 +196,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthRoleGrantPermissionResponse"
}
@ -223,7 +223,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthRoleListResponse"
}
@ -250,7 +250,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthRoleRevokePermissionResponse"
}
@ -263,7 +263,7 @@
"tags": [
"Auth"
],
"summary": "UserAdd adds a new user.",
"summary": "UserAdd adds a new user. User name cannot be empty.",
"operationId": "UserAdd",
"parameters": [
{
@ -277,7 +277,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthUserAddResponse"
}
@ -304,7 +304,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthUserChangePasswordResponse"
}
@ -331,7 +331,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthUserDeleteResponse"
}
@ -358,7 +358,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthUserGetResponse"
}
@ -385,7 +385,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthUserGrantRoleResponse"
}
@ -412,7 +412,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthUserListResponse"
}
@ -439,7 +439,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAuthUserRevokeRoleResponse"
}
@ -466,7 +466,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbMemberAddResponse"
}
@ -493,7 +493,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbMemberListResponse"
}
@ -501,6 +501,33 @@
}
}
},
"/v3/cluster/member/promote": {
"post": {
"tags": [
"Cluster"
],
"summary": "MemberPromote promotes a member from raft learner (non-voting) to raft voting member.",
"operationId": "MemberPromote",
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/etcdserverpbMemberPromoteRequest"
}
}
],
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbMemberPromoteResponse"
}
}
}
}
},
"/v3/cluster/member/remove": {
"post": {
"tags": [
@ -520,7 +547,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbMemberRemoveResponse"
}
@ -547,7 +574,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbMemberUpdateResponse"
}
@ -574,7 +601,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbCompactionResponse"
}
@ -601,7 +628,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbDeleteRangeResponse"
}
@ -628,7 +655,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbLeaseLeasesResponse"
}
@ -655,7 +682,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbLeaseRevokeResponse"
}
@ -682,7 +709,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbLeaseTimeToLiveResponse"
}
@ -709,7 +736,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbPutResponse"
}
@ -736,7 +763,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbRangeResponse"
}
@ -763,7 +790,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbTxnResponse"
}
@ -790,7 +817,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbLeaseGrantResponse"
}
@ -807,7 +834,7 @@
"operationId": "LeaseKeepAlive",
"parameters": [
{
"description": "(streaming inputs)",
"description": " (streaming inputs)",
"name": "body",
"in": "body",
"required": true,
@ -818,9 +845,9 @@
],
"responses": {
"200": {
"description": "(streaming responses)",
"description": "A successful response.(streaming responses)",
"schema": {
"$ref": "#/definitions/etcdserverpbLeaseKeepAliveResponse"
"$ref": "#/x-stream-definitions/etcdserverpbLeaseKeepAliveResponse"
}
}
}
@ -845,7 +872,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbLeaseLeasesResponse"
}
@ -872,7 +899,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbLeaseRevokeResponse"
}
@ -899,7 +926,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbLeaseTimeToLiveResponse"
}
@ -926,7 +953,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbAlarmResponse"
}
@ -953,7 +980,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbDefragmentResponse"
}
@ -980,7 +1007,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbHashKVResponse"
}
@ -1007,9 +1034,9 @@
],
"responses": {
"200": {
"description": "(streaming responses)",
"description": "A successful response.(streaming responses)",
"schema": {
"$ref": "#/definitions/etcdserverpbSnapshotResponse"
"$ref": "#/x-stream-definitions/etcdserverpbSnapshotResponse"
}
}
}
@ -1034,7 +1061,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbStatusResponse"
}
@ -1061,7 +1088,7 @@
],
"responses": {
"200": {
"description": "(empty)",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/etcdserverpbMoveLeaderResponse"
}
@ -1078,7 +1105,7 @@
"operationId": "Watch",
"parameters": [
{
"description": "(streaming inputs)",
"description": " (streaming inputs)",
"name": "body",
"in": "body",
"required": true,
@ -1089,9 +1116,9 @@
],
"responses": {
"200": {
"description": "(streaming responses)",
"description": "A successful response.(streaming responses)",
"schema": {
"$ref": "#/definitions/etcdserverpbWatchResponse"
"$ref": "#/x-stream-definitions/etcdserverpbWatchResponse"
}
}
}
@ -1192,6 +1219,15 @@
"READWRITE"
]
},
"authpbUserAddOptions": {
"type": "object",
"properties": {
"no_password": {
"type": "boolean",
"format": "boolean"
}
}
},
"etcdserverpbAlarmMember": {
"type": "object",
"properties": {
@ -1393,6 +1429,9 @@
"name": {
"type": "string"
},
"options": {
"$ref": "#/definitions/authpbUserAddOptions"
},
"password": {
"type": "string"
}
@ -1882,6 +1921,11 @@
"type": "string"
}
},
"isLearner": {
"description": "isLearner indicates if the member is raft learner.",
"type": "boolean",
"format": "boolean"
},
"name": {
"description": "name is the human-readable name of the member. If the member is not started, the name will be an empty string.",
"type": "string"
@ -1898,6 +1942,11 @@
"etcdserverpbMemberAddRequest": {
"type": "object",
"properties": {
"isLearner": {
"description": "isLearner indicates if the added member is raft learner.",
"type": "boolean",
"format": "boolean"
},
"peerURLs": {
"description": "peerURLs is the list of URLs the added member will use to communicate with the cluster.",
"type": "array",
@ -1944,6 +1993,31 @@
}
}
},
"etcdserverpbMemberPromoteRequest": {
"type": "object",
"properties": {
"ID": {
"description": "ID is the member ID of the member to promote.",
"type": "string",
"format": "uint64"
}
}
},
"etcdserverpbMemberPromoteResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
},
"members": {
"description": "members is a list of all members after promoting the member.",
"type": "array",
"items": {
"$ref": "#/definitions/etcdserverpbMember"
}
}
}
},
"etcdserverpbMemberRemoveRequest": {
"type": "object",
"properties": {
@ -2266,6 +2340,11 @@
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
},
"isLearner": {
"description": "isLearner indicates if the member is raft learner.",
"type": "boolean",
"format": "boolean"
},
"leader": {
"description": "leader is the member ID which the responding member believes is the current leader.",
"type": "string",
@ -2508,6 +2587,43 @@
"format": "int64"
}
}
},
"protobufAny": {
"type": "object",
"properties": {
"type_url": {
"type": "string"
},
"value": {
"type": "string",
"format": "byte"
}
}
},
"runtimeStreamError": {
"type": "object",
"properties": {
"details": {
"type": "array",
"items": {
"$ref": "#/definitions/protobufAny"
}
},
"grpc_code": {
"type": "integer",
"format": "int32"
},
"http_code": {
"type": "integer",
"format": "int32"
},
"http_status": {
"type": "string"
},
"message": {
"type": "string"
}
}
}
},
"securityDefinitions": {
@ -2521,5 +2637,43 @@
{
"ApiKey": []
}
]
],
"x-stream-definitions": {
"etcdserverpbLeaseKeepAliveResponse": {
"properties": {
"error": {
"$ref": "#/definitions/runtimeStreamError"
},
"result": {
"$ref": "#/definitions/etcdserverpbLeaseKeepAliveResponse"
}
},
"title": "Stream result of etcdserverpbLeaseKeepAliveResponse",
"type": "object"
},
"etcdserverpbSnapshotResponse": {
"properties": {
"error": {
"$ref": "#/definitions/runtimeStreamError"
},
"result": {
"$ref": "#/definitions/etcdserverpbSnapshotResponse"
}
},
"title": "Stream result of etcdserverpbSnapshotResponse",
"type": "object"
},
"etcdserverpbWatchResponse": {
"properties": {
"error": {
"$ref": "#/definitions/runtimeStreamError"
},
"result": {
"$ref": "#/definitions/etcdserverpbWatchResponse"
}
},
"title": "Stream result of etcdserverpbWatchResponse",
"type": "object"
}
}
}

View File

@ -21,7 +21,7 @@
"operationId": "Campaign",
"responses": {
"200": {
"description": "",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbCampaignResponse"
}
@ -48,7 +48,7 @@
"operationId": "Leader",
"responses": {
"200": {
"description": "",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
}
@ -75,9 +75,9 @@
"operationId": "Observe",
"responses": {
"200": {
"description": "(streaming responses)",
"description": "A successful response.(streaming responses)",
"schema": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
"$ref": "#/x-stream-definitions/v3electionpbLeaderResponse"
}
}
},
@ -102,7 +102,7 @@
"operationId": "Proclaim",
"responses": {
"200": {
"description": "",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbProclaimResponse"
}
@ -129,7 +129,7 @@
"operationId": "Resign",
"responses": {
"200": {
"description": "",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbResignResponse"
}
@ -212,6 +212,43 @@
}
}
},
"protobufAny": {
"type": "object",
"properties": {
"type_url": {
"type": "string"
},
"value": {
"type": "string",
"format": "byte"
}
}
},
"runtimeStreamError": {
"type": "object",
"properties": {
"grpc_code": {
"type": "integer",
"format": "int32"
},
"http_code": {
"type": "integer",
"format": "int32"
},
"message": {
"type": "string"
},
"http_status": {
"type": "string"
},
"details": {
"type": "array",
"items": {
"$ref": "#/definitions/protobufAny"
}
}
}
},
"v3electionpbCampaignRequest": {
"type": "object",
"properties": {
@ -330,5 +367,19 @@
}
}
}
},
"x-stream-definitions": {
"v3electionpbLeaderResponse": {
"type": "object",
"properties": {
"result": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
},
"error": {
"$ref": "#/definitions/runtimeStreamError"
}
},
"title": "Stream result of v3electionpbLeaderResponse"
}
}
}

View File

@ -21,7 +21,7 @@
"operationId": "Lock",
"responses": {
"200": {
"description": "",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3lockpbLockResponse"
}
@ -48,7 +48,7 @@
"operationId": "Unlock",
"responses": {
"200": {
"description": "",
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3lockpbUnlockResponse"
}

View File

@ -1,7 +1,9 @@
# Experimental APIs and features
---
title: Experimental APIs and features
---
For the most part, the etcd project is stable, but we are still moving fast! We believe in the release fast philosophy. We want to get early feedback on features still in development and stabilizing. Thus, there are, and will be more, experimental features and APIs. We plan to improve these features based on the early feedback from the community, or abandon them if there is little interest, in the next few releases. Please do not rely on any experimental features or APIs in production environment.
## The current experimental API/features are:
- [KV ordering](https://godoc.org/github.com/coreos/etcd/clientv3/ordering) wrapper. When an etcd client switches endpoints, responses to serializable reads may go backward in time if the new endpoint is lagging behind the rest of the cluster. The ordering wrapper caches the current cluster revision from response headers. If a response revision is less than the cached revision, the client selects another endpoint and reissues the read. Enable in grpcproxy with `--experimental-serializable-ordering`.
- [KV ordering](https://godoc.org/github.com/etcd-io/etcd/clientv3/ordering) wrapper. When an etcd client switches endpoints, responses to serializable reads may go backward in time if the new endpoint is lagging behind the rest of the cluster. The ordering wrapper caches the current cluster revision from response headers. If a response revision is less than the cached revision, the client selects another endpoint and reissues the read. Enable in grpcproxy with `--experimental-serializable-ordering`.

View File

@ -1,4 +1,6 @@
# gRPC naming and discovery
---
title: gRPC naming and discovery
---
etcd provides a gRPC resolver to support an alternative name system that fetches endpoints from etcd for discovering gRPC services. The underlying mechanism is based on watching updates to keys prefixed with the service name.
@ -8,8 +10,8 @@ The etcd client provides a gRPC resolver for resolving gRPC endpoints with an et
```go
import (
"github.com/coreos/etcd/clientv3"
etcdnaming "github.com/coreos/etcd/clientv3/naming"
"go.etcd.io/etcd/clientv3"
etcdnaming "go.etcd.io/etcd/clientv3/naming"
"google.golang.org/grpc"
)

View File

@ -1,8 +1,12 @@
# Interacting with etcd
---
title: Interacting with etcd
---
Users mostly interact with etcd by putting or getting the value of a key. This section describes how to do that by using etcdctl, a command line tool for interacting with etcd server. The concepts described here should apply to the gRPC APIs or client library APIs.
By default, etcdctl talks to the etcd server with the v2 API for backward compatibility. For etcdctl to speak to etcd using the v3 API, the API version must be set to version 3 via the `ETCDCTL_API` environment variable. However note that any key that was created using the v2 API will not be able to be queried via the v3 API. A v3 API ```etcdctl get``` of a v2 key will exit with 0 and no key data, this is the expected behaviour.
The API version used by etcdctl to speak to etcd may be set to version `2` or `3` via the `ETCDCTL_API` environment variable. By default, etcdctl on master (3.4) uses the v3 API and earlier versions (3.3 and earlier) default to the v2 API.
Note that any key that was created using the v2 API will not be able to be queried via the v3 API. A v3 API ```etcdctl get``` of a v2 key will exit with 0 and no key data, this is the expected behaviour.
```bash

View File

@ -1,4 +1,6 @@
# System limits
---
title: System limits
---
## Request size limit

View File

@ -1,4 +1,6 @@
# Set up a local cluster
---
title: Set up a local cluster
---
For testing and development deployments, the quickest and easiest way is to configure a local cluster. For a production deployment, refer to the [clustering][clustering] section.
@ -21,14 +23,7 @@ The running etcd member listens on `localhost:2379` for client requests.
Use `etcdctl` to interact with the running cluster:
1. Configure the environment to have `ETCDCTL_API=3` so `etcdctl` uses the etcd API version 3 instead of defaulting to version 2.
```
# use API version 3
$ export ETCDCTL_API=3
```
2. Store an example key-value pair in the cluster:
1. Store an example key-value pair in the cluster:
```
$ ./etcdctl put foo bar
@ -37,7 +32,7 @@ Use `etcdctl` to interact with the running cluster:
If OK is printed, storing key-value pair is successful.
3. Retrieve the value of `foo`:
2. Retrieve the value of `foo`:
```
$ ./etcdctl get foo
@ -70,14 +65,7 @@ A `Procfile` at the base of the etcd git repository is provided to easily config
Use `etcdctl` to interact with the running cluster:
1. Configure the environment to have `ETCDCTL_API=3` so `etcdctl` uses the etcd API version 3 instead of defaulting to version 2.
```
# use API version 3
$ export ETCDCTL_API=3
```
2. Print the list of members:
1. Print the list of members:
```
$ etcdctl --write-out=table --endpoints=localhost:2379 member list
@ -94,7 +82,7 @@ Use `etcdctl` to interact with the running cluster:
+------------------+---------+--------+------------------------+------------------------+
```
3. Store an example key-value pair in the cluster:
2. Store an example key-value pair in the cluster:
```
$ etcdctl put foo bar

View File

@ -1,4 +1,6 @@
# Discovery service protocol
---
title: Discovery service protocol
---
Discovery service protocol helps new etcd member to discover all other members in cluster bootstrap phase using a shared discovery URL.

View File

@ -1,4 +1,6 @@
# Logging conventions
---
title: Logging conventions
---
etcd uses the [capnslog][capnslog] library for logging application output categorized into *levels*. A log message's level is determined according to these conventions:

View File

@ -1,4 +1,6 @@
# etcd release guide
---
title: etcd release guide
---
The guide talks about how to release a new version of etcd.
@ -12,6 +14,7 @@ release and for ensuring the stability of the release branch.
| Releases | Manager |
| -------- | ------- |
| 3.1 patch (post 3.1.0) | Joe Betz [@jpbetz](https://github.com/jpbetz) |
| 3.2 patch (post 3.2.0) | Joe Betz [@jpbetz](https://github.com/jpbetz) |
| 3.3 patch (post 3.3.0) | Gyuho Lee [@gyuho](https://github.com/gyuho) |
@ -29,9 +32,9 @@ All releases version numbers follow the format of [semantic versioning 2.0.0](ht
### Major, minor version release, or its pre-release
- Ensure the relevant milestone on GitHub is complete. All referenced issues should be closed, or moved elsewhere.
- Remove this release from [roadmap](https://github.com/coreos/etcd/blob/master/ROADMAP.md), if necessary.
- Remove this release from [roadmap](https://github.com/etcd-io/etcd/blob/master/ROADMAP.md), if necessary.
- Ensure the latest upgrade documentation is available.
- Bump [hardcoded MinClusterVerion in the repository](https://github.com/coreos/etcd/blob/master/version/version.go#L29), if necessary.
- Bump [hardcoded MinClusterVerion in the repository](https://github.com/etcd-io/etcd/blob/master/version/version.go#L29), if necessary.
- Add feature capability maps for the new version, if necessary.
### Patch version release
@ -49,18 +52,17 @@ All releases version numbers follow the format of [semantic versioning 2.0.0](ht
## Tag version
- Bump [hardcoded Version in the repository](https://github.com/coreos/etcd/blob/master/version/version.go#L30) to the latest version `${VERSION}`.
- Bump [hardcoded Version in the repository](https://github.com/etcd-io/etcd/blob/master/version/version.go#L30) to the latest version `${VERSION}`.
- Ensure all tests on CI system are passed.
- Manually check etcd is buildable in Linux, Darwin and Windows.
- Manually check upgrade etcd cluster of previous minor version works well.
- Manually check new features work well.
- Add a signed tag through `git tag -s ${VERSION}`.
- Sanity check tag correctness through `git show tags/$VERSION`.
- Push the tag to GitHub through `git push origin tags/$VERSION`. This assumes `origin` corresponds to "https://github.com/coreos/etcd".
- Push the tag to GitHub through `git push origin tags/$VERSION`. This assumes `origin` corresponds to "https://github.com/etcd-io/etcd".
## Build release binaries and images
- Ensure `acbuild` is available.
- Ensure `docker` is available.
Run release script in root directory:
@ -83,11 +85,11 @@ for i in etcd-*{.zip,.tar.gz}; do gpg2 --default-key $SUBKEYID --armor --output
for i in etcd-*{.zip,.tar.gz}; do gpg2 --verify ${i}.asc ${i}; done
# sign zipped source code files
wget https://github.com/coreos/etcd/archive/${VERSION}.zip
wget https://github.com/etcd-io/etcd/archive/${VERSION}.zip
gpg2 --armor --default-key $SUBKEYID --output ${VERSION}.zip.asc --detach-sign ${VERSION}.zip
gpg2 --verify ${VERSION}.zip.asc ${VERSION}.zip
wget https://github.com/coreos/etcd/archive/${VERSION}.tar.gz
wget https://github.com/etcd-io/etcd/archive/${VERSION}.tar.gz
gpg2 --armor --default-key $SUBKEYID --output ${VERSION}.tar.gz.asc --detach-sign ${VERSION}.tar.gz
gpg2 --verify ${VERSION}.tar.gz.asc ${VERSION}.tar.gz
```
@ -155,5 +157,5 @@ git log ...${PREV_VERSION} --pretty=format:"%an" | sort | uniq | tr '\n' ',' | s
## Post release
- Create new stable branch through `git push origin ${VERSION_MAJOR}.${VERSION_MINOR}` if this is a major stable release. This assumes `origin` corresponds to "https://github.com/coreos/etcd".
- Bump [hardcoded Version in the repository](https://github.com/coreos/etcd/blob/master/version/version.go#L30) to the version `${VERSION}+git`.
- Create new stable branch through `git push origin ${VERSION_MAJOR}.${VERSION_MINOR}` if this is a major stable release. This assumes `origin` corresponds to "https://github.com/etcd-io/etcd".
- Bump [hardcoded Version in the repository](https://github.com/etcd-io/etcd/blob/master/version/version.go#L30) to the version `${VERSION}+git`.

View File

@ -1,4 +1,6 @@
# Download and build
---
title: Download and build
---
## System requirements
@ -15,7 +17,7 @@ For those wanting to try the very latest version, build etcd from the `master` b
To build `etcd` from the `master` branch without a `GOPATH` using the official `build` script:
```sh
$ git clone https://github.com/coreos/etcd.git
$ git clone https://github.com/etcd-io/etcd.git
$ cd etcd
$ ./build
```
@ -26,7 +28,8 @@ To build a vendored `etcd` from the `master` branch via `go get`:
# GOPATH should be set
$ echo $GOPATH
/Users/example/go
$ go get -v github.com/coreos/etcd
$ go get -v go.etcd.io/etcd
$ go get -v go.etcd.io/etcd/etcdctl
```
## Test the installation
@ -35,14 +38,14 @@ Check the etcd binary is built correctly by starting etcd and setting a key.
### Starting etcd
If etcd is built without using GOPATH, run the following:
If etcd is built without using `go get`, run the following:
```
```sh
$ ./bin/etcd
```
If etcd is built using GOPATH, run the following:
If etcd is built using `go get`, run the following:
```
```sh
$ $GOPATH/bin/etcd
```
@ -50,14 +53,16 @@ $ $GOPATH/bin/etcd
Run the following:
```
$ ETCDCTL_API=3 ./bin/etcdctl put foo bar
```sh
$ ./bin/etcdctl put foo bar
OK
```
(or `$GOPATH/bin/etcdctl put foo bar` if etcdctl was installed with `go get`)
If OK is printed, then etcd is working!
[github-release]: https://github.com/coreos/etcd/releases/
[github-release]: https://github.com/etcd-io/etcd/releases/
[go]: https://golang.org/doc/install
[build-script]: ../build
[cmd-directory]: ../cmd

View File

@ -68,8 +68,10 @@ To learn more about the concepts and internals behind etcd, read the following p
- [Understand data model][data_model]
- [Understand APIs][understand_apis]
- [Glossary][glossary]
- Internals
- [Auth subsystem][auth_design]
- Design
- [Auth subsystem][design-auth-v3]
- [Client][design-client]
- [Learner][design-learner]
[api_ref]: dev-guide/api_reference_v3.md
[api_concurrency_ref]: dev-guide/api_concurrency_reference_v3.md
@ -82,12 +84,12 @@ To learn more about the concepts and internals behind etcd, read the following p
[data_model]: learning/data_model.md
[demo]: demo.md
[download_build]: dl_build.md
[embed_etcd]: https://godoc.org/github.com/coreos/etcd/embed
[embed_etcd]: https://godoc.org/github.com/etcd-io/etcd/embed
[grpc_naming]: dev-guide/grpc_naming.md
[failures]: op-guide/failures.md
[gateway]: op-guide/gateway.md
[glossary]: learning/glossary.md
[namespace_client]: https://godoc.org/github.com/coreos/etcd/clientv3/namespace
[namespace_client]: https://godoc.org/github.com/etcd-io/etcd/clientv3/namespace
[namespace_proxy]: op-guide/grpc_proxy.md#namespacing
[grpc_proxy]: op-guide/grpc_proxy.md
[hardware]: op-guide/hardware.md
@ -109,6 +111,8 @@ To learn more about the concepts and internals behind etcd, read the following p
[aws_platform]: platforms/aws.md
[experimental]: dev-guide/experimental_apis.md
[authentication]: op-guide/authentication.md
[auth_design]: learning/auth_design.md
[design-auth-v3]: learning/design-auth-v3.md
[design-client]: learning/design-client.md
[design-learner]: learning/design-learner.md
[tuning]: tuning.md
[upgrading]: upgrades/upgrading-etcd.md

View File

@ -9,3 +9,17 @@ Instructions for use are the same as the [kubernetes-mixin](https://github.com/k
## Background
* For more information about monitoring mixins, see this [design doc](https://docs.google.com/document/d/1A9xvzwqnFVSOZ5fD3blKODXfsat5fg6ZhnKu9LK3lB4/edit#).
## Testing alerts
Make sure to have [jsonnet](https://jsonnet.org/) and [gojsontoyaml](https://github.com/brancz/gojsontoyaml) installed.
First compile the mixin to a YAML file, which the promtool will read:
```
jsonnet -e '(import "mixin.libsonnet").prometheusAlerts' | gojsontoyaml > mixin.yaml
```
Then run the unit test:
```
promtool test rules test.yaml
```

View File

@ -9,20 +9,40 @@
name: 'etcd',
rules: [
{
alert: 'EtcdInsufficientMembers',
alert: 'etcdMembersDown',
expr: |||
count(up{%(etcd_selector)s} == 0) by (job) > (count(up{%(etcd_selector)s}) by (job) / 2 - 1)
max by (job) (
sum by (job) (up{%(etcd_selector)s} == bool 0)
or
count by (job,endpoint) (
sum by (job,endpoint,To) (rate(etcd_network_peer_sent_failures_total{%(etcd_selector)s}[3m])) > 0.01
)
)
> 0
||| % $._config,
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": insufficient members ({{ $value }}).',
message: 'etcd cluster "{{ $labels.job }}": members are down ({{ $value }}).',
},
},
{
alert: 'EtcdNoLeader',
alert: 'etcdInsufficientMembers',
expr: |||
sum(up{%(etcd_selector)s} == bool 1) by (job) < ((count(up{%(etcd_selector)s}) by (job) + 1) / 2)
||| % $._config,
'for': '3m',
labels: {
severity: 'critical',
},
annotations: {
message: 'etcd cluster "{{ $labels.job }}": insufficient members ({{ $value }}).',
},
},
{
alert: 'etcdNoLeader',
expr: |||
etcd_server_has_leader{%(etcd_selector)s} == 0
||| % $._config,
@ -31,11 +51,11 @@
severity: 'critical',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": member {{ $labels.instance }} has no leader.',
message: 'etcd cluster "{{ $labels.job }}": member {{ $labels.instance }} has no leader.',
},
},
{
alert: 'EtcdHighNumberOfLeaderChanges',
alert: 'etcdHighNumberOfLeaderChanges',
expr: |||
rate(etcd_server_leader_changes_seen_total{%(etcd_selector)s}[15m]) > 3
||| % $._config,
@ -44,11 +64,11 @@
severity: 'warning',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": instance {{ $labels.instance }} has seen {{ $value }} leader changes within the last hour.',
message: 'etcd cluster "{{ $labels.job }}": instance {{ $labels.instance }} has seen {{ $value }} leader changes within the last 30 minutes.',
},
},
{
alert: 'EtcdHighNumberOfFailedGRPCRequests',
alert: 'etcdHighNumberOfFailedGRPCRequests',
expr: |||
100 * sum(rate(grpc_server_handled_total{%(etcd_selector)s, grpc_code!="OK"}[5m])) BY (job, instance, grpc_service, grpc_method)
/
@ -60,11 +80,11 @@
severity: 'warning',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": {{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}.',
message: 'etcd cluster "{{ $labels.job }}": {{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}.',
},
},
{
alert: 'EtcdHighNumberOfFailedGRPCRequests',
alert: 'etcdHighNumberOfFailedGRPCRequests',
expr: |||
100 * sum(rate(grpc_server_handled_total{%(etcd_selector)s, grpc_code!="OK"}[5m])) BY (job, instance, grpc_service, grpc_method)
/
@ -76,11 +96,11 @@
severity: 'critical',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": {{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}.',
message: 'etcd cluster "{{ $labels.job }}": {{ $value }}% of requests for {{ $labels.grpc_method }} failed on etcd instance {{ $labels.instance }}.',
},
},
{
alert: 'EtcdGRPCRequestsSlow',
alert: 'etcdGRPCRequestsSlow',
expr: |||
histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket{%(etcd_selector)s, grpc_type="unary"}[5m])) by (job, instance, grpc_service, grpc_method, le))
> 0.15
@ -90,11 +110,11 @@
severity: 'critical',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": gRPC requests to {{ $labels.grpc_method }} are taking {{ $value }}s on etcd instance {{ $labels.instance }}.',
message: 'etcd cluster "{{ $labels.job }}": gRPC requests to {{ $labels.grpc_method }} are taking {{ $value }}s on etcd instance {{ $labels.instance }}.',
},
},
{
alert: 'EtcdMemberCommunicationSlow',
alert: 'etcdMemberCommunicationSlow',
expr: |||
histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket{%(etcd_selector)s}[5m]))
> 0.15
@ -104,11 +124,11 @@
severity: 'warning',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": member communication with {{ $labels.To }} is taking {{ $value }}s on etcd instance {{ $labels.instance }}.',
message: 'etcd cluster "{{ $labels.job }}": member communication with {{ $labels.To }} is taking {{ $value }}s on etcd instance {{ $labels.instance }}.',
},
},
{
alert: 'EtcdHighNumberOfFailedProposals',
alert: 'etcdHighNumberOfFailedProposals',
expr: |||
rate(etcd_server_proposals_failed_total{%(etcd_selector)s}[15m]) > 5
||| % $._config,
@ -117,11 +137,11 @@
severity: 'warning',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": {{ $value }} proposal failures within the last hour on etcd instance {{ $labels.instance }}.',
message: 'etcd cluster "{{ $labels.job }}": {{ $value }} proposal failures within the last 30 minutes on etcd instance {{ $labels.instance }}.',
},
},
{
alert: 'EtcdHighFsyncDurations',
alert: 'etcdHighFsyncDurations',
expr: |||
histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket{%(etcd_selector)s}[5m]))
> 0.5
@ -131,11 +151,11 @@
severity: 'warning',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": 99th percentile fync durations are {{ $value }}s on etcd instance {{ $labels.instance }}.',
message: 'etcd cluster "{{ $labels.job }}": 99th percentile fync durations are {{ $value }}s on etcd instance {{ $labels.instance }}.',
},
},
{
alert: 'EtcdHighCommitDurations',
alert: 'etcdHighCommitDurations',
expr: |||
histogram_quantile(0.99, rate(etcd_disk_backend_commit_duration_seconds_bucket{%(etcd_selector)s}[5m]))
> 0.25
@ -145,37 +165,49 @@
severity: 'warning',
},
annotations: {
message: 'Etcd cluster "{{ $labels.job }}": 99th percentile commit durations {{ $value }}s on etcd instance {{ $labels.instance }}.',
message: 'etcd cluster "{{ $labels.job }}": 99th percentile commit durations {{ $value }}s on etcd instance {{ $labels.instance }}.',
},
},
{
record: 'instance:fd_utilization',
expr: 'process_open_fds / process_max_fds',
},
{
alert: 'FdExhaustionClose',
alert: 'etcdHighNumberOfFailedHTTPRequests',
expr: |||
predict_linear(instance:fd_utilization{%(etcd_selector)s}[1h], 3600 * 4) > 1
sum(rate(etcd_http_failed_total{%(etcd_selector)s, code!="404"}[5m])) BY (method) / sum(rate(etcd_http_received_total{%(etcd_selector)s}[5m]))
BY (method) > 0.01
||| % $._config,
'for': '10m',
labels: {
severity: 'warning',
},
annotations: {
message: '{{ $labels.job }} instance {{ $labels.instance }} will exhaust its file descriptors soon',
message: '{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}',
},
},
{
alert: 'FdExhaustionClose',
alert: 'etcdHighNumberOfFailedHTTPRequests',
expr: |||
predict_linear(instance:fd_utilization{%(etcd_selector)s}[10m], 3600) > 1
sum(rate(etcd_http_failed_total{%(etcd_selector)s, code!="404"}[5m])) BY (method) / sum(rate(etcd_http_received_total{%(etcd_selector)s}[5m]))
BY (method) > 0.05
||| % $._config,
'for': '10m',
labels: {
severity: 'critical',
},
annotations: {
description: '{{ $labels.job }} instance {{ $labels.instance }} will exhaust its file descriptors soon',
message: '{{ $value }}% of requests for {{ $labels.method }} failed on etcd instance {{ $labels.instance }}.',
},
},
{
alert: 'etcdHTTPRequestsSlow',
expr: |||
histogram_quantile(0.99, rate(etcd_http_successful_duration_seconds_bucket[5m]))
> 0.15
||| % $._config,
'for': '10m',
labels: {
severity: 'warning',
},
annotations: {
message: 'etcd instance {{ $labels.instance }} HTTP requests to {{ $labels.method }} are slow.',
},
},
],
@ -1251,7 +1283,7 @@
annotations: {
list: [],
},
refresh: false,
refresh: '10s',
schemaVersion: 13,
version: 215,
links: [],

View File

@ -0,0 +1,85 @@
rule_files:
- mixin.yaml
evaluation_interval: 1m
tests:
- interval: 1m
input_series:
- series: 'up{job="etcd",instance="10.10.10.0"}'
values: '1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.1"}'
values: '1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.2"}'
values: '1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0'
alert_rule_test:
- eval_time: 3m
alertname: etcdInsufficientMembers
- eval_time: 5m
alertname: etcdInsufficientMembers
- eval_time: 5m
alertname: etcdMembersDown
- eval_time: 7m
alertname: etcdMembersDown
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": members are down (1).'
- eval_time: 7m
alertname: etcdInsufficientMembers
- eval_time: 11m
alertname: etcdInsufficientMembers
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": insufficient members (1).'
- eval_time: 15m
alertname: etcdInsufficientMembers
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": insufficient members (0).'
- interval: 1m
input_series:
- series: 'up{job="etcd",instance="10.10.10.0"}'
values: '1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.1"}'
values: '1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.2"}'
values: '1 1 1 1 0 0 0 0'
alert_rule_test:
- eval_time: 10m
alertname: etcdMembersDown
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": members are down (2).'
- interval: 1m
input_series:
- series: 'up{job="etcd",instance="10.10.10.0"}'
values: '1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.1"}'
values: '1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0'
- series: 'etcd_network_peer_sent_failures_total{To="member-1",job="etcd",endpoint="test"}'
values: '0 0 1 2 3 4 5 6 7 8 9 10'
alert_rule_test:
- eval_time: 4m
alertname: etcdMembersDown
- eval_time: 6m
alertname: etcdMembersDown
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": members are down (1).'

View File

@ -1,4 +1,6 @@
# Frequently Asked Questions (FAQ)
---
title: Frequently Asked Questions (FAQ)
---
## etcd, general
@ -22,7 +24,7 @@ A member's advertised peer URLs come from `--initial-advertise-peer-urls` on ini
### System requirements
Since etcd writes data to disk, SSD is highly recommended. To prevent performance degradation or unintentionally overloading the key-value store, etcd enforces a configurable storage size quota set to 2GB by default. To avoid swapping or running out of memory, the machine should have at least as much RAM to cover the quota. 8GB is a suggested maximum size for normal environments and etcd warns at startup if the configured value exceeds it. At CoreOS, an etcd cluster is usually deployed on dedicated CoreOS Container Linux machines with dual-core processors, 2GB of RAM, and 80GB of SSD *at the very least*. **Note that performance is intrinsically workload dependent; please test before production deployment**. See [hardware][hardware-setup] for more recommendations.
Since etcd writes data to disk, its performance strongly depends on disk performance. For this reason, SSD is highly recommended. To assess whether a disk is fast enough for etcd, one possibility is using a disk benchmarking tool such as [fio][fio]. For an example on how to do that, read [here][fio-blog-post]. To prevent performance degradation or unintentionally overloading the key-value store, etcd enforces a configurable storage size quota set to 2GB by default. To avoid swapping or running out of memory, the machine should have at least as much RAM to cover the quota. 8GB is a suggested maximum size for normal environments and etcd warns at startup if the configured value exceeds it. At CoreOS, an etcd cluster is usually deployed on dedicated CoreOS Container Linux machines with dual-core processors, 2GB of RAM, and 80GB of SSD *at the very least*. **Note that performance is intrinsically workload dependent; please test before production deployment**. See [hardware][hardware-setup] for more recommendations.
Most stable production environment is Linux operating system with amd64 architecture; see [supported platform][supported-platform] for more.
@ -70,7 +72,7 @@ etcdctl provides a `snapshot` command to create backups. See [backup][backup] fo
When replacing an etcd node, it's important to remove the member first and then add its replacement.
etcd employs distributed consensus based on a quorum model; (n+1)/2 members, a majority, must agree on a proposal before it can be committed to the cluster. These proposals include key-value updates and membership changes. This model totally avoids any possibility of split brain inconsistency. The downside is permanent quorum loss is catastrophic.
etcd employs distributed consensus based on a quorum model; (n/2)+1 members, a majority, must agree on a proposal before it can be committed to the cluster. These proposals include key-value updates and membership changes. This model totally avoids any possibility of split brain inconsistency. The downside is permanent quorum loss is catastrophic.
How this applies to membership: If a 3-member cluster has 1 downed member, it can still make forward progress because the quorum is 2 and 2 members are still live. However, adding a new member to a 3-member cluster will increase the quorum to 3 because 3 votes are required for a majority of 4 members. Since the quorum increased, this extra member buys nothing in terms of fault tolerance; the cluster is still one node failure away from being unrecoverable.
@ -106,7 +108,7 @@ To recover from the low space quota alarm:
This is gRPC-side warning when a server receives a TCP RST flag with client-side streams being prematurely closed. For example, a client closes its connection, while gRPC server has not yet processed all HTTP/2 frames in the TCP queue. Some data may have been lost in server side, but it is ok so long as client connection has already been closed.
Only [old versions of gRPC](https://github.com/grpc/grpc-go/issues/1362) log this. etcd [>=v3.2.13 by default log this with DEBUG level](https://github.com/coreos/etcd/pull/9080), thus only visible with `--debug` flag enabled.
Only [old versions of gRPC](https://github.com/grpc/grpc-go/issues/1362) log this. etcd [>=v3.2.13 by default log this with DEBUG level](https://github.com/etcd-io/etcd/pull/9080), thus only visible with `--debug` flag enabled.
## Performance
@ -130,7 +132,7 @@ If none of the above suggestions clear the warnings, please [open an issue][new_
etcd uses a leader-based consensus protocol for consistent data replication and log execution. Cluster members elect a single leader, all other members become followers. The elected leader must periodically send heartbeats to its followers to maintain its leadership. Followers infer leader failure if no heartbeats are received within an election interval and trigger an election. If a leader doesnt send its heartbeats in time but is still running, the election is spurious and likely caused by insufficient resources. To catch these soft failures, if the leader skips two heartbeat intervals, etcd will warn it failed to send a heartbeat on time.
Usually this issue is caused by a slow disk. Before the leader sends heartbeats attached with metadata, it may need to persist the metadata to disk. The disk could be experiencing contention among etcd and other applications, or the disk is too simply slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [wal_fsync_duration_seconds][wal_fsync_duration_seconds] (p99 duration should be less than 10ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using faster disk will typically solve the problem.
Usually this issue is caused by a slow disk. Before the leader sends heartbeats attached with metadata, it may need to persist the metadata to disk. The disk could be experiencing contention among etcd and other applications, or the disk is too simply slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [wal_fsync_duration_seconds][wal_fsync_duration_seconds] (p99 duration should be less than 10ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using faster disk will typically solve the problem. To tell whether a disk is fast enough for etcd, a benchmarking tool such as [fio][fio] can be used. Read [here][fio-blog-post] for an example.
The second most common cause is CPU starvation. If monitoring of the machines CPU usage shows heavy utilization, there may not be enough compute capacity for etcd. Moving etcd to dedicated machine, increasing process resource isolation with cgroups, or renicing the etcd server process into a higher priority can usually solve the problem.
@ -147,15 +149,17 @@ etcd sends a snapshot of its complete key-value store to refresh slow followers
[supported-platform]: ./op-guide/supported-platform.md
[wal_fsync_duration_seconds]: ./metrics.md#disk
[tuning]: ./tuning.md
[new_issue]: https://github.com/coreos/etcd/issues/new
[new_issue]: https://github.com/etcd-io/etcd/issues/new
[backend_commit_metrics]: ./metrics.md#disk
[raft]: https://raft.github.io/raft.pdf
[backup]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/recovery.md#snapshotting-the-keyspace
[backup]: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/recovery.md#snapshotting-the-keyspace
[chubby]: http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf
[runtime reconfiguration]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/runtime-configuration.md
[runtime reconfiguration]: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/runtime-configuration.md
[benchmark]: https://github.com/coreos/etcd/tree/master/tools/benchmark
[benchmark-result]: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/performance.md
[benchmark-result]: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/performance.md
[api-mvcc]: learning/api.md#revisions
[maintenance-compact]: op-guide/maintenance.md#history-compaction
[maintenance-defragment]: op-guide/maintenance.md#defragmentation
[maintenance-disarm]: ../etcdctl/README.md#alarm-disarm
[fio]: https://github.com/axboe/fio
[fio-blog-post]: https://www.ibm.com/blogs/bluemix/2019/04/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd/

View File

@ -1,8 +1,10 @@
# Libraries and tools
---
title: Libraries and tools
---
**Tools**
- [etcdctl](https://github.com/coreos/etcd/tree/master/etcdctl) - A command line client for etcd
- [etcdctl](https://github.com/etcd-io/etcd/tree/master/etcdctl) - A command line client for etcd
- [etcd-backup](https://github.com/fanhattan/etcd-backup) - A powerful command line utility for dumping/restoring etcd - Supports v2
- [etcd-dump](https://npmjs.org/package/etcd-dump) - Command line utility for dumping/restoring etcd.
- [etcd-fs](https://github.com/xetorthio/etcd-fs) - FUSE filesystem for etcd
@ -19,14 +21,14 @@
**Go libraries**
- [etcd/clientv3](https://github.com/coreos/etcd/blob/master/clientv3) - the officially maintained Go client for v3
- [etcd/client](https://github.com/coreos/etcd/blob/master/client) - the officially maintained Go client for v2
- [etcd/clientv3](https://github.com/etcd-io/etcd/blob/master/clientv3) - the officially maintained Go client for v3
- [etcd/client](https://github.com/etcd-io/etcd/blob/master/client) - the officially maintained Go client for v2
- [go-etcd](https://github.com/coreos/go-etcd) - the deprecated official client. May be useful for older (<2.0.0) versions of etcd.
- [encWrapper](https://github.com/lumjjb/etcd/tree/enc_wrapper/clientwrap/encwrapper) - encWrapper is an encryption wrapper for the etcd client Keys API/KV.
**Java libraries**
- [coreos/jetcd](https://github.com/coreos/jetcd) - Supports v3
- [coreos/jetcd](https://github.com/etcd-io/jetcd) - Supports v3
- [boonproject/etcd](https://github.com/boonproject/boon/blob/master/etcd/README.md) - Supports v2, Async/Sync and waits
- [justinsb/jetcd](https://github.com/justinsb/jetcd)
- [diwakergupta/jetcd](https://github.com/diwakergupta/jetcd) - Supports v2
@ -54,6 +56,7 @@
- [txaio-etcd](https://github.com/crossbario/txaio-etcd) - Asynchronous etcd v3-only client library for Twisted (today) and asyncio (future)
- [dims/etcd3-gateway](https://github.com/dims/etcd3-gateway) - etcd v3 API library using the HTTP grpc gateway
- [aioetcd3](https://github.com/gaopeiliang/aioetcd3) - (Python 3.6+) etcd v3 API for asyncio
- [Revolution1/etcd3-py](https://github.com/Revolution1/etcd3-py) - (python2.7 and python3.5+) Python client for etcd v3, using gRPC-JSON-Gateway
**Node libraries**
@ -89,7 +92,8 @@
**Erlang libraries**
- [marshall-lee/etcd.erl](https://github.com/marshall-lee/etcd.erl)
- [marshall-lee/etcd.erl](https://github.com/marshall-lee/etcd.erl) - Supports v2
- [zhongwencool/eetcd](https://github.com/zhongwencool/eetcd) - Supports v3+ (GRPC only)
**.Net Libraries**
@ -167,4 +171,9 @@
- [Vitess](http://vitess.io/) - Vitess is a database clustering system for horizontal scaling of MySQL.
- [lclarkmichalek/etcdhcp](https://github.com/lclarkmichalek/etcdhcp) - DHCP server that uses etcd for persistence and coordination.
- [openstack/networking-vpp](https://github.com/openstack/networking-vpp) - A networking driver that programs the [FD.io VPP dataplane](https://wiki.fd.io/view/VPP) to provide [OpenStack](https://www.openstack.org/) cloud virtual networking
- [openstack](https://github.com/openstack/governance/blob/master/reference/base-services.rst) - OpenStack services can rely on etcd as a base service.
- [OpenStack](https://github.com/openstack/governance/blob/master/reference/base-services.rst) - OpenStack services can rely on etcd as a base service.
- [CoreDNS](https://github.com/coredns/coredns/tree/master/plugin/etcd) - CoreDNS is a DNS server that chains plugins, part of CNCF and Kubernetes
- [Uber M3](https://github.com/m3db/m3) - M3: Ubers Open Source, Large-scale Metrics Platform for Prometheus
- [Rook](https://github.com/rook/rook) - Storage Orchestration for Kubernetes
- [Patroni](https://github.com/zalando/patroni) - A template for PostgreSQL High Availability with ZooKeeper, etcd, or Consul
- [Trillian](https://github.com/google/trillian) - Trillian implements a Merkle tree whose contents are served from a data storage layer, to allow scalability to extremely large trees.

View File

@ -1,4 +1,6 @@
# etcd3 API
---
title: etcd3 API
---
This document is meant to give an overview of the etcd3 API's central design. It is by no means all encompassing, but intended to focus on the basic ideas needed to understand etcd without the distraction of less common API calls. All etcd3 API's are defined in [gRPC services][grpc-service], which categorize remote procedure calls (RPCs) understood by the etcd server. A full listing of all etcd RPCs are documented in markdown in the [gRPC API listing][grpc-api].
@ -472,10 +474,10 @@ message LeaseKeepAliveResponse {
* ID - the lease that was refreshed with a new TTL.
* TTL - the new time-to-live, in seconds, that the lease has remaining.
[elections]: https://github.com/coreos/etcd/blob/master/clientv3/concurrency/election.go
[kv-proto]: https://github.com/coreos/etcd/blob/master/mvcc/mvccpb/kv.proto
[elections]: https://github.com/etcd-io/etcd/blob/master/clientv3/concurrency/election.go
[kv-proto]: https://github.com/etcd-io/etcd/blob/master/mvcc/mvccpb/kv.proto
[grpc-api]: ../dev-guide/api_reference_v3.md
[grpc-service]: https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto
[locks]: https://github.com/coreos/etcd/blob/master/clientv3/concurrency/mutex.go
[grpc-service]: https://github.com/etcd-io/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto
[locks]: https://github.com/etcd-io/etcd/blob/master/clientv3/concurrency/mutex.go
[mvcc]: https://en.wikipedia.org/wiki/Multiversion_concurrency_control
[stm]: https://github.com/coreos/etcd/blob/master/clientv3/concurrency/stm.go
[stm]: https://github.com/etcd-io/etcd/blob/master/clientv3/concurrency/stm.go

View File

@ -1,4 +1,6 @@
# KV API guarantees
---
title: KV API guarantees
---
etcd is a consistent and durable key value store with [mini-transaction][txn] support. The key value store is exposed through the KV APIs. etcd tries to ensure the strongest consistency and durability guarantees for a distributed system. This specification enumerates the KV API guarantees made by etcd.

View File

@ -1,4 +1,6 @@
# Data model
---
title: Data model
---
etcd is designed to reliably store infrequently updated data and provide reliable watch queries. etcd exposes previous versions of key-value pairs to support inexpensive snapshots and watch history events (“time travel queries”). A persistent, multi-version, concurrency-control data model is a good fit for these use cases.

View File

@ -1,4 +1,6 @@
# etcd v3 authentication design
---
title: etcd v3 authentication design
---
## Why not reuse the v2 auth system?
@ -26,7 +28,7 @@ The metadata for auth should also be stored and managed in the storage controlle
The authentication mechanism in the etcd v2 protocol has a tricky part because the metadata consistency should work as in the above, but does not: each permission check is processed by the etcd member that receives the client request (etcdserver/api/v2http/client.go), including follower members. Therefore, it's possible the check may be based on stale metadata.
This staleness means that auth configuration cannot be reflected as soon as operators execute etcdctl. Therefore there is no way to know how long the stale metadata is active. Practically, the configuration change is reflected immediately after the command execution. However, in some cases of heavy load, the inconsistent state can be prolonged and it might result in counter-intuitive situations for users and developers. It requires a workaround like this: https://github.com/coreos/etcd/pull/4317#issuecomment-179037582
This staleness means that auth configuration cannot be reflected as soon as operators execute etcdctl. Therefore there is no way to know how long the stale metadata is active. Practically, the configuration change is reflected immediately after the command execution. However, in some cases of heavy load, the inconsistent state can be prolonged and it might result in counter-intuitive situations for users and developers. It requires a workaround like this: https://github.com/etcd-io/etcd/pull/4317#issuecomment-179037582
### Inconsistent permissions are unsafe for linearized requests

View File

@ -0,0 +1,138 @@
---
title: etcd client design
---
etcd Client Design
==================
*Gyuho Lee (github.com/gyuho, Amazon Web Services, Inc.), Joe Betz (github.com/jpbetz, Google Inc.)*
Introduction
============
etcd server has proven its robustness with years of failure injection testing. Most complex application logic is already handled by etcd server and its data stores (e.g. cluster membership is transparent to clients, with Raft-layer forwarding proposals to leader). Although server components are correct, its composition with client requires a different set of intricate protocols to guarantee its correctness and high availability under faulty conditions. Ideally, etcd server provides one logical cluster view of many physical machines, and client implements automatic failover between replicas. This documents client architectural decisions and its implementation details.
Glossary
========
*clientv3*: etcd Official Go client for etcd v3 API.
*clientv3-grpc1.0*: Official client implementation, with [`grpc-go v1.0.x`](https://github.com/grpc/grpc-go/releases/tag/v1.0.0), which is used in latest etcd v3.1.
*clientv3-grpc1.7*: Official client implementation, with [`grpc-go v1.7.x`](https://github.com/grpc/grpc-go/releases/tag/v1.7.0), which is used in latest etcd v3.2 and v3.3.
*clientv3-grpc1.23*: Official client implementation, with [`grpc-go v1.23.x`](https://github.com/grpc/grpc-go/releases/tag/v1.23.0), which is used in latest etcd v3.4.
*Balancer*: etcd client load balancer that implements retry and failover mechanism. etcd client should automatically balance loads between multiple endpoints.
*Endpoints*: A list of etcd server endpoints that clients can connect to. Typically, 3 or 5 client URLs of an etcd cluster.
*Pinned endpoint*: When configured with multiple endpoints, <= v3.3 client balancer chooses only one endpoint to establish a TCP connection, in order to conserve total open connections to etcd cluster. In v3.4, balancer round-robins pinned endpoints for every request, thus distributing loads more evenly.
*Client Connection*: TCP connection that has been established to an etcd server, via gRPC Dial.
*Sub Connection*: gRPC SubConn interface. Each sub-connection contains a list of addresses. Balancer creates a SubConn from a list of resolved addresses. gRPC ClientConn can map to multiple SubConn (e.g. example.com resolves to `10.10.10.1` and `10.10.10.2` of two sub-connections). etcd v3.4 balancer employs internal resolver to establish one sub-connection for each endpoint.
*Transient disconnect*: When gRPC server returns a status error of [`code Unavailable`](https://godoc.org/google.golang.org/grpc/codes#Code).
Client Requirements
===================
*Correctness*. Requests may fail in the presence of server faults. However, it never violates consistency guarantees: global ordering properties, never write corrupted data, at-most once semantics for mutable operations, watch never observes partial events, and so on.
*Liveness*. Servers may fail or disconnect briefly. Clients should make progress in either way. Clients should [never deadlock](https://github.com/etcd-io/etcd/issues/8980) waiting for a server to come back from offline, unless configured to do so. Ideally, clients detect unavailable servers with HTTP/2 ping and failover to other nodes with clear error messages.
*Effectiveness*. Clients should operate effectively with minimum resources: previous TCP connections should be [gracefully closed](https://github.com/etcd-io/etcd/issues/9212) after endpoint switch. Failover mechanism should effectively predict the next replica to connect, without wastefully retrying on failed nodes.
*Portability*. Official client should be clearly documented and its implementation be applicable to other language bindings. Error handling between different language bindings should be consistent. Since etcd is fully committed to gRPC, implementation should be closely aligned with gRPC long-term design goals (e.g. pluggable retry policy should be compatible with [gRPC retry](https://github.com/grpc/proposal/blob/master/A6-client-retries.md)). Upgrades between two client versions should be non-disruptive.
Client Overview
===============
etcd client implements the following components:
* balancer that establishes gRPC connections to an etcd cluster,
* API client that sends RPCs to an etcd server, and
* error handler that decides whether to retry a failed request or switch endpoints.
Languages may differ in how to establish an initial connection (e.g. configure TLS), how to encode and send Protocol Buffer messages to server, how to handle stream RPCs, and so on. However, errors returned from etcd server will be the same. So should be error handling and retry policy.
For example, etcd server may return `"rpc error: code = Unavailable desc = etcdserver: request timed out"`, which is transient error that expects retries. Or return `rpc error: code = InvalidArgument desc = etcdserver: key is not provided`, which means request was invalid and should not be retried. Go client can parse errors with `google.golang.org/grpc/status.FromError`, and Java client with `io.grpc.Status.fromThrowable`.
clientv3-grpc1.0: Balancer Overview
-----------------------------------
`clientv3-grpc1.0` maintains multiple TCP connections when configured with multiple etcd endpoints. Then pick one address and use it to send all client requests. The pinned address is maintained until the client object is closed (see *Figure 1*). When the client receives an error, it randomly picks another and retries.
![client-balancer-figure-01.png](img/client-balancer-figure-01.png)
clientv3-grpc1.0: Balancer Limitation
-------------------------------------
`clientv3-grpc1.0` opening multiple TCP connections may provide faster balancer failover but requires more resources. The balancer does not understand nodes health status or cluster membership. So, it is possible that balancer gets stuck with one failed or partitioned node.
clientv3-grpc1.7: Balancer Overview
------------------------------------
`clientv3-grpc1.7` maintains only one TCP connection to a chosen etcd server. When given multiple cluster endpoints, a client first tries to connect to them all. As soon as one connection is up, balancer pins the address, closing others (see *Figure 2*). The pinned address is to be maintained until the client object is closed. An error, from server or client network fault, is sent to client error handler (see *Figure 3*).
![client-balancer-figure-02.png](img/client-balancer-figure-02.png)
![client-balancer-figure-03.png](img/client-balancer-figure-03.png)
The client error handler takes an error from gRPC server, and decides whether to retry on the same endpoint, or to switch to other addresses, based on the error code and message (see *Figure 4* and *Figure 5*).
![client-balancer-figure-04.png](img/client-balancer-figure-04.png)
![client-balancer-figure-05.png](img/client-balancer-figure-05.png)
Stream RPCs, such as Watch and KeepAlive, are often requested with no timeouts. Instead, client can send periodic HTTP/2 pings to check the status of a pinned endpoint; if the server does not respond to the ping, balancer switches to other endpoints (see *Figure 6*).
![client-balancer-figure-06.png](img/client-balancer-figure-06.png)
clientv3-grpc1.7: Balancer Limitation
-------------------------------------
`clientv3-grpc1.7` balancer sends HTTP/2 keepalives to detect disconnects from streaming requests. It is a simple gRPC server ping mechanism and does not reason about cluster membership, thus unable to detect network partitions. Since partitioned gRPC server can still respond to client pings, balancer may get stuck with a partitioned node. Ideally, keepalive ping detects partition and triggers endpoint switch, before request time-out (see [etcd#8673](https://github.com/etcd-io/etcd/issues/8673) and *Figure 7*).
![client-balancer-figure-07.png](img/client-balancer-figure-07.png)
`clientv3-grpc1.7` balancer maintains a list of unhealthy endpoints. Disconnected addresses are added to “unhealthy” list, and considered unavailable until after wait duration, which is hard coded as dial timeout with default value 5-second. Balancer can have false positives on which endpoints are unhealthy. For instance, endpoint A may come back right after being blacklisted, but still unusable for next 5 seconds (see *Figure 8*).
`clientv3-grpc1.0` suffered the same problems above.
![client-balancer-figure-08.png](img/client-balancer-figure-08.png)
Upstream gRPC Go had already migrated to new balancer interface. For example, `clientv3-grpc1.7` underlying balancer implementation uses new gRPC balancer and tries to be consistent with old balancer behaviors. While its compatibility has been maintained reasonably well, etcd client still [suffered from subtle breaking changes](https://github.com/grpc/grpc-go/issues/1649). Furthermore, gRPC maintainer recommends to [not rely on the old balancer interface](https://github.com/grpc/grpc-go/issues/1942#issuecomment-375368665). In general, to get better support from upstream, it is best to be in sync with latest gRPC releases. And new features, such as retry policy, may not be backported to gRPC 1.7 branch. Thus, both etcd server and client must migrate to latest gRPC versions.
clientv3-grpc1.23: Balancer Overview
------------------------------------
`clientv3-grpc1.7` is so tightly coupled with old gRPC interface, that every single gRPC dependency upgrade broke client behavior. Majority of development and debugging efforts were devoted to fixing those client behavior changes. As a result, its implementation has become overly complicated with bad assumptions on server connectivities.
The primary goal of `clientv3-grpc1.23` is to simplify balancer failover logic; rather than maintaining a list of unhealthy endpoints, which may be stale, simply roundrobin to the next endpoint whenever client gets disconnected from the current endpoint. It does not assume endpoint status. Thus, no more complicated status tracking is needed (see *Figure 8* and above). Upgrading to `clientv3-grpc1.23` should be no issue; all changes were internal while keeping all the backward compatibilities.
Internally, when given multiple endpoints, `clientv3-grpc1.23` creates multiple sub-connections (one sub-connection per each endpoint), while `clientv3-grpc1.7` creates only one connection to a pinned endpoint (see *Figure 9*). For instance, in 5-node cluster, `clientv3-grpc1.23` balancer would require 5 TCP connections, while `clientv3-grpc1.7` only requires one. By preserving the pool of TCP connections, `clientv3-grpc1.23` may consume more resources but provide more flexible load balancer with better failover performance. The default balancing policy is round robin but can be easily extended to support other types of balancers (e.g. power of two, pick leader, etc.). `clientv3-grpc1.23` uses gRPC resolver group and implements balancer picker policy, in order to delegate complex balancing work to upstream gRPC. On the other hand, `clientv3-grpc1.7` manually handles each gRPC connection and balancer failover, which complicates the implementation. `clientv3-grpc1.23` implements retry in the gRPC interceptor chain that automatically handles gRPC internal errors and enables more advanced retry policies like backoff, while `clientv3-grpc1.7` manually interprets gRPC errors for retries.
![client-balancer-figure-09.png](img/client-balancer-figure-09.png)
clientv3-grpc1.23: Balancer Limitation
--------------------------------------
Improvements can be made by caching the status of each endpoint. For instance, balancer can ping each server in advance to maintain a list of healthy candidates, and use this information when doing round-robin. Or when disconnected, balancer can prioritize healthy endpoints. This may complicate the balancer implementation, thus can be addressed in later versions.
Client-side keepalive ping still does not reason about network partitions. Streaming request may get stuck with a partitioned node. Advanced health checking service need to be implemented to understand the cluster membership (see [etcd#8673](https://github.com/etcd-io/etcd/issues/8673) for more detail).
![client-balancer-figure-07.png](img/client-balancer-figure-07.png)
Currently, retry logic is handled manually as an interceptor. This may be simplified via [official gRPC retries](https://github.com/grpc/proposal/blob/master/A6-client-retries.md).

View File

@ -0,0 +1,128 @@
---
title: etcd learner design
---
etcd Learner
============
*Gyuho Lee (github.com/gyuho, Amazon Web Services, Inc.), Joe Betz (github.com/jpbetz, Google Inc.)*
Background
==========
Membership reconfiguration has been one of the biggest operational challenges. Lets review common challenges.
### 1. New Cluster member overloads Leader
A newly joined etcd member starts with no data, thus demanding more updates from leader until it catches up with leaders logs. Then leaders network is more likely to be overloaded, blocking or dropping leader heartbeats to followers. In such case, a follower may election-timeout to start a new leader election. That is, a cluster with a new member is more vulnerable to leader election. Both leader election and the subsequent update propagation to the new member are prone to causing periods of cluster unavailability (see *Figure 1*).
![server-learner-figure-01](img/server-learner-figure-01.png)
### 2. Network Partitions scenarios
What if network partition happens? It depends on leader partition. If the leader still maintains the active quorum, the cluster would continue to operate (see *Figure 2*).
![server-learner-figure-02](img/server-learner-figure-02.png)
#### 2.1 Leader isolation
What if the leader becomes isolated from the rest of the cluster? Leader monitors progress of each follower. When leader loses connectivity from the quorum, it reverts back to follower which will affect the cluster availability (see *Figure 3*).
![server-learner-figure-03](img/server-learner-figure-03.png)
When a new node is added to 3 node cluster, the cluster size becomes 4 and the quorum size becomes 3. What if a new node had joined the cluster, and then network partition happens? It depends on which partition the new member gets located after partition.
#### 2.2 Cluster Split 3+1
If the new node happens to be located in the same partition as leaders, the leader still maintains the active quorum of 3. No leadership election happens, and no cluster availability gets affected (see *Figure 4*).
![server-learner-figure-04](img/server-learner-figure-04.png)
#### 2.3 Cluster Split 2+2
If the cluster is 2-and-2 partitioned, then neither of partition maintains the quorum of 3. In this case, leadership election happens (see *Figure 5*).
![server-learner-figure-05](img/server-learner-figure-05.png)
#### 2.4 Quorum Lost
What if network partition happens first, and then a new member gets added? A partitioned 3-node cluster already has one disconnected follower. When a new member is added, the quorum changes from 2 to 3. Now, this cluster has only 2 active nodes out 4, thus losing quorum and starting a new leadership election (see *Figure 6*).
![server-learner-figure-06](img/server-learner-figure-06.png)
Since member add operation can change the size of quorum, it is always recommended to “member remove” first to replace an unhealthy node.
Adding a new member to a 1-node cluster changes the quorum size to 2, immediately causing a leader election when the previous leader finds out quorum is not active. This is because “member add” operation is a 2-step process where user needs to apply “member add” command first, and then starts the new node process (see *Figure 7*).
![server-learner-figure-07](img/server-learner-figure-07.png)
### 3. Cluster Misconfigurations
An even worse case is when an added member is misconfigured. Membership reconfiguration is a two-step process: “etcdctl member add” and starting an etcd server process with the given peer URL. That is, “member add” command is applied regardless of URL, even when the URL value is invalid. If the first step is applied with invalid URLs, the second step cannot even start the new etcd. Once the cluster loses quorum, there is no way to revert the membership change (see *Figure 8*).
![server-learner-figure-08](img/server-learner-figure-08.png)
Same applies to a multi-node cluster. For example, the cluster has two members down (one is failed, the other is misconfigured) and two members up, but now it requires at least 3 votes to change the cluster membership (see *Figure 9*).
![server-learner-figure-09](img/server-learner-figure-09.png)
As seen above, a simple misconfiguration can fail the whole cluster into an inoperative state. In such case, an operator need manually recreate the cluster with `etcd --force-new-cluster` flag. As etcd has become a mission-critical service for Kubernetes, even the slightest outage may have significant impact on users. What can we better to make etcd such operations easier? Among other things, leader election is most critical to cluster availability: Can we make membership reconfiguration less disruptive by not changing the size of quorum? Can a new node be idle, only requesting the minimum updates from leader, until it catches up? Can membership misconfiguration be always reversible and handled in a more secure way (wrong member add command run should never fail the cluster)? Should an user worry about network topology when adding a new member? Can member add API work regardless of the location of nodes and ongoing network partitions?
Raft Learner
============
In order to mitigate such availability gaps in the previous section, [Raft §4.2.1](https://github.com/ongardie/dissertation/blob/master/stanford.pdf) introduces a new node state “Learner”, which joins the cluster as a **non-voting member** until it catches up to leaders logs.
Features in v3.4
----------------
An operator should do the minimum amount of work possible to add a new learner node. `member add --learner` command to add a new learner, which joins cluster as a non-voting member but still receives all data from leader (see *Figure 10*).
![server-learner-figure-10](img/server-learner-figure-10.png)
When a learner has caught up with leaders progress, the learner can be promoted to a voting member using `member promote` API, which then counts towards the quorum (see *Figure 11*).
![server-learner-figure-11](img/server-learner-figure-11.png)
etcd server validates promote request to ensure its operational safety. Only after its log has caught up to leaders can learner be promoted to a voting member (see *Figure 12*).
![server-learner-figure-12](img/server-learner-figure-12.png)
Learner only serves as a standby node until promoted: Leadership cannot be transferred to learner. Learner rejects client reads and writes (client balancer should not route requests to learner). Which means learner does not need issue Read Index requests to leader. Such limitation simplifies the initial learner implementation in v3.4 release (see *Figure 13*).
![server-learner-figure-13](img/server-learner-figure-13.png)
In addition, etcd limits the total number of learners that a cluster can have, and avoids overloading the leader with log replication. Learner never promotes itself. While etcd provides learner status information and safety checks, cluster operator must make the final decision whether to promote learner or not.
Features in v3.5
----------------
*Make learner state only and default*: Defaulting a new member state to learner will greatly improve membership reconfiguration safety, because learner does not change the size of quorum. Misconfiguration will always be reversible without losing the quorum.
*Make voting-member promotion fully automatic*: Once a learner catches up to leaders logs, a cluster can automatically promote the learner. etcd requires certain thresholds to be defined by the user, and once the requirements are satisfied, learner promotes itself to a voting member. From a users perspective, “member add” command would work the same way as today but with greater safety provided by learner feature.
*Make learner standby failover node*: A learner joins as a standby node, and gets automatically promoted when the cluster availability is affected.
*Make learner read-only*: A learner can serve as a read-only node that never gets promoted. In a weak consistency mode, learner only receives data from leader and never process writes. Serving reads locally without consensus overhead would greatly decrease the workloads to leader but may serve stale data. In a strong consistency mode, learner requests read index from leader to serve latest data, but still rejects writes.
Learner vs. Mirror Maker
========================
etcd implements “mirror maker” using watch API to continuously relay key creates and updates to a separate cluster. Mirroring usually has low latency overhead once it completes initial synchronization. Learner and mirroring overlap in that both can be used to replicate existing data for read-only. However, mirroring does not guarantee linearizability. During network disconnects, previous key-values might have been discarded, and clients are expected to verify watch responses for correct ordering. Thus, there is no ordering guarantee in mirror. Use mirror for minimum latency (e.g. cross data center) at the costs of consistency. Use learner to retain all historical data and its ordering.
Appendix: Learner Implementation in v3.4
========================================
*Expose "Learner" node type to "MemberAdd" API.*
etcd client adds a flag to “MemberAdd” API for learner node. And etcd server handler applies membership change entry with `pb.ConfChangeAddLearnerNode` type. Once the command has been applied, a server joins the cluster with `etcd --initial-cluster-state=existing` flag. This learner node can neither vote nor count as quorum.
etcd server must not transfer leadership to learner, since it may still lag behind and does not count as quorum. etcd server limits the number of learners that cluster can have to one: the more learners we have, the more data the leader has to propagate. Clients may talk to learner node, but learner rejects all requests other than serializable read and member status API. This is for simplicity of initial implementation. In the future, learner can be extended as a read-only server that continuously mirrors cluster data. Client balancer must provide helper function to exclude learner node endpoint. Otherwise, request sent to learner may fail. Client sync member call should factor into learner node type. So should client endpoints update call.
`MemberList` and `MemberStatus` responses should indicate which node is learner.
*Add "MemberPromote" API.*
Internally in Raft, second `MemberAdd` call to learner node promotes it to a voting member. Leader maintains the progress of each follower and learner. If learner has not completed its snapshot message, reject promote request. Only accept promote request if and only if: The learner node is in a healthy state. The learner is in sync with leader or the delta is within the threshold (e.g. the number of entries to replicate to learner is less than 1/10 of snapshot count, which means it is less likely that even after promotion leader would not need send snapshot to the learner). All these logic are hard-coded in `etcdserver` package and not configurable.
Reference
=========
- Original github issue: [etcd#9161](https://github.com/etcd-io/etcd/issues/9161)
- Use case: [etcd#3715](https://github.com/etcd-io/etcd/issues/3715)
- Use case: [etcd#8888](https://github.com/etcd-io/etcd/issues/8888)
- Use case: [etcd#10114](https://github.com/etcd-io/etcd/issues/10114)

View File

@ -1,4 +1,6 @@
# Glossary
---
title: Glossary
---
This document defines the various terms used in etcd documentation, command line and source code.

Binary file not shown.

After

Width:  |  Height:  |  Size: 80 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 63 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 58 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 66 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 66 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 79 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 64 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 257 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 218 KiB

View File

Before

Width:  |  Height:  |  Size: 4.3 KiB

After

Width:  |  Height:  |  Size: 4.3 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 123 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 104 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 145 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 116 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 128 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 153 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 80 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 89 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 123 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 96 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 66 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 60 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 57 KiB

View File

@ -1,4 +1,6 @@
# etcd versus other key-value stores
---
title: etcd versus other key-value stores
---
The name "etcd" originated from two ideas, the unix "/etc" folder and "d"istributed systems. The "/etc" folder is a place to store configuration data for a single system whereas etcd stores configuration information for large scale distributed systems. Hence, a "d"istributed "/etc" is "etcd".
@ -76,18 +78,18 @@ In theory, its possible to build these primitives atop any storage systems pr
For distributed coordination, choosing etcd can help prevent operational headaches and save engineering effort.
[production-users]: ../production-users.md
[grpc]: http://www.grpc.io
[grpc]: https://www.grpc.io
[consul-bulletproof]: https://www.consul.io/docs/internals/sessions.html
[curator]: http://curator.apache.org/
[cockroach]: https://github.com/cockroachdb/cockroach
[spanner]: https://cloud.google.com/spanner/
[tidb]: https://github.com/pingcap/tidb
[etcd-v3lock]: https://godoc.org/github.com/coreos/etcd/etcdserver/api/v3lock/v3lockpb
[etcd-v3election]: https://godoc.org/github.com/coreos/etcd/etcdserver/api/v3election/v3electionpb
[etcd-v3lock]: https://godoc.org/github.com/etcd-io/etcd/etcdserver/api/v3lock/v3lockpb
[etcd-v3election]: https://godoc.org/github.com/coreos/etcd-io/etcdserver/api/v3election/v3electionpb
[etcd-etcdctl-lock]: ../../etcdctl/README.md#lock-lockname-command-arg1-arg2-
[etcd-etcdctl-elect]: ../../etcdctl/README.md#elect-options-election-name-proposal
[etcd-mvcc]: data_model.md
[etcd-recipe]: https://godoc.org/github.com/coreos/etcd/contrib/recipes
[etcd-recipe]: https://godoc.org/github.com/etcd-io/etcd/contrib/recipes
[consul-lock]: https://www.consul.io/docs/commands/lock.html
[newsql-leader]: http://dl.acm.org/citation.cfm?id=2960999
[etcd-reconfig]: ../op-guide/runtime-configuration.md
@ -112,5 +114,5 @@ For distributed coordination, choosing etcd can help prevent operational headach
[zk-bindings]: https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#ch_bindings
[container-linux]: https://coreos.com/why
[locksmith]: https://github.com/coreos/locksmith
[kubernetes]: http://kubernetes.io/docs/whatisk8s
[kubernetes]: https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/
[dbtester-comparison-results]: https://github.com/coreos/dbtester/tree/master/test-results/2018Q1-02-etcd-zookeeper-consul

View File

@ -1,4 +1,6 @@
# Metrics
---
title: Metrics
---
etcd uses [Prometheus][prometheus] for metrics reporting. The metrics can be used for real-time monitoring and debugging. etcd does not persist its metrics; if a member restarts, the metrics will be reset.
@ -99,7 +101,7 @@ Abnormally high snapshot duration (`snapshot_save_total_duration_seconds`) indic
## Prometheus supplied metrics
The Prometheus client library provides a number of metrics under the `go` and `process` namespaces. There are a few that are particlarly interesting.
The Prometheus client library provides a number of metrics under the `go` and `process` namespaces. There are a few that are particularly interesting.
| Name | Description | Type |
|-----------------------------------|--------------------------------------------|--------------|

View File

@ -465,7 +465,7 @@ etcd_network_client_grpc_sent_bytes_total
etcd_network_peer_received_bytes_total{From="REMOTE_PEER_NODE_ID"}
# name: "etcd_network_peer_round_trip_time_seconds"
# description: "Round-Trip-Time histogram between peers."
# description: "Round-Trip-Time histogram between peers"
# type: "histogram"
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="+Inf"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0001"}
@ -492,16 +492,37 @@ etcd_network_peer_round_trip_time_seconds_sum{To="REMOTE_PEER_NODE_ID"}
# type: "counter"
etcd_network_peer_sent_bytes_total{To="REMOTE_PEER_NODE_ID"}
# name: "etcd_server_go_version"
# description: "Which Go version server is running with. 1 for 'server_go_version' label with current version."
# type: "gauge"
etcd_server_go_version{server_go_version="go1.11.1"}
# name: "etcd_server_has_leader"
# description: "Whether or not a leader exists. 1 is existence, 0 is not."
# type: "gauge"
etcd_server_has_leader
# name: "etcd_server_health_failures"
# description: "The total number of failed health checks"
# type: "counter"
etcd_server_health_failures
# name: "etcd_server_health_success"
# description: "The total number of successful health checks"
# type: "counter"
etcd_server_health_success
# name: "etcd_server_heartbeat_send_failures_total"
# description: "The total number of leader heartbeat send failures (likely overloaded from slow disk)."
# type: "counter"
etcd_server_heartbeat_send_failures_total
# name: "etcd_server_id"
# description: "Server or member ID in hexadecimal format. 1 for 'server_id' label with current ID."
# type: "gauge"
etcd_server_id{server_id="221ed218fa16d1fe"}
etcd_server_id{server_id="3f1ca865609d428d"}
# name: "etcd_server_is_leader"
# description: "Whether or not this member is a leader. 1 if is, 0 otherwise."
# type: "gauge"
@ -512,6 +533,21 @@ etcd_server_is_leader
# type: "counter"
etcd_server_leader_changes_seen_total
# name: "etcd_server_is_learner"
# description: "Whether or not this member is a learner. 1 if is, 0 otherwise."
# type: "gauge"
etcd_server_is_learner
# name: "etcd_server_learner_promote_failures"
# description: "The total number of failed learner promotions (likely learner not ready) while this member is leader."
# type: "counter"
etcd_server_learner_promote_failures
# name: "etcd_server_learner_promote_successes"
# description: "The total number of successful learner promotions while this member is leader."
# type: "counter"
etcd_server_learner_promote_successes
# name: "etcd_server_proposals_applied_total"
# description: "The total number of consensus proposals applied."
# type: "gauge"
@ -537,6 +573,11 @@ etcd_server_proposals_pending
# type: "gauge"
etcd_server_quota_backend_bytes
# name: "etcd_server_read_indexes_failed_total"
# description: "The total number of failed read indexes seen."
# type: "counter"
etcd_server_read_indexes_failed_total
# name: "etcd_server_slow_apply_total"
# description: "The total number of slow apply requests (likely overloaded from slow disk)."
# type: "counter"
@ -552,6 +593,44 @@ etcd_server_slow_read_indexes_total
# type: "gauge"
etcd_server_version{server_version="3.3.0+git"}
# name: "etcd_snap_db_fsync_duration_seconds"
# description: "The latency distributions of fsyncing .snap.db file"
# type: "histogram"
etcd_snap_db_fsync_duration_seconds_bucket{le="0.001"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.002"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.004"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.008"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.016"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.032"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.064"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.128"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.256"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.512"}
etcd_snap_db_fsync_duration_seconds_bucket{le="1.024"}
etcd_snap_db_fsync_duration_seconds_bucket{le="2.048"}
etcd_snap_db_fsync_duration_seconds_bucket{le="4.096"}
etcd_snap_db_fsync_duration_seconds_bucket{le="8.192"}
etcd_snap_db_fsync_duration_seconds_bucket{le="+Inf"}
etcd_snap_db_fsync_duration_seconds_sum
etcd_snap_db_fsync_duration_seconds_count
# name: "etcd_snap_db_save_total_duration_seconds"
# description: "The total latency distributions of v3 snapshot save"
# type: "histogram"
etcd_snap_db_save_total_duration_seconds_bucket{le="0.1"}
etcd_snap_db_save_total_duration_seconds_bucket{le="0.2"}
etcd_snap_db_save_total_duration_seconds_bucket{le="0.4"}
etcd_snap_db_save_total_duration_seconds_bucket{le="0.8"}
etcd_snap_db_save_total_duration_seconds_bucket{le="1.6"}
etcd_snap_db_save_total_duration_seconds_bucket{le="3.2"}
etcd_snap_db_save_total_duration_seconds_bucket{le="6.4"}
etcd_snap_db_save_total_duration_seconds_bucket{le="12.8"}
etcd_snap_db_save_total_duration_seconds_bucket{le="25.6"}
etcd_snap_db_save_total_duration_seconds_bucket{le="51.2"}
etcd_snap_db_save_total_duration_seconds_bucket{le="+Inf"}
etcd_snap_db_save_total_duration_seconds_sum
etcd_snap_db_save_total_duration_seconds_count
# name: "etcd_snap_fsync_duration_seconds"
# description: "The latency distributions of fsync called by snap."
# type: "histogram"

View File

@ -0,0 +1,795 @@
# server version: etcd_server_version{server_version="3.1.20"}
# name: "etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds"
# description: "Bucketed histogram of db compaction pause duration."
# type: "histogram"
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="8"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="16"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="32"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="64"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="128"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="256"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="512"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1024"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2048"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4096"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="+Inf"}
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count
# name: "etcd_debugging_mvcc_db_compaction_total_duration_milliseconds"
# description: "Bucketed histogram of db compaction total duration."
# type: "histogram"
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="100"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="200"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="400"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="800"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="1600"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="3200"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="6400"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="12800"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="25600"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="51200"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="102400"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="204800"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="409600"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="819200"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket{le="+Inf"}
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count
# name: "etcd_debugging_mvcc_db_total_size_in_bytes"
# description: "Total size of the underlying database physically allocated in bytes. Use etcd_mvcc_db_total_size_in_bytes"
# type: "gauge"
etcd_debugging_mvcc_db_total_size_in_bytes
# name: "etcd_debugging_mvcc_delete_total"
# description: "Total number of deletes seen by this member."
# type: "counter"
etcd_debugging_mvcc_delete_total
# name: "etcd_debugging_mvcc_events_total"
# description: "Total number of events sent by this member."
# type: "counter"
etcd_debugging_mvcc_events_total
# name: "etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds"
# description: "Bucketed histogram of index compaction pause duration."
# type: "histogram"
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="0.5"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="2"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="4"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="8"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="16"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="32"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="64"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="128"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="256"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="512"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="1024"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket{le="+Inf"}
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count
# name: "etcd_debugging_mvcc_keys_total"
# description: "Total number of keys."
# type: "gauge"
etcd_debugging_mvcc_keys_total
# name: "etcd_debugging_mvcc_pending_events_total"
# description: "Total number of pending events to be sent."
# type: "gauge"
etcd_debugging_mvcc_pending_events_total
# name: "etcd_debugging_mvcc_put_total"
# description: "Total number of puts seen by this member."
# type: "counter"
etcd_debugging_mvcc_put_total
# name: "etcd_debugging_mvcc_range_total"
# description: "Total number of ranges seen by this member."
# type: "counter"
etcd_debugging_mvcc_range_total
# name: "etcd_debugging_mvcc_slow_watcher_total"
# description: "Total number of unsynced slow watchers."
# type: "gauge"
etcd_debugging_mvcc_slow_watcher_total
# name: "etcd_debugging_mvcc_txn_total"
# description: "Total number of txns seen by this member."
# type: "counter"
etcd_debugging_mvcc_txn_total
# name: "etcd_debugging_mvcc_watch_stream_total"
# description: "Total number of watch streams."
# type: "gauge"
etcd_debugging_mvcc_watch_stream_total
# name: "etcd_debugging_mvcc_watcher_total"
# description: "Total number of watchers."
# type: "gauge"
etcd_debugging_mvcc_watcher_total
# name: "etcd_debugging_snap_save_marshalling_duration_seconds"
# description: "The marshalling cost distributions of save called by snapshot."
# type: "histogram"
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.001"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.002"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.004"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.008"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.016"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.032"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.064"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.128"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.256"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="0.512"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="1.024"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="2.048"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="4.096"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="8.192"}
etcd_debugging_snap_save_marshalling_duration_seconds_bucket{le="+Inf"}
etcd_debugging_snap_save_marshalling_duration_seconds_sum
etcd_debugging_snap_save_marshalling_duration_seconds_count
# name: "etcd_debugging_snap_save_total_duration_seconds"
# description: "The total latency distributions of save called by snapshot."
# type: "histogram"
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.001"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.002"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.004"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.008"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.016"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.032"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.064"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.128"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.256"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="0.512"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="1.024"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="2.048"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="4.096"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="8.192"}
etcd_debugging_snap_save_total_duration_seconds_bucket{le="+Inf"}
etcd_debugging_snap_save_total_duration_seconds_sum
etcd_debugging_snap_save_total_duration_seconds_count
# name: "etcd_debugging_store_expires_total"
# description: "Total number of expired keys."
# type: "counter"
etcd_debugging_store_expires_total
# name: "etcd_debugging_store_reads_total"
# description: "Total number of reads action by (get/getRecursive), local to this member."
# type: "counter"
etcd_debugging_store_reads_total{action="getRecursive"}
# name: "etcd_debugging_store_watch_requests_total"
# description: "Total number of incoming watch requests (new or reestablished)."
# type: "counter"
etcd_debugging_store_watch_requests_total
# name: "etcd_debugging_store_watchers"
# description: "Count of currently active watchers."
# type: "gauge"
etcd_debugging_store_watchers
# name: "etcd_debugging_store_writes_total"
# description: "Total number of writes (e.g. set/compareAndDelete) seen by this member."
# type: "counter"
etcd_debugging_store_writes_total{action="create"}
etcd_debugging_store_writes_total{action="set"}
# name: "etcd_disk_backend_commit_duration_seconds"
# description: "The latency distributions of commit called by backend."
# type: "histogram"
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.064"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.128"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.256"}
etcd_disk_backend_commit_duration_seconds_bucket{le="0.512"}
etcd_disk_backend_commit_duration_seconds_bucket{le="1.024"}
etcd_disk_backend_commit_duration_seconds_bucket{le="2.048"}
etcd_disk_backend_commit_duration_seconds_bucket{le="4.096"}
etcd_disk_backend_commit_duration_seconds_bucket{le="8.192"}
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"}
etcd_disk_backend_commit_duration_seconds_sum
etcd_disk_backend_commit_duration_seconds_count
# name: "etcd_disk_backend_defrag_duration_seconds"
# description: "The latency distribution of backend defragmentation."
# type: "histogram"
etcd_disk_backend_defrag_duration_seconds_bucket{le="0.1"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="0.2"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="0.4"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="0.8"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="1.6"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="3.2"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="6.4"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="12.8"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="25.6"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="51.2"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="102.4"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="204.8"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="409.6"}
etcd_disk_backend_defrag_duration_seconds_bucket{le="+Inf"}
etcd_disk_backend_defrag_duration_seconds_sum
etcd_disk_backend_defrag_duration_seconds_count
# name: "etcd_disk_wal_fsync_duration_seconds"
# description: "The latency distributions of fsync called by wal."
# type: "histogram"
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.001"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.002"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.004"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.008"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.016"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.032"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.064"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.128"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.256"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="0.512"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="1.024"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="2.048"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="4.096"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="8.192"}
etcd_disk_wal_fsync_duration_seconds_bucket{le="+Inf"}
etcd_disk_wal_fsync_duration_seconds_sum
etcd_disk_wal_fsync_duration_seconds_count
# name: "etcd_grpc_proxy_cache_hits_total"
# description: "Total number of cache hits"
# type: "gauge"
etcd_grpc_proxy_cache_hits_total
# name: "etcd_grpc_proxy_cache_misses_total"
# description: "Total number of cache misses"
# type: "gauge"
etcd_grpc_proxy_cache_misses_total
# name: "etcd_grpc_proxy_events_coalescing_total"
# description: "Total number of events coalescing"
# type: "counter"
etcd_grpc_proxy_events_coalescing_total
# name: "etcd_grpc_proxy_watchers_coalescing_total"
# description: "Total number of current watchers coalescing"
# type: "gauge"
etcd_grpc_proxy_watchers_coalescing_total
# name: "etcd_mvcc_db_total_size_in_bytes"
# description: "Total size of the underlying database physically allocated in bytes."
# type: "gauge"
etcd_mvcc_db_total_size_in_bytes
# name: "etcd_mvcc_db_total_size_in_use_in_bytes"
# description: "Total size of the underlying database logically in use in bytes."
# type: "gauge"
etcd_mvcc_db_total_size_in_use_in_bytes
# name: "etcd_mvcc_hash_duration_seconds"
# description: "The latency distribution of storage hash operation."
# type: "histogram"
etcd_mvcc_hash_duration_seconds_bucket{le="0.01"}
etcd_mvcc_hash_duration_seconds_bucket{le="0.02"}
etcd_mvcc_hash_duration_seconds_bucket{le="0.04"}
etcd_mvcc_hash_duration_seconds_bucket{le="0.08"}
etcd_mvcc_hash_duration_seconds_bucket{le="0.16"}
etcd_mvcc_hash_duration_seconds_bucket{le="0.32"}
etcd_mvcc_hash_duration_seconds_bucket{le="0.64"}
etcd_mvcc_hash_duration_seconds_bucket{le="1.28"}
etcd_mvcc_hash_duration_seconds_bucket{le="2.56"}
etcd_mvcc_hash_duration_seconds_bucket{le="5.12"}
etcd_mvcc_hash_duration_seconds_bucket{le="10.24"}
etcd_mvcc_hash_duration_seconds_bucket{le="20.48"}
etcd_mvcc_hash_duration_seconds_bucket{le="40.96"}
etcd_mvcc_hash_duration_seconds_bucket{le="81.92"}
etcd_mvcc_hash_duration_seconds_bucket{le="163.84"}
etcd_mvcc_hash_duration_seconds_bucket{le="+Inf"}
etcd_mvcc_hash_duration_seconds_sum
etcd_mvcc_hash_duration_seconds_count
# name: "etcd_network_client_grpc_received_bytes_total"
# description: "The total number of bytes received from grpc clients."
# type: "counter"
etcd_network_client_grpc_received_bytes_total
# name: "etcd_network_client_grpc_sent_bytes_total"
# description: "The total number of bytes sent to grpc clients."
# type: "counter"
etcd_network_client_grpc_sent_bytes_total
# name: "etcd_network_peer_received_bytes_total"
# description: "The total number of bytes received from peers."
# type: "counter"
etcd_network_peer_received_bytes_total{From="REMOTE_PEER_NODE_ID"}
# name: "etcd_network_peer_round_trip_time_seconds"
# description: "Round-Trip-Time histogram between peers."
# type: "histogram"
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="+Inf"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0001"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0002"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0004"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0008"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0016"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0032"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0064"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0128"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0256"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.0512"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.1024"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.2048"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.4096"}
etcd_network_peer_round_trip_time_seconds_bucket{To="REMOTE_PEER_NODE_ID",le="0.8192"}
etcd_network_peer_round_trip_time_seconds_count{To="REMOTE_PEER_NODE_ID"}
etcd_network_peer_round_trip_time_seconds_sum{To="REMOTE_PEER_NODE_ID"}
# name: "etcd_network_peer_sent_bytes_total"
# description: "The total number of bytes sent to peers."
# type: "counter"
etcd_network_peer_sent_bytes_total{To="REMOTE_PEER_NODE_ID"}
# name: "etcd_server_go_version"
# description: "Which Go version server is running with. 1 for 'server_go_version' label with current version."
# type: "gauge"
etcd_server_go_version{server_go_version="go1.8.7"}
# name: "etcd_server_has_leader"
# description: "Whether or not a leader exists. 1 is existence, 0 is not."
# type: "gauge"
etcd_server_has_leader
# name: "etcd_server_health_failures"
# description: "The total number of failed health checks"
# type: "counter"
etcd_server_health_failures
# name: "etcd_server_health_success"
# description: "The total number of successful health checks"
# type: "counter"
etcd_server_health_success
# name: "etcd_server_heartbeat_send_failures_total"
# description: "The total number of leader heartbeat send failures (likely overloaded from slow disk)."
# type: "counter"
etcd_server_heartbeat_send_failures_total
# name: "etcd_server_id"
# description: "Server or member ID in hexadecimal format. 1 for 'server_id' label with current ID."
# type: "gauge"
etcd_server_id{server_id="7339c4e5e833c029"}
# name: "etcd_server_is_leader"
# description: "Whether or not this member is a leader. 1 if is, 0 otherwise."
# type: "gauge"
etcd_server_is_leader
# name: "etcd_server_leader_changes_seen_total"
# description: "The number of leader changes seen."
# type: "counter"
etcd_server_leader_changes_seen_total
# name: "etcd_server_proposals_applied_total"
# description: "The total number of consensus proposals applied."
# type: "gauge"
etcd_server_proposals_applied_total
# name: "etcd_server_proposals_committed_total"
# description: "The total number of consensus proposals committed."
# type: "gauge"
etcd_server_proposals_committed_total
# name: "etcd_server_proposals_failed_total"
# description: "The total number of failed proposals seen."
# type: "counter"
etcd_server_proposals_failed_total
# name: "etcd_server_proposals_pending"
# description: "The current number of pending proposals to commit."
# type: "gauge"
etcd_server_proposals_pending
# name: "etcd_server_quota_backend_bytes"
# description: "Current backend storage quota size in bytes."
# type: "gauge"
etcd_server_quota_backend_bytes
# name: "etcd_server_read_indexes_failed_total"
# description: "The total number of failed read indexes seen."
# type: "counter"
etcd_server_read_indexes_failed_total
# name: "etcd_server_slow_apply_total"
# description: "The total number of slow apply requests (likely overloaded from slow disk)."
# type: "counter"
etcd_server_slow_apply_total
# name: "etcd_server_slow_read_indexes_total"
# description: "The total number of pending read indexes not in sync with leader's or timed out read index requests."
# type: "counter"
etcd_server_slow_read_indexes_total
# name: "etcd_server_version"
# description: "Which version is running. 1 for 'server_version' label with current version."
# type: "gauge"
etcd_server_version{server_version="3.1.20"}
# name: "etcd_snap_db_fsync_duration_seconds"
# description: "The latency distributions of fsyncing .snap.db file"
# type: "histogram"
etcd_snap_db_fsync_duration_seconds_bucket{le="0.001"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.002"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.004"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.008"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.016"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.032"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.064"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.128"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.256"}
etcd_snap_db_fsync_duration_seconds_bucket{le="0.512"}
etcd_snap_db_fsync_duration_seconds_bucket{le="1.024"}
etcd_snap_db_fsync_duration_seconds_bucket{le="2.048"}
etcd_snap_db_fsync_duration_seconds_bucket{le="4.096"}
etcd_snap_db_fsync_duration_seconds_bucket{le="8.192"}
etcd_snap_db_fsync_duration_seconds_bucket{le="+Inf"}
etcd_snap_db_fsync_duration_seconds_sum
etcd_snap_db_fsync_duration_seconds_count
# name: "etcd_snap_db_save_total_duration_seconds"
# description: "The total latency distributions of v3 snapshot save"
# type: "histogram"
etcd_snap_db_save_total_duration_seconds_bucket{le="0.1"}
etcd_snap_db_save_total_duration_seconds_bucket{le="0.2"}
etcd_snap_db_save_total_duration_seconds_bucket{le="0.4"}
etcd_snap_db_save_total_duration_seconds_bucket{le="0.8"}
etcd_snap_db_save_total_duration_seconds_bucket{le="1.6"}
etcd_snap_db_save_total_duration_seconds_bucket{le="3.2"}
etcd_snap_db_save_total_duration_seconds_bucket{le="6.4"}
etcd_snap_db_save_total_duration_seconds_bucket{le="12.8"}
etcd_snap_db_save_total_duration_seconds_bucket{le="25.6"}
etcd_snap_db_save_total_duration_seconds_bucket{le="51.2"}
etcd_snap_db_save_total_duration_seconds_bucket{le="+Inf"}
etcd_snap_db_save_total_duration_seconds_sum
etcd_snap_db_save_total_duration_seconds_count
# name: "go_gc_duration_seconds"
# description: "A summary of the GC invocation durations."
# type: "summary"
go_gc_duration_seconds{quantile="0"}
go_gc_duration_seconds{quantile="0.25"}
go_gc_duration_seconds{quantile="0.5"}
go_gc_duration_seconds{quantile="0.75"}
go_gc_duration_seconds{quantile="1"}
go_gc_duration_seconds_sum
go_gc_duration_seconds_count
# name: "go_goroutines"
# description: "Number of goroutines that currently exist."
# type: "gauge"
go_goroutines
# name: "go_memstats_alloc_bytes"
# description: "Number of bytes allocated and still in use."
# type: "gauge"
go_memstats_alloc_bytes
# name: "go_memstats_alloc_bytes_total"
# description: "Total number of bytes allocated, even if freed."
# type: "counter"
go_memstats_alloc_bytes_total
# name: "go_memstats_buck_hash_sys_bytes"
# description: "Number of bytes used by the profiling bucket hash table."
# type: "gauge"
go_memstats_buck_hash_sys_bytes
# name: "go_memstats_frees_total"
# description: "Total number of frees."
# type: "counter"
go_memstats_frees_total
# name: "go_memstats_gc_sys_bytes"
# description: "Number of bytes used for garbage collection system metadata."
# type: "gauge"
go_memstats_gc_sys_bytes
# name: "go_memstats_heap_alloc_bytes"
# description: "Number of heap bytes allocated and still in use."
# type: "gauge"
go_memstats_heap_alloc_bytes
# name: "go_memstats_heap_idle_bytes"
# description: "Number of heap bytes waiting to be used."
# type: "gauge"
go_memstats_heap_idle_bytes
# name: "go_memstats_heap_inuse_bytes"
# description: "Number of heap bytes that are in use."
# type: "gauge"
go_memstats_heap_inuse_bytes
# name: "go_memstats_heap_objects"
# description: "Number of allocated objects."
# type: "gauge"
go_memstats_heap_objects
# name: "go_memstats_heap_released_bytes_total"
# description: "Total number of heap bytes released to OS."
# type: "counter"
go_memstats_heap_released_bytes_total
# name: "go_memstats_heap_sys_bytes"
# description: "Number of heap bytes obtained from system."
# type: "gauge"
go_memstats_heap_sys_bytes
# name: "go_memstats_last_gc_time_seconds"
# description: "Number of seconds since 1970 of last garbage collection."
# type: "gauge"
go_memstats_last_gc_time_seconds
# name: "go_memstats_lookups_total"
# description: "Total number of pointer lookups."
# type: "counter"
go_memstats_lookups_total
# name: "go_memstats_mallocs_total"
# description: "Total number of mallocs."
# type: "counter"
go_memstats_mallocs_total
# name: "go_memstats_mcache_inuse_bytes"
# description: "Number of bytes in use by mcache structures."
# type: "gauge"
go_memstats_mcache_inuse_bytes
# name: "go_memstats_mcache_sys_bytes"
# description: "Number of bytes used for mcache structures obtained from system."
# type: "gauge"
go_memstats_mcache_sys_bytes
# name: "go_memstats_mspan_inuse_bytes"
# description: "Number of bytes in use by mspan structures."
# type: "gauge"
go_memstats_mspan_inuse_bytes
# name: "go_memstats_mspan_sys_bytes"
# description: "Number of bytes used for mspan structures obtained from system."
# type: "gauge"
go_memstats_mspan_sys_bytes
# name: "go_memstats_next_gc_bytes"
# description: "Number of heap bytes when next garbage collection will take place."
# type: "gauge"
go_memstats_next_gc_bytes
# name: "go_memstats_other_sys_bytes"
# description: "Number of bytes used for other system allocations."
# type: "gauge"
go_memstats_other_sys_bytes
# name: "go_memstats_stack_inuse_bytes"
# description: "Number of bytes in use by the stack allocator."
# type: "gauge"
go_memstats_stack_inuse_bytes
# name: "go_memstats_stack_sys_bytes"
# description: "Number of bytes obtained from system for stack allocator."
# type: "gauge"
go_memstats_stack_sys_bytes
# name: "go_memstats_sys_bytes"
# description: "Number of bytes obtained by system. Sum of all system allocations."
# type: "gauge"
go_memstats_sys_bytes
# name: "grpc_server_handled_total"
# description: "Total number of RPCs completed on the server, regardless of success or failure."
# type: "counter"
# gRPC codes:
# - "Aborted"
# - "AlreadyExists"
# - "Canceled"
# - "DataLoss"
# - "DeadlineExceeded"
# - "FailedPrecondition"
# - "Internal"
# - "InvalidArgument"
# - "NotFound"
# - "OK"
# - "OutOfRange"
# - "PermissionDenied"
# - "ResourceExhausted"
# - "Unauthenticated"
# - "Unavailable"
# - "Unimplemented"
# - "Unknown"
grpc_server_handled_total{grpc_code="CODE",grpc_method="Alarm",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="AuthDisable",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="AuthEnable",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Authenticate",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Compact",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Defragment",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="DeleteRange",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Hash",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="LeaseGrant",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="LeaseKeepAlive",grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="LeaseRevoke",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="LeaseTimeToLive",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="MemberAdd",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="MemberList",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="MemberRemove",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="MemberUpdate",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Put",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Range",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="RoleAdd",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="RoleDelete",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="RoleGet",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="RoleGrantPermission",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="RoleList",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="RoleRevokePermission",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Snapshot",grpc_service="etcdserverpb.Maintenance",grpc_type="server_stream"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Status",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Txn",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="UserAdd",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="UserChangePassword",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="UserDelete",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="UserGet",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="UserGrantRole",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="UserList",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="UserRevokeRole",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_handled_total{grpc_code="CODE",grpc_method="Watch",grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}
# name: "grpc_server_msg_received_total"
# description: "Total number of RPC stream messages received on the server."
# type: "counter"
grpc_server_msg_received_total{grpc_method="Alarm",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="AuthDisable",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="AuthEnable",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Authenticate",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Compact",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Defragment",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="DeleteRange",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Hash",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="LeaseGrant",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="LeaseKeepAlive",grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}
grpc_server_msg_received_total{grpc_method="LeaseRevoke",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="LeaseTimeToLive",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="MemberAdd",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="MemberList",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="MemberRemove",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="MemberUpdate",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Put",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Range",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="RoleAdd",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="RoleDelete",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="RoleGet",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="RoleGrantPermission",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="RoleList",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="RoleRevokePermission",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Snapshot",grpc_service="etcdserverpb.Maintenance",grpc_type="server_stream"}
grpc_server_msg_received_total{grpc_method="Status",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Txn",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="UserAdd",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="UserChangePassword",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="UserDelete",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="UserGet",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="UserGrantRole",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="UserList",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="UserRevokeRole",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_received_total{grpc_method="Watch",grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}
# name: "grpc_server_msg_sent_total"
# description: "Total number of gRPC stream messages sent by the server."
# type: "counter"
grpc_server_msg_sent_total{grpc_method="Alarm",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="AuthDisable",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="AuthEnable",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Authenticate",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Compact",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Defragment",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="DeleteRange",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Hash",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="LeaseGrant",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="LeaseKeepAlive",grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}
grpc_server_msg_sent_total{grpc_method="LeaseRevoke",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="LeaseTimeToLive",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="MemberAdd",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="MemberList",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="MemberRemove",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="MemberUpdate",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Put",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Range",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="RoleAdd",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="RoleDelete",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="RoleGet",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="RoleGrantPermission",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="RoleList",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="RoleRevokePermission",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Snapshot",grpc_service="etcdserverpb.Maintenance",grpc_type="server_stream"}
grpc_server_msg_sent_total{grpc_method="Status",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Txn",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="UserAdd",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="UserChangePassword",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="UserDelete",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="UserGet",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="UserGrantRole",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="UserList",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="UserRevokeRole",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_msg_sent_total{grpc_method="Watch",grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}
# name: "grpc_server_started_total"
# description: "Total number of RPCs started on the server."
# type: "counter"
grpc_server_started_total{grpc_method="Alarm",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_started_total{grpc_method="AuthDisable",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="AuthEnable",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="Authenticate",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="Compact",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_started_total{grpc_method="Defragment",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_started_total{grpc_method="DeleteRange",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_started_total{grpc_method="Hash",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_started_total{grpc_method="LeaseGrant",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_started_total{grpc_method="LeaseKeepAlive",grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}
grpc_server_started_total{grpc_method="LeaseRevoke",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_started_total{grpc_method="LeaseTimeToLive",grpc_service="etcdserverpb.Lease",grpc_type="unary"}
grpc_server_started_total{grpc_method="MemberAdd",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_started_total{grpc_method="MemberList",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_started_total{grpc_method="MemberRemove",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_started_total{grpc_method="MemberUpdate",grpc_service="etcdserverpb.Cluster",grpc_type="unary"}
grpc_server_started_total{grpc_method="Put",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_started_total{grpc_method="Range",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_started_total{grpc_method="RoleAdd",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="RoleDelete",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="RoleGet",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="RoleGrantPermission",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="RoleList",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="RoleRevokePermission",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="Snapshot",grpc_service="etcdserverpb.Maintenance",grpc_type="server_stream"}
grpc_server_started_total{grpc_method="Status",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"}
grpc_server_started_total{grpc_method="Txn",grpc_service="etcdserverpb.KV",grpc_type="unary"}
grpc_server_started_total{grpc_method="UserAdd",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="UserChangePassword",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="UserDelete",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="UserGet",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="UserGrantRole",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="UserList",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="UserRevokeRole",grpc_service="etcdserverpb.Auth",grpc_type="unary"}
grpc_server_started_total{grpc_method="Watch",grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}
# name: "http_request_duration_microseconds"
# description: "The HTTP request latencies in microseconds."
# type: "summary"
http_request_duration_microseconds{handler="prometheus",quantile="0.5"}
http_request_duration_microseconds{handler="prometheus",quantile="0.9"}
http_request_duration_microseconds{handler="prometheus",quantile="0.99"}
http_request_duration_microseconds_sum{handler="prometheus"}
http_request_duration_microseconds_count{handler="prometheus"}
# name: "http_request_size_bytes"
# description: "The HTTP request sizes in bytes."
# type: "summary"
http_request_size_bytes{handler="prometheus",quantile="0.5"}
http_request_size_bytes{handler="prometheus",quantile="0.9"}
http_request_size_bytes{handler="prometheus",quantile="0.99"}
http_request_size_bytes_sum{handler="prometheus"}
http_request_size_bytes_count{handler="prometheus"}
# name: "http_response_size_bytes"
# description: "The HTTP response sizes in bytes."
# type: "summary"
http_response_size_bytes{handler="prometheus",quantile="0.5"}
http_response_size_bytes{handler="prometheus",quantile="0.9"}
http_response_size_bytes{handler="prometheus",quantile="0.99"}
http_response_size_bytes_sum{handler="prometheus"}
http_response_size_bytes_count{handler="prometheus"}

Some files were not shown because too many files have changed in this diff Show More