Compare commits

...

162 Commits

Author SHA1 Message Date
2373ddb445 version: bump up to 3.1.14 2018-04-24 13:23:45 -07:00
5da3a7232f Merge pull request #9606 from jpbetz/automated-cherry-pick-of-#9587-release-3.1
Automated cherry pick of #9587
2018-04-23 12:48:23 -07:00
3865d69db3 etcdserver: add is_leader prometheus metric that is 1 on the leader.
Before this change, we had now way to find a leader using /metrics
endpoint. This commit adds a metric to do that.
2018-04-23 11:10:12 -07:00
c764878701 etcdmain: fix "InitialElectionTickAdvance"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-23 11:09:07 -07:00
e66af56730 etcdserver: log skipping initial election tick
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-23 11:07:51 -07:00
097a653945 etcdmain: add "--initial-election-tick-advance"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-23 11:07:36 -07:00
d2673cec81 embed: add "InitialElectionTickAdvance"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-23 11:06:36 -07:00
4e63906c33 integration: set InitialElectionTickAdvance to true by default
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-23 11:05:15 -07:00
0c0bf3f1a5 etcdserver: add "InitialElectionTickAdvance"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-23 11:03:16 -07:00
16487395e1 test: simplify CI tests
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-12 19:05:34 -07:00
c6ae68d3f7 travis.yml: update, remove go tip tests
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-12 11:15:37 -07:00
3b6bd6eea6 tests: move Semaphore script
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-09 11:36:23 -07:00
b5ae9b6879 version: bump up to 3.1.13+git 2018-03-29 10:56:52 -07:00
1558170293 version: bump up to 3.1.13 2018-03-29 10:28:55 -07:00
c3a14a2b28 semaphore: run release test with v3.1.12
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-29 09:23:15 -07:00
6f75c56c5e etcdserver: Manually backport etcdserver/raft.go tickMu fix to 3.1 2018-03-28 12:40:07 -07:00
908c0f4f98 rafthttp: add missing "peer_sent_failures_total" metrics call
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 12:39:59 -07:00
35c6ea7a67 Documentation/upgrades: backport all upgrade guides
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 12:39:59 -07:00
8eeab582d0 etcdserver: adjust election ticks on restart
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 10:17:30 -07:00
c536205249 etcdserver: make "advanceTicks" method
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 10:05:02 -07:00
2e57d99a2c rafthttp: add "ActivePeers" to "Transport"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-28 10:02:13 -07:00
2fdc4aa06c version: bump up to 3.1.12+git 2018-03-08 14:17:36 -08:00
918698add7 version: bump up to 3.1.12 2018-03-08 13:01:30 -08:00
00d14cfd03 test: fix etcd-tester flags
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 08:24:04 -08:00
6e11a79fd8 *: remove unused env vars
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 01:36:54 -08:00
028f99b103 test: fix "functional-tester" build script
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 01:29:57 -08:00
290fa0c1be *: sync "functional-tester" from 3.2 branch
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 00:57:30 -08:00
0874fcbed4 test: run "functional" tests in 3.1 branch
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-08 00:26:52 -08:00
7a148fee36 Merge pull request #9401 from jpbetz/automated-cherry-pick-of-#9297-origin-release-3.1
Automated cherry pick of #9297
2018-03-07 22:03:14 -08:00
087b9aa3dc mvcc: fix watchable store test for 3.2 cherrypick of #9281 2018-03-07 21:20:32 -08:00
b6373f1625 mvcc: restore unsynced watchers
In case syncWatchersLoop() starts before Restore() is called,
watchers already added by that moment are moved to s.synced by the loop.
However, there is a broken logic that moves watchers from s.synced
to s.uncyned without setting keyWatchers of the watcherGroup.
Eventually syncWatchers() fails to pickup those watchers from s.unsynced
and no events are sent to the watchers, because newWatcherBatch() called
in the function uses wg.watcherSetByKey() internally that requires
a proper keyWatchers value.
2018-03-07 21:20:21 -08:00
4178b75411 hack/scripts-dev: fix indentation in run.sh
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-07 14:31:25 -08:00
9c8e39e7f4 hack/scripts-dev: sync with master
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-07 14:25:10 -08:00
af3021aa1a semaphore: update Go version, release test version
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-07 14:24:57 -08:00
df0b652d6a travis: use docker, sync with master
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-07 14:24:04 -08:00
8e5d62cf1e Merge pull request #8933 from gyuho/release-doc
release-3.1: scripts/build-docker: build both gcr.io and quay.io images
2017-11-30 12:49:51 -08:00
212d801294 scripts/build-docker: build both gcr.io and quay.io images
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-28 15:03:38 -08:00
c2d8d9fd26 version: bump up to 3.1.11+git 2017-11-28 14:43:37 -08:00
960f4604bc version: bump up to v3.1.11 2017-11-28 10:48:24 -08:00
22b67da920 Merge pull request #8902 from jpbetz/automated-cherry-pick-of-#8813-release-3.1
Automated cherry pick of #8813 release 3.1
2017-11-22 11:21:13 -08:00
4b53ab0909 test: Clean agent directories on disk before functional test runs, not after
This is primarily so CI tooling can capture the agent logs after the functional tester runs.
2017-11-21 11:41:57 -08:00
b32ec69f9b vendor: Switch from boltdb v1.3.0 to coreos/bbolt v1.3.1-coreos.3 2017-11-21 11:34:45 -08:00
3ab9894b04 Merge pull request #8806 from jpbetz/automated-cherry-pick-of-#8427-origin-release-3.1-1509570173
Automated cherry pick of #8427 to release 3.1 branch
2017-11-09 18:07:40 -08:00
00d6d4aba7 e2e: test against latest v3.1.x release
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-08 15:11:34 -08:00
88b5e22b73 semaphore: manually pin v3.1.10 for release upgrade test
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-08 12:46:15 -08:00
2bb8278fbf hack/scripts-dev: add Makefile, Dockerfile-test
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-06 14:13:51 -08:00
935c76b8c3 semaphore.sh: fail tests with "(--- FAIL:|leak)"
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-03 10:59:23 -07:00
e83f50ec7c mvcc: sending events after restore
Fixes: #8411
2017-11-02 17:15:35 -07:00
232a81d804 client/integration: use only digits in unix port
Fix https://github.com/coreos/etcd/issues/7558.

Same as https://github.com/coreos/etcd/issues/6959.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-02 13:56:55 -07:00
6ffc32cd06 semaphore.sh: add to release-3.1 branch 2017-11-02 13:56:48 -07:00
f8aeb21c2d version: bump up to v3.1.10+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-11-02 13:56:16 -07:00
0520cb9304 version: bump up to 3.1.10
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-14 09:37:27 -07:00
424e4ae1cc fixtures: add gencerts.sh, generate CRL 2017-07-14 09:37:15 -07:00
a631a80a39 travis.yml: test with Go 1.8.3
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-14 08:51:24 -07:00
fc08fd75ee version: bump up to 3.1.9+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-14 08:51:13 -07:00
0f4a535c2f version: bump up to 3.1.9
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-09 10:44:38 -07:00
c765bef483 rafthttp: permit very large v2 snapshots
v2 snapshots were hitting the 512MB message decode limit, causing
sending snapshots to new members to fail for being too big.
2017-06-09 10:44:07 -07:00
5586a5806e version: bump up to 3.1.8+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-09 10:43:34 -07:00
d267ca9c18 version: bump up to 3.1.8
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-05 12:25:40 -07:00
4176fe768f *: fix other broken links in markdown
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 22:27:58 -07:00
950c846144 Documentation/v3: fix broken links
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 22:27:51 -07:00
0b78d66abe Documentation/v2: fix broken links
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 16:59:23 -07:00
2d58079626 integration: close accepted connection on stopc path
Connection pausing added another exit condition in the listener
path, causing the bridge to leak connections instead of closing
them when signalled to close. Also adds some additional Close
paranoia.

Fixes #7823
2017-05-03 14:51:47 -07:00
be171fa424 etcdserver: apply() sets consistIndex for any entry type
previously, apply() doesn't set consistIndex for EntryConfChange type.
this causes a misalignment between consistIndex and applied index
where EntryConfChange entry results setting applied index but not consistIndex.

suppose that addMember() is called and leader reflects that change.
1. applied index and consistIndex is now misaligned.
2. a new follower node joined.
3. leader sends the snapshot to follower
	where the applied index is the snapshot metadata index.
4. follower node saves the snapshot and database(includes consistIndex) from leader.
5. restarting follower loads snapshot and database.
6. follower checks snapshot metadata index(same as applied index) and database consistIndex,
	finds them don't match, and then panic.

FIXES #7834
2017-05-03 09:22:48 -07:00
4b60243fc5 etcd-2-1-0-bench: Fix an absolute bare link to resource outside of Documentation dir 2017-05-03 08:32:23 -07:00
2c5d79f49f Docs: replace absolute links with relative ones. 2017-05-03 08:32:15 -07:00
424abca6ac version: bump up to 3.1.7+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-05-03 08:31:53 -07:00
43b75072bf version: bump up to 3.1.7
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-27 08:25:50 -07:00
78141fae60 clientv3: set current revision to create rev regardless of CreateNotify
Turns out the optimization to ignore setting the init rev for
current revision watches breaks some ordering assumptions. Since
Watch only returns a channel once it gets a response, it should
bind the revision at the time of the first create response.

Was causing TestWatchReconnInit to fail.
2017-04-25 10:54:39 -07:00
3be37f042e integration: add pause/unpause to client bridge
Resetting connections sometimes isn't enough; need to stop/resume
accepting connections for some tests while keeping the member up.
2017-04-25 10:54:15 -07:00
7c896098d2 clientv3/integration: test watch resume with disconnect before first event 2017-04-25 10:53:58 -07:00
30f4e36de4 clientv3: only update initReq.rev == 0 with creation watch revision
Always updating the initReq.rev on watch create will resume from the wrong
revision if initReq is ever nonzero.
2017-04-25 10:53:37 -07:00
557abbe437 ctlv3: use printer for lease command results
Fixes #7783
2017-04-20 10:39:36 -07:00
4b448c209b version: bump up to 3.1.6+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-20 10:39:18 -07:00
e5b7ee2d03 version: bump up to 3.1.6
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-19 08:28:06 -07:00
a4c5731c38 ctlv3: keep lease as integer in fields printer
Output was giving %!d(string=) instead of the expected lease ID
value.
2017-04-19 08:27:52 -07:00
1f558ae678 integration: test auth API response header revision
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 20:06:14 -07:00
df93627bbb etcdserver: fill-in Auth API Header in apply layer
Replacing "etcdserver: fill a response header in auth RPCs"
The revision should be set at the time of "apply",
not in later RPC layer.

Fix https://github.com/coreos/etcd/issues/7691

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 20:06:08 -07:00
a20295c65b auth: fix race on stopping simple token keeper
run goroutine was resetting a field for no reason and without holding a lock.
This patch cleans up the run goroutine management to make the start/stop path
less racey in general.
2017-04-14 16:52:25 -07:00
9f7bb0df3a etcdserver: let Status() not require authentication
The information that can be obtained with the RPC doesn't need to be
protected.

Fix https://github.com/coreos/etcd/issues/7721
2017-04-13 15:56:26 -07:00
6a805e5222 test: do not run extra static checks on release branch
Things are usually already fixed in master branch
but not worth backporting.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 14:44:22 -07:00
38f79fa565 clientv3: fix gofmt warnings
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-13 14:44:22 -07:00
37a502cc88 integration: test requests with valid auth token but disabled auth
etcd was crashing since auth was assuming a token implies auth is enabled.
2017-04-13 14:44:22 -07:00
9be7fc5320 auth: protect simpleToken with single mutex and check if enabled
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.

Fixes #7722
2017-04-13 14:44:16 -07:00
288bccd288 pkg/transport: remove port in Certificate.IPAddresses
etcd passes 'url.URL.Host' to 'SelfCert' which contains
client, peer port. 'net.ParseIP("127.0.0.1:2379")' returns
'nil', and the client on this self-cert will see errors
of '127.0.0.1 because it doesn't contain any IP SANs'

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-05 04:30:26 -07:00
8cb5b48f58 clientv3: test dial timeout is respected when using auth 2017-04-04 14:14:23 -07:00
6538217528 clientv3: respect dial timeout when authenticating
Fixes #7627
2017-04-04 14:12:32 -07:00
e983d6b343 version: bump up to 3.1.5+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-04 14:10:15 -07:00
20490caaf0 version: bump up to 3.1.5
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-27 10:20:28 -07:00
e156746959 raft: use rs.req.Entries[0].Data as the key for deletion in advance()
advance() should use rs.req.Entries[0].Data as the context instead of
req.Context for deletion. Since req.Context is never set, there won't be
any context being deleted from pendingReadIndex; results mem leak.

FIXES #7571
2017-03-24 15:51:39 -07:00
d84bf983cc Dockerfile-release: add nsswitch.conf into image
The file '/etc/nsswitch.conf' is created in order to
take in account '/etc/hosts' entries while resolving
domain names.
2017-03-23 15:20:49 -07:00
b44c6bff9d clientv3: use waitgroup to wait for substream goroutine teardown
When a grpc watch stream is torn down, it will join on its logical substream
goroutines by waiting for each to close a channel. This doesn't guarantee
the substream is fully exited, though, but only about to exit and can be
waiting to resume even after Watch.Close finishes. Instead, use a
waitgroup.Done at the very end of the substream defer.

Fixes #7573
2017-03-23 12:26:32 -07:00
8c3c1b4a9c *: use filepath.Join for files 2017-03-23 09:53:56 -07:00
b478387a59 wal: use path/filepath instead of path
Use the path/filepath package instead of the path package. The
path package assumes slash-separated paths, which doesn't work
on Windows. But path/filepath manipulates filename paths in a way
that's compatible across OSes.
2017-03-23 09:50:41 -07:00
dfc1f21f9d version: bump to 3.1.4+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-23 09:49:51 -07:00
41e52ebc22 version: bump to 3.1.4
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-22 09:46:23 -07:00
7bb538d4d4 backend: add FillPercent option 2017-03-21 12:12:32 -07:00
1622782e49 integration: ensure 'StopNotify' on publish error
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-21 12:12:13 -07:00
99b47e0c1e etcdmain: handle StopNotify when ErrStopped aborted publish
Fix https://github.com/coreos/etcd/issues/7512.

If a server starts and aborts due to config error,
it is possible to get stuck in ReadyNotify waits.
This adds select case to get notified on stop channel.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-21 12:10:36 -07:00
350d0cd211 ctlv3: have "protobuf" in output help string instead of "proto"
Fixes #7538
2017-03-20 12:40:25 -07:00
72f37ff79a embed: Clear default initial cluster
NewConfig() should sets initial cluster from name but we should clear it
in the event that another discovery option has been specified.

Fixes #7516
2017-03-18 07:56:18 -07:00
3221454cab etcdserver: remove possibly compacted entry look-up
Fix https://github.com/coreos/etcd/issues/7470.

This patch removes unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger panic
if raft entry at etcdProgress.appliedi got compacted
by subsequent 'MsgSnap' messages--if a follower is
being (in this case, network latency spikes) slow, it
could receive subsequent 'MsgSnap' requests from leader.

etcd server-side 'applyAll' routine and raft's Ready
processing routine becomes asynchronous after raft
entries are persisted. And given that raft Ready routine
takes less time to finish, it is possible that second
'MsgSnap' is being handled, while the slow 'applyAll'
is still processing the first(old) 'MsgSnap'. Then raft
Ready routine can compact the log entries at future
index to 'applyAll'. That is how 'createMergedSnapshotMessage'
tried to look up raft term with outdated etcdProgress.appliedi.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-18 07:56:18 -07:00
4a1bffdbc6 clientv3: close open watch channel if substream is closing on reconnect
If substream is closing but outc is still open while reconnecting, then outc
would only be closed once the watch client would connect or once the watch
client is closed. This was leading to deadlocks in the proxy tests. Instead,
close immediately if the context is canceled.

Fixes #7503
2017-03-18 07:56:18 -07:00
9d9be2bc86 ctlv3: ensure synced member list before printing env vars on member add
In cases of multiple endpoints, it's possible member add would get a its
member list from a member that has not yet recognized the membership
update. Instead, confirm that the member list response is from the
member that acked the member add or from a member that has synced
with the cluster following the member add.

Fixes #7498
2017-03-18 07:56:18 -07:00
e5462f74f1 auth: get rid of deadlocking channel passing scheme in simpleTokenTTL
Cherry-picked from 1b1fabef8f.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-18 07:56:05 -07:00
c68c1d9344 discovery: fix print format
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-17 14:21:57 -07:00
6ed56cd723 auth: nil check AuthInfo when checking admin permissions
If the context does not include auth information, get authinfo will
return a nil auth info and a nil error. This is then passed to
IsAdminPermitted, which would dereference the nil auth info.
2017-03-17 14:21:39 -07:00
a3c6f6bf81 version: bump up to 3.1.3+git
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-17 14:21:15 -07:00
21fdcc6443 version: bump up to 3.1.3
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-10 09:05:16 -08:00
8d122e7011 etcdmain: SdNotify when gateway, grpc-proxy are ready
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-09 11:35:20 -08:00
ade1d97893 lease: guard 'Lease.itemSet' from concurrent writes
Fix https://github.com/coreos/etcd/issues/7448.

Affected if etcd builds with Go 1.8+.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-08 14:50:06 -08:00
1300189581 gateway: fix the dns discovery method
strip the scheme from the endpoints to have a clean hostname for TCP proxy

Fixes #7452
2017-03-08 14:49:50 -08:00
1971517806 etcdctl: correctly batch revisions in make-mirror
Fixes #7410
2017-03-06 14:55:47 -08:00
d614bb0799 etcdmain: log machine default host after update check
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:55:31 -08:00
059dc91d4c embed: use machine default host only for default value, 0.0.0.0
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:55:24 -08:00
5fdbaee761 version: bump up to 3.1.2+git 2017-02-24 10:34:53 -08:00
714e7ec8db version: bump up to 3.1.2 2017-02-22 10:45:48 -08:00
2cdaf6d661 netutil: use ipv4 host by default
Was non-deterministic.
2017-02-22 10:45:38 -08:00
77a51e0dbf pkg/netutil: name GetDefaultInterfaces consistent 2017-02-22 10:45:29 -08:00
d96d3aa0ed netutil: add dualstack to linux_route
in v3.1.0 netutil couldn't get default interface for ipv6only hosts

Fixes #7219
2017-02-22 10:45:19 -08:00
66e7532f57 pkg/netutil: use native byte ordering for route information
Fixes #7199
2017-02-22 10:45:07 -08:00
3eff360e79 pkg/cpuutil: add cpuutil
A package for unsafe cpu-ish things.
2017-02-22 10:44:59 -08:00
1487071966 integration: add 'TestV3HashRestart' 2017-02-22 10:39:49 -08:00
5d62bba9c7 auth: keep old revision in 'NewAuthStore'
When there's no changes yet (right after auth
store initialization), we should commit old revision.

Fix https://github.com/coreos/etcd/issues/7359.
2017-02-22 10:39:28 -08:00
114e293119 integration: test keepalives for short TTLs 2017-02-22 10:38:38 -08:00
1439955536 clientv3: do not set next keepalive time <= now+TTL 2017-02-22 10:38:28 -08:00
2c8ecc7e13 tcpproxy: don't use range variable in reactivate goroutine
Ends up trying to reactivate only the last endpoint.
2017-02-22 10:38:19 -08:00
7b4d622a7e raft: fix read index request for #7331 2017-02-22 10:38:09 -08:00
db8abbd975 version: bump to 3.1.1+git 2017-02-21 17:00:02 -08:00
ac1c7eba21 version: bump up to 3.1.1 2017-02-16 13:53:25 -08:00
9cc6d4852a travis: update for Go 1.7.5 tests 2017-02-16 13:53:05 -08:00
ff7fa9843d clientv3: fix lease keepalive duration 2017-02-16 13:52:27 -08:00
f66138d403 clientv3: fix lease keepalive duration 2017-02-16 12:41:09 -08:00
8c87916f68 auth: add 'setupAuthStore' to tests 2017-02-14 14:39:51 -08:00
9e81b002c4 auth: correct initialization in NewAuthStore()
Because of my own silly mistake, current NewAuthStore() doesn't
initialize authStore in a correct manner. For example, after recovery
from snapshot, it cannot revive the flag of enabled/disabled. This
commit fixes the problem.

Fix https://github.com/coreos/etcd/issues/7165
2017-02-14 13:49:19 -08:00
4962c5cff7 auth: add a test case for recoverying from snapshot
Conflicts:
	auth/store_test.go
2017-02-14 13:48:17 -08:00
e5bf25a3b6 e2e: add cases for defrag and snapshot with authentication 2017-02-14 13:43:11 -08:00
98c60e8faa auth, etcdserver: let maintenance services require root role
This commit lets maintenance services require root privilege. It also
moves AuthInfoFromCtx() from etcdserver to auth pkg for cleaning purpose.
2017-02-14 13:42:31 -08:00
3ac3fa6f3d travis: disable email notifications
Was spamming security@coreos.com
2017-02-14 12:55:02 -08:00
eaa8b9e155 clientv3: test closing client cancels blocking dials 2017-02-14 11:31:51 -08:00
ea2aae464d clientv3: use DialContext
Fixes #7216
2017-02-14 11:31:43 -08:00
776739ebc2 roadmap: update roadmap 2017-01-20 14:21:08 -08:00
a7a8a47ba0 README: remove ACI, update Go version 2017-01-20 14:21:00 -08:00
379f7ae10e op-guide: change grpc-proxy from 'pre' to alpha' 2017-01-20 13:25:21 -08:00
ead2d95914 version: bump to v3.1.0+git 2017-01-20 13:25:04 -08:00
8ba2897a21 version: bump to v3.1.0 2017-01-20 12:42:12 -08:00
bc31e27cb9 documentation: update build documentation 2017-01-20 11:20:54 -08:00
fce20a0b0b test: passed the test script arguments as the test function parameters 2017-01-20 11:15:01 -08:00
f10363fecd etcdctlv3: snapshot restore works with lease key 2017-01-20 10:06:49 -08:00
a7ec6c88fd pkg/flags: fixed prefix checking of the env variables 2017-01-20 09:57:00 -08:00
62c591d223 integration: test STM apply on concurrent deletion 2017-01-20 09:56:42 -08:00
5676226867 clientv3/concurrency: fix rev comparison on concurrent key deletion 2017-01-20 09:56:36 -08:00
898b9e608f Documentation: fix typo s/endpoint-health/endpoint health/ 2017-01-19 20:48:01 -08:00
b84be6b11b NEWS: fix date for v3.1 release 2017-01-19 17:56:35 -08:00
7a12d65528 Documentation: update experimental_apis for v3.1 release 2017-01-19 12:43:32 -08:00
53ac04b118 vendor: update 'golang.org/x/net' 2017-01-18 13:11:48 -08:00
fbcd5375b3 glide: update 'golang.org/x/net' 2017-01-18 13:11:13 -08:00
c2e8d06eec grpcproxy, etcdmain, integration: add close channel to kv proxy
ccache launches goroutines that need to be explicitly stopped.

Fixes #7158
2017-01-18 13:11:03 -08:00
6c8f1986c8 etcdserver: use ReqTimeout for linearized read
Fixes #7136
2017-01-17 17:31:31 -08:00
be9ae300c6 pkg/report: add nil checking for getTimeSeries 2017-01-17 13:23:57 -08:00
9ba3632614 Documentation: document upgrading to v3.1 2017-01-17 13:23:19 -08:00
c2d8b5a9e8 ctlv3: print cluster info after adding new member 2017-01-17 10:23:02 -08:00
465 changed files with 16002 additions and 3426 deletions

View File

@ -1,61 +1,97 @@
dist: trusty
language: go
go_import_path: github.com/coreos/etcd
sudo: false
sudo: required
services: docker
go:
- 1.7.4
- tip
- 1.8.7
notifications:
on_success: never
on_failure: never
env:
matrix:
- TARGET=amd64
- TARGET=arm64
- TARGET=arm
- TARGET=386
- TARGET=ppc64le
- TARGET=linux-amd64-build
- TARGET=linux-amd64-unit
- TARGET=linux-amd64-integration
- TARGET=linux-amd64-functional
- TARGET=linux-386-build
- TARGET=linux-386-unit
- TARGET=darwin-amd64-build
- TARGET=windows-amd64-build
- TARGET=linux-arm-build
- TARGET=linux-arm64-build
- TARGET=linux-ppc64le-build
matrix:
fast_finish: true
allow_failures:
- go: tip
exclude:
- go: tip
env: TARGET=arm
- go: tip
env: TARGET=arm64
- go: tip
env: TARGET=386
- go: tip
env: TARGET=ppc64le
addons:
apt:
packages:
- libpcap-dev
- libaspell-dev
- libhunspell-dev
before_install:
- go get -v github.com/chzchzchz/goword
- go get -v honnef.co/go/simple/cmd/gosimple
- go get -v honnef.co/go/unused/cmd/unused
- if [[ $TRAVIS_GO_VERSION == 1.* ]]; then docker pull gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION}; fi
# disable godep restore override
install:
- pushd cmd/etcd && go get -t -v ./... && popd
- pushd cmd/etcd && go get -t -v ./... && popd
script:
- echo "TRAVIS_GO_VERSION=${TRAVIS_GO_VERSION}"
- >
case "${TARGET}" in
amd64)
GOARCH=amd64 ./test
linux-amd64-build)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 PASSES='build' ./test"
;;
386)
GOARCH=386 PASSES="build unit" ./test
linux-amd64-unit)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 PASSES='unit' ./test"
;;
*)
# test building out of gopath
GO_BUILD_FLAGS="-a -v" GOPATH="" GOARCH="${TARGET}" ./build
linux-amd64-integration)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=amd64 PASSES='integration' ./test"
;;
linux-amd64-functional)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "./build && GOARCH=amd64 PASSES='functional' ./test"
;;
linux-386-build)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=386 PASSES='build' ./test"
;;
linux-386-unit)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GOARCH=386 PASSES='unit' ./test"
;;
darwin-amd64-build)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GO_BUILD_FLAGS='-v' GOOS=darwin GOARCH=amd64 ./build"
;;
windows-amd64-build)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GO_BUILD_FLAGS='-v' GOOS=windows GOARCH=amd64 ./build"
;;
linux-arm-build)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GO_BUILD_FLAGS='-v' GOARCH=arm ./build"
;;
linux-arm64-build)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GO_BUILD_FLAGS='-v' GOARCH=arm64 ./build"
;;
linux-ppc64le-build)
docker run --rm \
--volume=`pwd`:/go/src/github.com/coreos/etcd gcr.io/etcd-development/etcd-test:go${TRAVIS_GO_VERSION} \
/bin/bash -c "GO_BUILD_FLAGS='-v' GOARCH=ppc64le ./build"
;;
esac

View File

@ -5,6 +5,12 @@ ADD etcdctl /usr/local/bin/
RUN mkdir -p /var/etcd/
RUN mkdir -p /var/lib/etcd/
# Alpine Linux doesn't use pam, which means that there is no /etc/nsswitch.conf,
# but Golang relies on /etc/nsswitch.conf to check the order of DNS resolving
# (see https://github.com/golang/go/commit/9dee7771f561cf6aee081c0af6658cc81fac3918)
# To fix this we just create /etc/nsswitch.conf and add the following line:
RUN echo 'hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4' >> /etc/nsswitch.conf
EXPOSE 2379 2380
# Define default command.

57
Dockerfile-test Normal file
View File

@ -0,0 +1,57 @@
FROM ubuntu:16.10
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
RUN apt-get -y update \
&& apt-get -y install \
build-essential \
gcc \
apt-utils \
pkg-config \
software-properties-common \
apt-transport-https \
libssl-dev \
sudo \
bash \
curl \
wget \
tar \
git \
netcat \
libaspell-dev \
libhunspell-dev \
hunspell-en-us \
aspell-en \
shellcheck \
&& apt-get -y update \
&& apt-get -y upgrade \
&& apt-get -y autoremove \
&& apt-get -y autoclean
ENV GOROOT /usr/local/go
ENV GOPATH /go
ENV PATH ${GOPATH}/bin:${GOROOT}/bin:${PATH}
ENV GO_VERSION REPLACE_ME_GO_VERSION
ENV GO_DOWNLOAD_URL https://storage.googleapis.com/golang
RUN rm -rf ${GOROOT} \
&& curl -s ${GO_DOWNLOAD_URL}/go${GO_VERSION}.linux-amd64.tar.gz | tar -v -C /usr/local/ -xz \
&& mkdir -p ${GOPATH}/src ${GOPATH}/bin \
&& go version
RUN mkdir -p ${GOPATH}/src/github.com/coreos/etcd
WORKDIR ${GOPATH}/src/github.com/coreos/etcd
ADD ./scripts/install-marker.sh /tmp/install-marker.sh
RUN go get -v -u -tags spell github.com/chzchzchz/goword \
&& go get -v -u github.com/coreos/license-bill-of-materials \
&& go get -v -u honnef.co/go/tools/cmd/gosimple \
&& go get -v -u honnef.co/go/tools/cmd/unused \
&& go get -v -u honnef.co/go/tools/cmd/staticcheck \
&& go get -v -u github.com/wadey/gocovmerge \
&& go get -v -u github.com/gordonklaus/ineffassign \
&& /tmp/install-marker.sh amd64 \
&& rm -f /tmp/install-marker.sh \
&& curl -s https://codecov.io/bash >/codecov \
&& chmod 700 /codecov

View File

@ -49,4 +49,4 @@ Bootstrap another machine and use the [hey HTTP benchmark tool][hey] to send req
| 256 | 256 | all servers | 3061 | 119.3 |
[hey]: https://github.com/rakyll/hey
[hack-benchmark]: /hack/benchmark/
[hack-benchmark]: https://github.com/coreos/etcd/tree/master/hack/benchmark

View File

@ -69,4 +69,4 @@ Bootstrap another machine and use the [hey HTTP benchmark tool][hey] to send req
[hey]: https://github.com/rakyll/hey
[c7146bd5]: https://github.com/coreos/etcd/commits/c7146bd5f2c73716091262edc638401bb8229144
[etcd-2.1-benchmark]: etcd-2-1-0-alpha-benchmarks.md
[hack-benchmark]: /hack/benchmark/
[hack-benchmark]: ../../hack/benchmark/

View File

@ -39,4 +39,4 @@ The performance is nearly the same as the one with empty server handler.
The performance with empty server handler is not affected by one put. So the
performance downgrade should be caused by storage package.
[etcd-v3-benchmark]: /tools/benchmark/
[etcd-v3-benchmark]: ../../tools/benchmark/

View File

@ -1,8 +1,11 @@
# Experimental APIs and features
For the most part, the etcd project is stable, but we are still moving fast! We believe in the release fast philosophy. We want to get early feedback on features still in development and stabilizing. Thus, there are, and will be more, experimental features and APIs. We plan to improve these features based on the early feedback from the community, or abandon them if there is little interest, in the next few releases. If you are running a production system, please do not rely on any experimental features or APIs.
For the most part, the etcd project is stable, but we are still moving fast! We believe in the release fast philosophy. We want to get early feedback on features still in development and stabilizing. Thus, there are, and will be more, experimental features and APIs. We plan to improve these features based on the early feedback from the community, or abandon them if there is little interest, in the next few releases. Please do not rely on any experimental features or APIs in production environment.
## The current experimental API/features are:
- v3 auth API: expect to be stable in 3.1 release
- etcd gateway: expect to be stable in 3.1 release
- [gateway][gateway]: beta, to be stable in 3.2 release
- [gRPC proxy][grpc-proxy]: alpha, to be stable in 3.2 release
[gateway]: ../op-guide/gateway.md
[grpc-proxy]: ../op-guide/grpc_proxy.md

View File

@ -3,7 +3,7 @@
etcd uses the [capnslog][capnslog] library for logging application output categorized into *levels*. A log message's level is determined according to these conventions:
* Error: Data has been lost, a request has failed for a bad reason, or a required resource has been lost
* Examples:
* Examples:
* A failure to allocate disk space for WAL
* Warning: (Hopefully) Temporary conditions that may cause errors, but may work fine. A replica disappearing (that may reconnect) is a warning.
@ -26,4 +26,4 @@ etcd uses the [capnslog][capnslog] library for logging application output catego
* Send a normal message to a remote peer
* Write a log entry to disk
[capnslog]: [https://github.com/coreos/pkg/tree/master/capnslog]
[capnslog]: https://github.com/coreos/pkg/tree/master/capnslog

View File

@ -10,29 +10,44 @@ The easiest way to get etcd is to use one of the pre-built release binaries whic
## Build the latest version
For those wanting to try the very latest version, build etcd from the `master` branch.
[Go](https://golang.org/) version 1.6+ (with HTTP2 support) is required to build the latest version of etcd.
etcd vendors its dependency for official release binaries, while making vendoring optional to avoid import conflicts.
[`build` script][build-script] would automatically include the vendored dependencies from [`cmd`][cmd-directory] directory.
For those wanting to try the very latest version, build etcd from the `master` branch. [Go](https://golang.org/) version 1.7+ is required to build the latest version of etcd. To ensure etcd is built against well-tested libraries, etcd vendors its dependencies for official release binaries. However, etcd's vendoring is also optional to avoid potential import conflicts when embedding the etcd server or using the etcd client.
Here are the commands to build an etcd binary from the `master` branch:
First, confirm go 1.7+ is installed:
```
```sh
# go is required
$ go version
go version go1.6 darwin/amd64
go version go1.7.3 darwin/amd64
# GOPATH should be set correctly
$ echo $GOPATH
/Users/example/go
```
$ mkdir -p $GOPATH/src/github.com/coreos
$ cd $GOPATH/src/github.com/coreos
To build `etcd` from the `master` branch without a `GOPATH` using the official `build` script:
```sh
$ git clone https://github.com/coreos/etcd.git
$ cd etcd
$ ./build
$ ./bin/etcd
...
```
To build a vendored `etcd` from the `master` branch via `go get`:
```sh
# GOPATH should be set
$ echo $GOPATH
/Users/example/go
$ go get github.com/coreos/etcd/cmd/etcd
$ $GOPATH/bin/etcd
```
To build `etcd` from the `master` branch without vendoring (may not build due to upstream conflicts):
```sh
# GOPATH should be set
$ echo $GOPATH
/Users/example/go
$ go get github.com/coreos/etcd
$ $GOPATH/bin/etcd
```
## Test the installation

View File

@ -52,6 +52,7 @@ To learn more about the concepts and internals behind etcd, read the following p
- [Migrate applications from using API v2 to API v3][v2_migration]
- [Updating v2.3 to v3.0][v3_upgrade]
- [Updating v3.0 to v3.1][v31_upgrade]
## Frequently Asked Questions (FAQ)
@ -88,3 +89,4 @@ Answers to [common questions] about etcd.
[supported_platform]: op-guide/supported-platform.md
[experimental]: dev-guide/experimental_apis.md
[v3_upgrade]: upgrades/upgrade_3_0.md
[v31_upgrade]: upgrades/upgrade_3_1.md

View File

@ -475,5 +475,5 @@ To setup an etcd cluster with proxies of v2 API, please read the the [clustering
[proxy]: https://github.com/coreos/etcd/blob/release-2.3/Documentation/proxy.md
[clustering_etcd2]: https://github.com/coreos/etcd/blob/release-2.3/Documentation/clustering.md
[security-guide]: security.md
[tls-setup]: /hack/tls-setup
[tls-setup]: ../../hack/tls-setup
[gateway]: gateway.md

View File

@ -286,7 +286,7 @@ Follow the instructions when using these flags.
[build-cluster]: clustering.md#static
[reconfig]: runtime-configuration.md
[discovery]: clustering.md#discovery
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
[proxy]: ../v2/proxy.md
[restore]: ../v2/admin_guide.md#restoring-a-backup
[security]: security.md

View File

@ -57,7 +57,7 @@ sudo rkt run --net=default:IP=${NODE3} coreos.com/etcd:v3.0.6 -- -name=node3 -ad
Verify the cluster is healthy and can be reached.
```
ETCDCTL_API=3 etcdctl --endpoints=http://172.16.28.21:2379,http://172.16.28.22:2379,http://172.16.28.23:2379 endpoint-health
ETCDCTL_API=3 etcdctl --endpoints=http://172.16.28.21:2379,http://172.16.28.22:2379,http://172.16.28.23:2379 endpoint health
```
### DNS

View File

@ -1,6 +1,6 @@
# gRPC proxy
*This is a pre-alpha feature, we are looking for early feedback.*
*This is an alpha feature, we are looking for early feedback.*
The gRPC proxy is a stateless etcd reverse proxy operating at the gRPC layer (L7). The proxy is designed to reduce the total processing load on the core etcd cluster. For horizontal scalability, it coalesces watch and lease API requests. To protect the cluster against abusive clients, it caches key range requests.
@ -36,9 +36,9 @@ watch key A ^ ^ watch key A |
To effectively coalesce multiple client watchers into a single watcher, the gRPC proxy coalesces new `c-watchers` into an existing `s-watcher` when possible. This coalesced `s-watcher` may be out of sync with the etcd server due to network delays or buffered undelivered events. When the watch revision is unspecified, the gRPC proxy will not guarantee the `c-watcher` will start watching from the most recent store revision. For example, if a client watches from an etcd server with revision 1000, that watcher will begin at revision 1000. If a client watches from the gRPC proxy, may begin watching from revision 990.
Similar limitations apply to cancellation. When the watcher is cancelled, the etcd servers revision may be greater than the cancellation response revision.
Similar limitations apply to cancellation. When the watcher is cancelled, the etcd servers revision may be greater than the cancellation response revision.
These two limitations should not cause problems for most use cases. In the future, there may be additional options to force the watcher to bypass the gRPC proxy for more accurate revision responses.
These two limitations should not cause problems for most use cases. In the future, there may be additional options to force the watcher to bypass the gRPC proxy for more accurate revision responses.
## Scalable lease API
@ -75,3 +75,4 @@ $ ETCDCTL_API=3 ./etcdctl --endpoints=127.0.0.1:2379 get foo
foo
bar
```

View File

@ -219,6 +219,6 @@ Make sure to sign the certificates with a Subject Name the member's public IP ad
The certificate needs to be signed for the member's FQDN in its Subject Name, use Subject Alternative Names (short IP SANs) to add the IP address. The `etcd-ca` tool provides `--domain=` option for its `new-cert` command, and openssl can make [it][alt-name] too.
[cfssl]: https://github.com/cloudflare/cfssl
[tls-setup]: /hack/tls-setup
[tls-setup]: ../../hack/tls-setup
[tls-guide]: https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md
[alt-name]: http://wiki.cacert.org/FAQ/subjectAltName

View File

@ -50,7 +50,7 @@ Radius Intelligence uses Kubernetes running CoreOS to containerize and scale int
## Vonage
- *Application*: system configuration for microservices, scheduling, locks (future - service discovery)
- *Application*: system configuration for microservices, scheduling, locks (future - service discovery)
- *Launched*: August 2015
- *Cluster Size*: 2 clusters of 5 members in 2 DCs, n local proxies 1-to-1 with microservice, (ssl and SRV look up)
- *Order of Data Size*: kilobytes
@ -60,3 +60,148 @@ Radius Intelligence uses Kubernetes running CoreOS to containerize and scale int
[teamcity]: https://www.jetbrains.com/teamcity/
[raoofm]:https://github.com/raoofm
## Qiniu Cloud
- *Application*: system configuration for microservices, distributed locks
- *Launched*: Jan. 2016
- *Cluster Size*: 3 members each with several clusters
- *Order of Data Size*: kilobytes
- *Operator*: Pandora, chenchao@qiniu.com
- *Environment*: Baremetal
- *Backups*: None, all data can be recreated if necessary
## QingCloud
- *Application*: [QingCloud][qingcloud] appcenter cluster for service discovery as [metad][metad] backend.
- *Launched*: December 2016
- *Cluster Size*: 1 cluster of 3 members per user.
- *Order of Data Size*: kilobytes
- *Operator*: [yunify][yunify]
- *Environment*: QingCloud IaaS
- *Backups*: None, all data can be recreated if necessary.
[metad]:https://github.com/yunify/metad
[yunify]:https://github.com/yunify
[qingcloud]:https://qingcloud.com/
## Yandex
- *Application*: system configuration for services, service discovery
- *Launched*: March 2016
- *Cluster Size*: 3 clusters of 5 members
- *Order of Data Size*: several gigabytes
- *Operator*: Yandex; [nekto0n][nekto0n]
- *Environment*: Bare Metal
- *Backups*: None
[nekto0n]:https://github.com/nekto0n
## Tencent Games
- *Application*: Meta data and configuration data for service discovery, Kubernetes, etc.
- *Launched*: Jan. 2015
- *Cluster Size*: 3 members each with 10s of clusters
- *Order of Data Size*: 10s of Megabytes
- *Operator*: Tencent Game Operations Department
- *Environment*: Baremetal
- *Backups*: Periodic sync to backup server
In Tencent games, we use Docker and Kubernetes to deploy and run our applications, and use etcd to save meta data for service discovery, Kubernetes, etc.
## Hyper.sh
- *Application*: Kubernetes, distributed locks, etc.
- *Launched*: April 2016
- *Cluster Size*: 1 cluster of 3 members
- *Order of Data Size*: 10s of MB
- *Operator*: Hyper.sh
- *Environment*: Baremetal
- *Backups*: None, all data can be recreated if necessary.
In [hyper.sh][hyper.sh], the container service is backed by [hypernetes][hypernetes], a multi-tenant kubernetes distro. Moreover, we use etcd to coordinate the multiple manage services and store global meta data.
[hypernetes]:https://github.com/hyperhq/hypernetes
[Hyper.sh]:https://www.hyper.sh
## Meitu
- *Application*: system configuration for services, service discovery, kubernetes in test environment
- *Launched*: October 2015
- *Cluster Size*: 1 cluster of 3 members
- *Order of Data Size*: megabytes
- *Operator*: Meitu, hxj@meitu.com, [shafreeck][shafreeck]
- *Environment*: Bare Metal
- *Backups*: None, all data can be recreated if necessary.
[shafreeck]:https://github.com/shafreeck
## Grab
- *Application*: system configuration for services, service discovery
- *Launched*: June 2016
- *Cluster Size*: 1 cluster of 7 members
- *Order of Data Size*: megabytes
- *Operator*: Grab, [taxitan][taxitan], [reterVision][reterVision]
- *Environment*: AWS
- *Backups*: None, all data can be recreated if necessary.
[taxitan]:https://github.com/taxitan
[reterVision]:https://github.com/reterVision
## DaoCloud.io
- *Application*: container management
- *Launched*: Sep. 2015
- *Cluster Size*: 1000+ deployments, each deployment contains a 3 node cluster.
- *Order of Data Size*: 100s of Megabytes
- *Operator*: daocloud.io
- *Environment*: Baremetal and virtual machines
- *Backups*: None, all data can be recreated if necessary.
In [DaoCloud][DaoCloud], we use Docker and Swarm to deploy and run our applications, and we use etcd to save metadata for service discovery.
[DaoCloud]:https://www.daocloud.io
## Branch.io
- *Application*: Kubernetes
- *Launched*: April 2016
- *Cluster Size*: Multiple clusters, multiple sizes
- *Order of Data Size*: 100s of Megabytes
- *Operator*: branch.io
- *Environment*: AWS, Kubernetes
- *Backups*: EBS volume backups
At [Branch][branch], we use kubernetes heavily as our core microservice platform for staging and production.
[branch]: https://branch.io
## Baidu Waimai
- *Application*: SkyDNS, Kubernetes, UDC, CMDB and other distributed systems
- *Launched*: April. 2016
- *Cluster Size*: 3 clusters of 5 members
- *Order of Data Size*: several gigabytes
- *Operator*: Baidu Waimai Operations Department
- *Environment*: CentOS 6.5
- *Backups*: backup scripts
## Salesforce.com
- *Application*: Kubernetes
- *Launched*: Jan 2017
- *Cluster Size*: Multiple clusters of 3 members
- *Order of Data Size*: 100s of Megabytes
- *Operator*: Salesforce.com (krmayankk@github)
- *Environment*: BareMetal
- *Backups*: None, all data can be recreated
## Hosted Graphite
- *Application*: Service discovery, locking, ephemeral application data
- *Launched*: January 2017
- *Cluster Size*: 2 clusters of 7 members
- *Order of Data Size*: Megabytes
- *Operator*: Hosted Graphite (sre@hostedgraphite.com)
- *Environment*: Bare Metal
- *Backups*: None, all data is considered ephemeral.

View File

@ -1,6 +1,6 @@
# Reporting bugs
If any part of the etcd project has bugs or documentation mistakes, please let us know by [opening an issue][issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
If any part of the etcd project has bugs or documentation mistakes, please let us know by [opening an issue][etcd-issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
To make the bug report accurate and easy to understand, please try to create bug reports that are:

View File

@ -6,27 +6,29 @@ In the general case, upgrading from etcd 2.3 to 3.0 can be a zero-downtime, roll
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade Checklists
### Upgrade checklists
#### Upgrade Requirements
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
To upgrade an existing etcd deployment to 3.0, the running cluster must be 2.3 or greater. If it's before 2.3, please upgrade to [2.3](https://github.com/coreos/etcd/releases/tag/v2.3.0) before upgrading to 3.0.
#### Upgrade requirements
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. You can check the health of the cluster by using the `etcdctl cluster-health` command.
To upgrade an existing etcd deployment to 3.0, the running cluster must be 2.3 or greater. If it's before 2.3, please upgrade to [2.3](https://github.com/coreos/etcd/releases/tag/v2.3.8) before upgrading to 3.0.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl cluster-health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version.
Before beginning, [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version.
#### Mixed Versions
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.0. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
It might take up to 2 minutes for the newly upgraded member to catch up with the existing cluster when the total data size is larger than 50MB. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
It might take up to 2 minutes for the newly upgraded member to catch up with the existing cluster when the total data size is larger than 50MB. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and well be happy to provide advice on the procedure.
@ -36,13 +38,13 @@ If all members have been upgraded to v3.0, the cluster will be upgraded to v3.0,
Please [backup the data directory](../v2/admin_guide.md#backing-up-the-datastore) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade Procedure
### Upgrade procedure
This example details the upgrade of a three-member v2.3 ectd cluster running on a local machine.
This example details the upgrade of a three-member v2.3 ectd cluster running on a local machine.
#### 1. Check upgrade requirements.
Is the the cluster healthy and running v.2.3.x?
Is the cluster healthy and running v.2.3.x?
```
$ etcdctl cluster-health
@ -52,7 +54,7 @@ member 8211f1d0f64f3269 is healthy: got healthy result from http://localhost:123
cluster is healthy
$ curl http://localhost:2379/version
{"etcdserver":"2.3.x","etcdcluster":"2.3.0"}
{"etcdserver":"2.3.x","etcdcluster":"2.3.8"}
```
#### 2. Stop the existing etcd process
@ -64,7 +66,7 @@ When each etcd process is stopped, expected errors will be logged by other clust
2016-06-27 15:21:48.624175 I | rafthttp: the connection with 8211f1d0f64f3269 became inactive
```
Its a good idea at this point to [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore) to provide a downgrade path should any problems occur:
Its a good idea at this point to [backup the etcd data directory](../v2/admin_guide.md#backing-up-the-datastore) to provide a downgrade path should any problems occur:
```
$ etcdctl backup \
@ -102,7 +104,7 @@ Upgraded members will log warnings like the following until the entire cluster i
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.0 successfully:
When all members are upgraded, the cluster will report upgrading to 3.0 successfully:
```
2016-06-27 15:22:19.873751 N | membership: updated the cluster version from 2.3 to 3.0
@ -116,4 +118,14 @@ $ ETCDCTL_API=3 etcdctl endpoint health
127.0.0.1:22379 is healthy: successfully committed proposal: took = 18.513301ms
```
## Further considerations
- etcdctl environment variables have been updated. If `ETCDCTL_API=2 etcdctl cluster-health` works properly but `ETCDCTL_API=3 etcdctl endpoints health` responds with `Error: grpc: timed out when dialing`, be sure to use the [new variable names](https://github.com/coreos/etcd/tree/master/etcdctl#etcdctl).
## Known Issues
- etcd &lt; v3.1 does not work properly if built with Go &gt; v1.7. See [Issue 6951](https://github.com/coreos/etcd/issues/6951) for additional information.
- If an error such as `transport: http2Client.notifyError got notified that the client transport was broken unexpected EOF.` shows up in the etcd server logs, be sure etcd is a pre-built release or built with (etcd v3.1+ &amp; go v1.7+) or (etcd &lt;v3.1 &amp; go v1.6.x).
- Adding a v3 node to v2.3 cluster during upgrades is not supported and could trigger panics. See [Issue 7249](https://github.com/coreos/etcd/issues/7429) for additional information. Mixed versions of etcd members are only allowed during v3 migration. Finish upgrades before making any membership changes.
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,134 @@
## Upgrade etcd from 3.0 to 3.1
In the general case, upgrading from etcd 3.0 to 3.1 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.0 processes and replace them with etcd v3.1 processes
- after running all v3.1 processes, new features in v3.1 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
#### Monitoring
Following metrics from v3.0.x have been deprecated in favor of [go-grpc-prometheus](https://github.com/grpc-ecosystem/go-grpc-prometheus):
- `etcd_grpc_requests_total`
- `etcd_grpc_requests_failed_total`
- `etcd_grpc_active_streams`
- `etcd_grpc_unary_requests_duration_seconds`
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.1, the running cluster must be 3.0 or greater. If it's before 3.0, please [upgrade to 3.0](upgrade_3_0.md) before upgrading to 3.1.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.1. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.1, the cluster will be upgraded to v3.1, and downgrade from this completed state is **not possible**. If any single member is still v3.0, however, the cluster and its operations remains "v3.0", and it is possible from this mixed cluster state to return to using a v3.0 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.0 ectd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.0.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.0.16","etcdcluster":"3.0.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
2017-01-17 09:34:18.352662 I | raft: raft.node: 1640829d9eea5cfb elected leader 1640829d9eea5cfb at term 5
2017-01-17 09:34:18.359630 W | etcdserver: failed to reach the peerURL(http://localhost:2380) of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
2017-01-17 09:34:18.359679 W | etcdserver: cannot get the version of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
2017-01-17 09:34:18.548116 W | rafthttp: lost the TCP streaming connection with peer fd32987dcd0511e0 (stream Message writer)
2017-01-17 09:34:19.147816 W | rafthttp: lost the TCP streaming connection with peer fd32987dcd0511e0 (stream MsgApp v2 writer)
2017-01-17 09:34:34.364907 W | etcdserver: failed to reach the peerURL(http://localhost:2380) of member fd32987dcd0511e0 (Get http://localhost:2380/version: dial tcp 127.0.0.1:2380: getsockopt: connection refused)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.1 binary and start the new etcd process
The new v3.1 etcd will publish its information to the cluster:
```
2017-01-17 09:36:00.996590 I | etcdserver: published {Name:my-etcd-1 ClientURLs:[http://localhost:2379]} to cluster 46bc3ce73049e678
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.1 etcd binary:
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321671ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.1:
```
2017-01-17 09:36:38.406268 W | etcdserver: the local etcd version 3.0.16 is not up-to-date
2017-01-17 09:36:38.406295 W | etcdserver: member fd32987dcd0511e0 has a higher version 3.1.0
2017-01-17 09:36:42.407695 W | etcdserver: the local etcd version 3.0.16 is not up-to-date
2017-01-17 09:36:42.407730 W | etcdserver: member fd32987dcd0511e0 has a higher version 3.1.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.1 successfully:
```
2017-01-17 09:37:03.100015 I | etcdserver: updating the cluster version from 3.0 to 3.1
2017-01-17 09:37:03.104263 N | etcdserver/membership: updated the cluster version from 3.0 to 3.1
2017-01-17 09:37:03.104374 I | etcdserver/api: enabled capabilities for version 3.1
```
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.516902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,338 @@
## Upgrade etcd from 3.1 to 3.2
In the general case, upgrading from etcd 3.1 to 3.2 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.1 processes and replace them with etcd v3.2 processes
- after running all v3.2 processes, new features in v3.2 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
Highlighted breaking changes in 3.2.
#### Change in default `snapshot-count` value
The default value of `--snapshot-count` has [changed from from 10,000 to 100,000](https://github.com/coreos/etcd/pull/7160). Higher snapshot count means it holds Raft entries in memory for longer before discarding old entries. It is a trade-off between less frequent snapshotting and [higher memory usage](https://github.com/kubernetes/kubernetes/issues/60589#issuecomment-371977156). Higher `--snapshot-count` will be manifested with higher memory usage, while retaining more Raft entries helps with the availabilities of slow followers: leader is still able to replicate its logs to followers, rather than forcing followers to rebuild its stores from leader snapshots.
#### Change in gRPC dependency (>=3.2.10)
3.2.10 or later now requires [grpc/grpc-go](https://github.com/grpc/grpc-go/releases) `v1.7.5` (<=3.2.9 requires `v1.2.1`).
##### Deprecate `grpclog.Logger`
`grpclog.Logger` has been deprecated in favor of [`grpclog.LoggerV2`](https://github.com/grpc/grpc-go/blob/master/grpclog/loggerv2.go). `clientv3.Logger` is now `grpclog.LoggerV2`.
Before
```go
import "github.com/coreos/etcd/clientv3"
clientv3.SetLogger(log.New(os.Stderr, "grpc: ", 0))
```
After
```go
import "github.com/coreos/etcd/clientv3"
import "google.golang.org/grpc/grpclog"
clientv3.SetLogger(grpclog.NewLoggerV2(os.Stderr, os.Stderr, os.Stderr))
// log.New above cannot be used (not implement grpclog.LoggerV2 interface)
```
##### Deprecate `grpc.ErrClientConnTimeout`
Previously, `grpc.ErrClientConnTimeout` error is returned on client dial time-outs. 3.2 instead returns `context.DeadlineExceeded` (see [#8504](https://github.com/coreos/etcd/issues/8504)).
Before
```go
// expect dial time-out on ipv4 blackhole
_, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second
})
if err == grpc.ErrClientConnTimeout {
// handle errors
}
```
After
```go
_, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second
})
if err == context.DeadlineExceeded {
// handle errors
}
```
#### Change in maximum request size limits (>=3.2.10)
3.2.10 and 3.2.11 allow custom request size limits in server side. >=3.2.12 allows custom request size limits for both server and **client side**. In previous versions(v3.2.10, v3.2.11), client response size was limited to only 4 MiB.
Server-side request limits can be configured with `--max-request-bytes` flag:
```bash
# limits request size to 1.5 KiB
etcd --max-request-bytes 1536
# client writes exceeding 1.5 KiB will be rejected
etcdctl put foo [LARGE VALUE...]
# etcdserver: request is too large
```
Or configure `embed.Config.MaxRequestBytes` field:
```go
import "github.com/coreos/etcd/embed"
import "github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
// limit requests to 5 MiB
cfg := embed.NewConfig()
cfg.MaxRequestBytes = 5 * 1024 * 1024
// client writes exceeding 5 MiB will be rejected
_, err := cli.Put(ctx, "foo", [LARGE VALUE...])
err == rpctypes.ErrRequestTooLarge
```
**If not specified, server-side limit defaults to 1.5 MiB**.
Client-side request limits must be configured based on server-side limits.
```bash
# limits request size to 1 MiB
etcd --max-request-bytes 1048576
```
```go
import "github.com/coreos/etcd/clientv3"
cli, _ := clientv3.New(clientv3.Config{
Endpoints: []string{"127.0.0.1:2379"},
MaxCallSendMsgSize: 2 * 1024 * 1024,
MaxCallRecvMsgSize: 3 * 1024 * 1024,
})
// client writes exceeding "--max-request-bytes" will be rejected from etcd server
_, err := cli.Put(ctx, "foo", strings.Repeat("a", 1*1024*1024+5))
err == rpctypes.ErrRequestTooLarge
// client writes exceeding "MaxCallSendMsgSize" will be rejected from client-side
_, err = cli.Put(ctx, "foo", strings.Repeat("a", 5*1024*1024))
err.Error() == "rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (5242890 vs. 2097152)"
// some writes under limits
for i := range []int{0,1,2,3,4} {
_, err = cli.Put(ctx, fmt.Sprintf("foo%d", i), strings.Repeat("a", 1*1024*1024-500))
if err != nil {
panic(err)
}
}
// client reads exceeding "MaxCallRecvMsgSize" will be rejected from client-side
_, err = cli.Get(ctx, "foo", clientv3.WithPrefix())
err.Error() == "rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5240509 vs. 3145728)"
```
**If not specified, client-side send limit defaults to 2 MiB (1.5 MiB + gRPC overhead bytes) and receive limit to `math.MaxInt32`**. Please see [clientv3 godoc](https://godoc.org/github.com/coreos/etcd/clientv3#Config) for more detail.
#### Change in raw gRPC client wrappers
3.2.12 or later changes the function signatures of `clientv3` gRPC client wrapper. This change was needed to support [custom `grpc.CallOption` on message size limits](https://github.com/coreos/etcd/pull/9047).
Before and after
```diff
-func NewKVFromKVClient(remote pb.KVClient) KV {
+func NewKVFromKVClient(remote pb.KVClient, c *Client) KV {
-func NewClusterFromClusterClient(remote pb.ClusterClient) Cluster {
+func NewClusterFromClusterClient(remote pb.ClusterClient, c *Client) Cluster {
-func NewLeaseFromLeaseClient(remote pb.LeaseClient, keepAliveTimeout time.Duration) Lease {
+func NewLeaseFromLeaseClient(remote pb.LeaseClient, c *Client, keepAliveTimeout time.Duration) Lease {
-func NewMaintenanceFromMaintenanceClient(remote pb.MaintenanceClient) Maintenance {
+func NewMaintenanceFromMaintenanceClient(remote pb.MaintenanceClient, c *Client) Maintenance {
-func NewWatchFromWatchClient(wc pb.WatchClient) Watcher {
+func NewWatchFromWatchClient(wc pb.WatchClient, c *Client) Watcher {
```
#### Change in `clientv3.Lease.TimeToLive` API
Previously, `clientv3.Lease.TimeToLive` API returned `lease.ErrLeaseNotFound` on non-existent lease ID. 3.2 instead returns TTL=-1 in its response and no error (see [#7305](https://github.com/coreos/etcd/pull/7305)).
Before
```go
// when leaseID does not exist
resp, err := TimeToLive(ctx, leaseID)
resp == nil
err == lease.ErrLeaseNotFound
```
After
```go
// when leaseID does not exist
resp, err := TimeToLive(ctx, leaseID)
resp.TTL == -1
err == nil
```
#### Change in `clientv3.NewFromConfigFile`
`clientv3.NewFromConfigFile` is moved to `yaml.NewConfig`.
Before
```go
import "github.com/coreos/etcd/clientv3"
clientv3.NewFromConfigFile
```
After
```go
import clientv3yaml "github.com/coreos/etcd/clientv3/yaml"
clientv3yaml.NewConfig
```
#### Change in `--listen-peer-urls` and `--listen-client-urls`
3.2 now rejects domains names for `--listen-peer-urls` and `--listen-client-urls` (3.1 only prints out warnings), since domain name is invalid for network interface binding. Make sure that those URLs are properly formated as `scheme://IP:port`.
See [issue #6336](https://github.com/coreos/etcd/issues/6336) for more contexts.
### Server upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.2, the running cluster must be 3.1 or greater. If it's before 3.1, please [upgrade to 3.1](upgrade_3_1.md) before upgrading to 3.2.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.2. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.2, the cluster will be upgraded to v3.2, and downgrade from this completed state is **not possible**. If any single member is still v3.1, however, the cluster and its operations remains "v3.1", and it is possible from this mixed cluster state to return to using a v3.1 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.1 ectd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.1.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.1.7","etcdcluster":"3.1.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
2017-04-27 14:13:31.491746 I | raft: c89feb932daef420 [term 3] received MsgTimeoutNow from 6d4f535bae3ab960 and starts an election to get leadership.
2017-04-27 14:13:31.491769 I | raft: c89feb932daef420 became candidate at term 4
2017-04-27 14:13:31.491788 I | raft: c89feb932daef420 received MsgVoteResp from c89feb932daef420 at term 4
2017-04-27 14:13:31.491797 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.491805 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 9eda174c7df8a033 at term 4
2017-04-27 14:13:31.491815 I | raft: raft.node: c89feb932daef420 lost leader 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.524084 I | raft: c89feb932daef420 received MsgVoteResp from 6d4f535bae3ab960 at term 4
2017-04-27 14:13:31.524108 I | raft: c89feb932daef420 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
2017-04-27 14:13:31.524123 I | raft: c89feb932daef420 became leader at term 4
2017-04-27 14:13:31.524136 I | raft: raft.node: c89feb932daef420 elected leader c89feb932daef420 at term 4
2017-04-27 14:13:31.592650 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream MsgApp v2 reader)
2017-04-27 14:13:31.592825 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message reader)
2017-04-27 14:13:31.693275 E | rafthttp: failed to dial 6d4f535bae3ab960 on stream Message (dial tcp [::1]:2380: getsockopt: connection refused)
2017-04-27 14:13:31.693289 I | rafthttp: peer 6d4f535bae3ab960 became inactive
2017-04-27 14:13:31.936678 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message writer)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.2 binary and start the new etcd process
The new v3.2 etcd will publish its information to the cluster:
```
2017-04-27 14:14:25.363225 I | etcdserver: published {Name:s1 ClientURLs:[http://localhost:2379]} to cluster a9ededbffcb1b1f1
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.2 etcd binary:
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321771ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.2:
```
2017-04-27 14:15:17.071804 W | etcdserver: member c89feb932daef420 has a higher version 3.2.0
2017-04-27 14:15:21.073110 W | etcdserver: the local etcd version 3.1.7 is not up-to-date
2017-04-27 14:15:21.073142 W | etcdserver: member 6d4f535bae3ab960 has a higher version 3.2.0
2017-04-27 14:15:21.073157 W | etcdserver: the local etcd version 3.1.7 is not up-to-date
2017-04-27 14:15:21.073164 W | etcdserver: member c89feb932daef420 has a higher version 3.2.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.2 successfully:
```
2017-04-27 14:15:54.536901 N | etcdserver/membership: updated the cluster version from 3.1 to 3.2
2017-04-27 14:15:54.537035 I | etcdserver/api: enabled capabilities for version 3.2
```
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.517902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,476 @@
## Upgrade etcd from 3.2 to 3.3
In the general case, upgrading from etcd 3.2 to 3.3 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.2 processes and replace them with etcd v3.3 processes
- after running all v3.3 processes, new features in v3.3 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
Highlighted breaking changes in 3.3.
#### Change in `etcdserver.EtcdServer` struct
`etcdserver.EtcdServer` has changed the type of its member field `*etcdserver.ServerConfig` to `etcdserver.ServerConfig`. And `etcdserver.NewServer` now takes `etcdserver.ServerConfig`, instead of `*etcdserver.ServerConfig`.
Before and after (e.g. [k8s.io/kubernetes/test/e2e_node/services/etcd.go](https://github.com/kubernetes/kubernetes/blob/release-1.8/test/e2e_node/services/etcd.go#L50-L55))
```diff
import "github.com/coreos/etcd/etcdserver"
type EtcdServer struct {
*etcdserver.EtcdServer
- config *etcdserver.ServerConfig
+ config etcdserver.ServerConfig
}
func NewEtcd(dataDir string) *EtcdServer {
- config := &etcdserver.ServerConfig{
+ config := etcdserver.ServerConfig{
DataDir: dataDir,
...
}
return &EtcdServer{config: config}
}
func (e *EtcdServer) Start() error {
var err error
e.EtcdServer, err = etcdserver.NewServer(e.config)
...
```
#### Change in `embed.EtcdServer` struct
Field `LogOutput` is added to `embed.Config`:
```diff
package embed
type Config struct {
Debug bool `json:"debug"`
LogPkgLevels string `json:"log-package-levels"`
+ LogOutput string `json:"log-output"`
...
```
Before gRPC server warnings were logged in etcdserver.
```
WARNING: 2017/11/02 11:35:51 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp: operation was canceled"; Reconnecting to {localhost:2379 <nil>}
WARNING: 2017/11/02 11:35:51 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp: operation was canceled"; Reconnecting to {localhost:2379 <nil>}
```
From v3.3, gRPC server logs are disabled by default.
```go
import "github.com/coreos/etcd/embed"
cfg := &embed.Config{Debug: false}
cfg.SetupLogging()
```
Set `embed.Config.Debug` field to `true` to enable gRPC server logs.
#### Change in `/health` endpoint response
Previously, `[endpoint]:[client-port]/health` returned manually marshaled JSON value. 3.3 now defines [`etcdhttp.Health`](https://godoc.org/github.com/coreos/etcd/etcdserver/api/etcdhttp#Health) struct.
Note that in v3.3.0-rc.0, v3.3.0-rc.1, and v3.3.0-rc.2, `etcdhttp.Health` has boolean type `"health"` and `"errors"` fields. For backward compatibilities, we reverted `"health"` field to `string` type and removed `"errors"` field. Further health information will be provided in separate APIs.
```bash
$ curl http://localhost:2379/health
{"health":"true"}
```
#### Change in gRPC gateway HTTP endpoints (replaced `/v3alpha` with `/v3beta`)
Before
```bash
curl -L http://localhost:2379/v3alpha/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}'
```
After
```bash
curl -L http://localhost:2379/v3beta/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}'
```
Requests to `/v3alpha` endpoints will redirect to `/v3beta`, and `/v3alpha` will be removed in 3.4 release.
#### Change in maximum request size limits
3.3 now allows custom request size limits for both server and **client side**. In previous versions(v3.2.10, v3.2.11), client response size was limited to only 4 MiB.
Server-side request limits can be configured with `--max-request-bytes` flag:
```bash
# limits request size to 1.5 KiB
etcd --max-request-bytes 1536
# client writes exceeding 1.5 KiB will be rejected
etcdctl put foo [LARGE VALUE...]
# etcdserver: request is too large
```
Or configure `embed.Config.MaxRequestBytes` field:
```go
import "github.com/coreos/etcd/embed"
import "github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
// limit requests to 5 MiB
cfg := embed.NewConfig()
cfg.MaxRequestBytes = 5 * 1024 * 1024
// client writes exceeding 5 MiB will be rejected
_, err := cli.Put(ctx, "foo", [LARGE VALUE...])
err == rpctypes.ErrRequestTooLarge
```
**If not specified, server-side limit defaults to 1.5 MiB**.
Client-side request limits must be configured based on server-side limits.
```bash
# limits request size to 1 MiB
etcd --max-request-bytes 1048576
```
```go
import "github.com/coreos/etcd/clientv3"
cli, _ := clientv3.New(clientv3.Config{
Endpoints: []string{"127.0.0.1:2379"},
MaxCallSendMsgSize: 2 * 1024 * 1024,
MaxCallRecvMsgSize: 3 * 1024 * 1024,
})
// client writes exceeding "--max-request-bytes" will be rejected from etcd server
_, err := cli.Put(ctx, "foo", strings.Repeat("a", 1*1024*1024+5))
err == rpctypes.ErrRequestTooLarge
// client writes exceeding "MaxCallSendMsgSize" will be rejected from client-side
_, err = cli.Put(ctx, "foo", strings.Repeat("a", 5*1024*1024))
err.Error() == "rpc error: code = ResourceExhausted desc = grpc: trying to send message larger than max (5242890 vs. 2097152)"
// some writes under limits
for i := range []int{0,1,2,3,4} {
_, err = cli.Put(ctx, fmt.Sprintf("foo%d", i), strings.Repeat("a", 1*1024*1024-500))
if err != nil {
panic(err)
}
}
// client reads exceeding "MaxCallRecvMsgSize" will be rejected from client-side
_, err = cli.Get(ctx, "foo", clientv3.WithPrefix())
err.Error() == "rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5240509 vs. 3145728)"
```
**If not specified, client-side send limit defaults to 2 MiB (1.5 MiB + gRPC overhead bytes) and receive limit to `math.MaxInt32`**. Please see [clientv3 godoc](https://godoc.org/github.com/coreos/etcd/clientv3#Config) for more detail.
#### Change in raw gRPC client wrappers
3.3 changes the function signatures of `clientv3` gRPC client wrapper. This change was needed to support [custom `grpc.CallOption` on message size limits](https://github.com/coreos/etcd/pull/9047).
Before and after
```diff
-func NewKVFromKVClient(remote pb.KVClient) KV {
+func NewKVFromKVClient(remote pb.KVClient, c *Client) KV {
-func NewClusterFromClusterClient(remote pb.ClusterClient) Cluster {
+func NewClusterFromClusterClient(remote pb.ClusterClient, c *Client) Cluster {
-func NewLeaseFromLeaseClient(remote pb.LeaseClient, keepAliveTimeout time.Duration) Lease {
+func NewLeaseFromLeaseClient(remote pb.LeaseClient, c *Client, keepAliveTimeout time.Duration) Lease {
-func NewMaintenanceFromMaintenanceClient(remote pb.MaintenanceClient) Maintenance {
+func NewMaintenanceFromMaintenanceClient(remote pb.MaintenanceClient, c *Client) Maintenance {
-func NewWatchFromWatchClient(wc pb.WatchClient) Watcher {
+func NewWatchFromWatchClient(wc pb.WatchClient, c *Client) Watcher {
```
#### Change in clientv3 `Snapshot` API error type
Previously, clientv3 `Snapshot` API returned raw [`grpc/*status.statusError`] type error. v3.3 now translates those errors to corresponding public error types, to be consistent with other APIs.
Before
```go
import "context"
// reading snapshot with canceled context should error out
ctx, cancel := context.WithCancel(context.Background())
rc, _ := cli.Snapshot(ctx)
cancel()
_, err := io.Copy(f, rc)
err.Error() == "rpc error: code = Canceled desc = context canceled"
// reading snapshot with deadline exceeded should error out
ctx, cancel = context.WithTimeout(context.Background(), time.Second)
defer cancel()
rc, _ = cli.Snapshot(ctx)
time.Sleep(2 * time.Second)
_, err = io.Copy(f, rc)
err.Error() == "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
```
After
```go
import "context"
// reading snapshot with canceled context should error out
ctx, cancel := context.WithCancel(context.Background())
rc, _ := cli.Snapshot(ctx)
cancel()
_, err := io.Copy(f, rc)
err == context.Canceled
// reading snapshot with deadline exceeded should error out
ctx, cancel = context.WithTimeout(context.Background(), time.Second)
defer cancel()
rc, _ = cli.Snapshot(ctx)
time.Sleep(2 * time.Second)
_, err = io.Copy(f, rc)
err == context.DeadlineExceeded
```
#### Change in `etcdctl lease timetolive` command output
Previously, `lease timetolive LEASE_ID` command on expired lease prints `-1s` for remaining seconds. 3.3 now outputs clearer messages.
Before
```bash
lease 2d8257079fa1bc0c granted with TTL(0s), remaining(-1s)
```
After
```bash
lease 2d8257079fa1bc0c already expired
```
#### Change in `golang.org/x/net/context` imports
`clientv3` has deprecated `golang.org/x/net/context`. If a project vendors `golang.org/x/net/context` in other code (e.g. etcd generated protocol buffer code) and imports `github.com/coreos/etcd/clientv3`, it requires Go 1.9+ to compile.
Before
```go
import "golang.org/x/net/context"
cli.Put(context.Background(), "f", "v")
```
After
```go
import "context"
cli.Put(context.Background(), "f", "v")
```
#### Change in gRPC dependency
3.3 now requires [grpc/grpc-go](https://github.com/grpc/grpc-go/releases) `v1.7.5`.
##### Deprecate `grpclog.Logger`
`grpclog.Logger` has been deprecated in favor of [`grpclog.LoggerV2`](https://github.com/grpc/grpc-go/blob/master/grpclog/loggerv2.go). `clientv3.Logger` is now `grpclog.LoggerV2`.
Before
```go
import "github.com/coreos/etcd/clientv3"
clientv3.SetLogger(log.New(os.Stderr, "grpc: ", 0))
```
After
```go
import "github.com/coreos/etcd/clientv3"
import "google.golang.org/grpc/grpclog"
clientv3.SetLogger(grpclog.NewLoggerV2(os.Stderr, os.Stderr, os.Stderr))
// log.New above cannot be used (not implement grpclog.LoggerV2 interface)
```
##### Deprecate `grpc.ErrClientConnTimeout`
Previously, `grpc.ErrClientConnTimeout` error is returned on client dial time-outs. 3.3 instead returns `context.DeadlineExceeded` (see [#8504](https://github.com/coreos/etcd/issues/8504)).
Before
```go
// expect dial time-out on ipv4 blackhole
_, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second
})
if err == grpc.ErrClientConnTimeout {
// handle errors
}
```
After
```go
_, err := clientv3.New(clientv3.Config{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second
})
if err == context.DeadlineExceeded {
// handle errors
}
```
#### Change in official container registry
etcd now uses [`gcr.io/etcd-development/etcd`](https://gcr.io/etcd-development/etcd) as a primary container registry, and [`quay.io/coreos/etcd`](https://quay.io/coreos/etcd) as secondary.
Before
```bash
docker pull quay.io/coreos/etcd:v3.2.5
```
After
```bash
docker pull gcr.io/etcd-development/etcd:v3.3.0
```
### Server upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.3, the running cluster must be 3.2 or greater. If it's before 3.2, please [upgrade to 3.2](upgrade_3_2.md) before upgrading to 3.3.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.3. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.3, the cluster will be upgraded to v3.3, and downgrade from this completed state is **not possible**. If any single member is still v3.2, however, the cluster and its operations remains "v3.2", and it is possible from this mixed cluster state to return to using a v3.2 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.2 ectd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.2.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.2.7","etcdcluster":"3.2.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
14:13:31.491746 I | raft: c89feb932daef420 [term 3] received MsgTimeoutNow from 6d4f535bae3ab960 and starts an election to get leadership.
14:13:31.491769 I | raft: c89feb932daef420 became candidate at term 4
14:13:31.491788 I | raft: c89feb932daef420 received MsgVoteResp from c89feb932daef420 at term 4
14:13:31.491797 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 6d4f535bae3ab960 at term 4
14:13:31.491805 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 9eda174c7df8a033 at term 4
14:13:31.491815 I | raft: raft.node: c89feb932daef420 lost leader 6d4f535bae3ab960 at term 4
14:13:31.524084 I | raft: c89feb932daef420 received MsgVoteResp from 6d4f535bae3ab960 at term 4
14:13:31.524108 I | raft: c89feb932daef420 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
14:13:31.524123 I | raft: c89feb932daef420 became leader at term 4
14:13:31.524136 I | raft: raft.node: c89feb932daef420 elected leader c89feb932daef420 at term 4
14:13:31.592650 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream MsgApp v2 reader)
14:13:31.592825 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message reader)
14:13:31.693275 E | rafthttp: failed to dial 6d4f535bae3ab960 on stream Message (dial tcp [::1]:2380: getsockopt: connection refused)
14:13:31.693289 I | rafthttp: peer 6d4f535bae3ab960 became inactive
14:13:31.936678 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message writer)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.3 binary and start the new etcd process
The new v3.3 etcd will publish its information to the cluster:
```
14:14:25.363225 I | etcdserver: published {Name:s1 ClientURLs:[http://localhost:2379]} to cluster a9ededbffcb1b1f1
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.3 etcd binary:
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321771ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.3:
```
14:15:17.071804 W | etcdserver: member c89feb932daef420 has a higher version 3.3.0
14:15:21.073110 W | etcdserver: the local etcd version 3.2.7 is not up-to-date
14:15:21.073142 W | etcdserver: member 6d4f535bae3ab960 has a higher version 3.3.0
14:15:21.073157 W | etcdserver: the local etcd version 3.2.7 is not up-to-date
14:15:21.073164 W | etcdserver: member c89feb932daef420 has a higher version 3.3.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.3 successfully:
```
14:15:54.536901 N | etcdserver/membership: updated the cluster version from 3.2 to 3.3
14:15:54.537035 I | etcdserver/api: enabled capabilities for version 3.3
```
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.517902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,171 @@
## Upgrade etcd from 3.3 to 3.4
In the general case, upgrading from etcd 3.3 to 3.4 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v3.3 processes and replace them with etcd v3.4 processes
- after running all v3.4 processes, new features in v3.4 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade checklists
**NOTE:** When [migrating from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480), etcd server v3.2+ panics when etcd restores from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This also prevents accidental v3 data loss (e.g. `db` file might have been moved). etcd requires that post v3 migration can only happen with v3 data. Do not upgrade to newer v3 versions until v3.0 server contains v3 data.
Highlighted breaking changes in 3.4.
#### Change in `etcd` flags
`--ca-file` and `--peer-ca-file` flags are deprecated; they have been deprecated since v2.1.
```diff
-etcd --ca-file ca-client.crt
+etcd --trusted-ca-file ca-client.crt
```
```diff
-etcd --peer-ca-file ca-peer.crt
+etcd --peer-trusted-ca-file ca-peer.crt
```
#### Change in ``pkg/transport`
Deprecated `pkg/transport.TLSInfo.CAFile` field.
```diff
import "github.com/coreos/etcd/pkg/transport"
tlsInfo := transport.TLSInfo{
CertFile: "/tmp/test-certs/test.pem",
KeyFile: "/tmp/test-certs/test-key.pem",
- CAFile: "/tmp/test-certs/trusted-ca.pem",
+ TrustedCAFile: "/tmp/test-certs/trusted-ca.pem",
}
tlsConfig, err := tlsInfo.ClientConfig()
if err != nil {
panic(err)
}
```
### Server upgrade checklists
#### Upgrade requirements
To upgrade an existing etcd deployment to 3.4, the running cluster must be 3.3 or greater. If it's before 3.3, please [upgrade to 3.3](upgrade_3_3.md) before upgrading to 3.4.
Also, to ensure a smooth rolling upgrade, the running cluster must be healthy. Check the health of the cluster by using the `etcdctl endpoint health` command before proceeding.
#### Preparation
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
Before beginning, [backup the etcd data](../op-guide/maintenance.md#snapshot-backup). Should something go wrong with the upgrade, it is possible to use this backup to [downgrade](#downgrade) back to existing etcd version. Please note that the `snapshot` command only backs up the v3 data. For v2 data, see [backing up v2 datastore](../v2/admin_guide.md#backing-up-the-datastore).
#### Mixed versions
While upgrading, an etcd cluster supports mixed versions of etcd members, and operates with the protocol of the lowest common version. The cluster is only considered upgraded once all of its members are upgraded to version 3.4. Internally, etcd members negotiate with each other to determine the overall cluster version, which controls the reported version and the supported features.
#### Limitations
Note: If the cluster only has v3 data and no v2 data, it is not subject to this limitation.
If the cluster is serving a v2 data set larger than 50MB, each newly upgraded member may take up to two minutes to catch up with the existing cluster. Check the size of a recent snapshot to estimate the total data size. In other words, it is safest to wait for 2 minutes between upgrading each member.
For a much larger total data size, 100MB or more , this one-time process might take even more time. Administrators of very large etcd clusters of this magnitude can feel free to contact the [etcd team][etcd-contact] before upgrading, and we'll be happy to provide advice on the procedure.
#### Downgrade
If all members have been upgraded to v3.4, the cluster will be upgraded to v3.4, and downgrade from this completed state is **not possible**. If any single member is still v3.3, however, the cluster and its operations remains "v3.3", and it is possible from this mixed cluster state to return to using a v3.3 etcd binary on all members.
Please [backup the data directory](../op-guide/maintenance.md#snapshot-backup) of all etcd members to make downgrading the cluster possible even after it has been completely upgraded.
### Upgrade procedure
This example shows how to upgrade a 3-member v3.3 ectd cluster running on a local machine.
#### 1. Check upgrade requirements
Is the cluster healthy and running v3.3.x?
```
$ ETCDCTL_API=3 etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 6.600684ms
localhost:22379 is healthy: successfully committed proposal: took = 8.540064ms
localhost:32379 is healthy: successfully committed proposal: took = 8.763432ms
$ curl http://localhost:2379/version
{"etcdserver":"3.3.0","etcdcluster":"3.3.0"}
```
#### 2. Stop the existing etcd process
When each etcd process is stopped, expected errors will be logged by other cluster members. This is normal since a cluster member connection has been (temporarily) broken:
```
14:13:31.491746 I | raft: c89feb932daef420 [term 3] received MsgTimeoutNow from 6d4f535bae3ab960 and starts an election to get leadership.
14:13:31.491769 I | raft: c89feb932daef420 became candidate at term 4
14:13:31.491788 I | raft: c89feb932daef420 received MsgVoteResp from c89feb932daef420 at term 4
14:13:31.491797 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 6d4f535bae3ab960 at term 4
14:13:31.491805 I | raft: c89feb932daef420 [logterm: 3, index: 9] sent MsgVote request to 9eda174c7df8a033 at term 4
14:13:31.491815 I | raft: raft.node: c89feb932daef420 lost leader 6d4f535bae3ab960 at term 4
14:13:31.524084 I | raft: c89feb932daef420 received MsgVoteResp from 6d4f535bae3ab960 at term 4
14:13:31.524108 I | raft: c89feb932daef420 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
14:13:31.524123 I | raft: c89feb932daef420 became leader at term 4
14:13:31.524136 I | raft: raft.node: c89feb932daef420 elected leader c89feb932daef420 at term 4
14:13:31.592650 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream MsgApp v2 reader)
14:13:31.592825 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message reader)
14:13:31.693275 E | rafthttp: failed to dial 6d4f535bae3ab960 on stream Message (dial tcp [::1]:2380: getsockopt: connection refused)
14:13:31.693289 I | rafthttp: peer 6d4f535bae3ab960 became inactive
14:13:31.936678 W | rafthttp: lost the TCP streaming connection with peer 6d4f535bae3ab960 (stream Message writer)
```
It's a good idea at this point to [backup the etcd data](../op-guide/maintenance.md#snapshot-backup) to provide a downgrade path should any problems occur:
```
$ etcdctl snapshot save backup.db
```
#### 3. Drop-in etcd v3.4 binary and start the new etcd process
The new v3.4 etcd will publish its information to the cluster:
```
14:14:25.363225 I | etcdserver: published {Name:s1 ClientURLs:[http://localhost:2379]} to cluster a9ededbffcb1b1f1
```
Verify that each member, and then the entire cluster, becomes healthy with the new v3.4 etcd binary:
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:22379 is healthy: successfully committed proposal: took = 5.540129ms
localhost:32379 is healthy: successfully committed proposal: took = 7.321771ms
localhost:2379 is healthy: successfully committed proposal: took = 10.629901ms
```
Upgraded members will log warnings like the following until the entire cluster is upgraded. This is expected and will cease after all etcd cluster members are upgraded to v3.4:
```
14:15:17.071804 W | etcdserver: member c89feb932daef420 has a higher version 3.4.0
14:15:21.073110 W | etcdserver: the local etcd version 3.3.0 is not up-to-date
14:15:21.073142 W | etcdserver: member 6d4f535bae3ab960 has a higher version 3.4.0
14:15:21.073157 W | etcdserver: the local etcd version 3.3.0 is not up-to-date
14:15:21.073164 W | etcdserver: member c89feb932daef420 has a higher version 3.4.0
```
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, the cluster will report upgrading to 3.4 successfully:
```
14:15:54.536901 N | etcdserver/membership: updated the cluster version from 3.3 to 3.4
14:15:54.537035 I | etcdserver/api: enabled capabilities for version 3.4
```
```
$ ETCDCTL_API=3 /etcdctl endpoint health --endpoints=localhost:2379,localhost:22379,localhost:32379
localhost:2379 is healthy: successfully committed proposal: took = 2.312897ms
localhost:22379 is healthy: successfully committed proposal: took = 2.553476ms
localhost:32379 is healthy: successfully committed proposal: took = 2.517902ms
```
[etcd-contact]: https://groups.google.com/forum/#!forum/etcd-dev

View File

@ -0,0 +1,19 @@
# Upgrading etcd clusters and applications
This section contains documents specific to upgrading etcd clusters and applications.
## Moving from etcd API v2 to API v3
* [Migrate applications from using API v2 to API v3][migrate-apps]
## Upgrading an etcd v3.x cluster
* [Upgrade etcd from 3.0 to 3.1][upgrade-3-1]
* [Upgrade etcd from 3.1 to 3.2][upgrade-3-2]
## Upgrading from etcd v2.3
* [Upgrade a v2.3 cluster to v3.0][upgrade-cluster]
[migrate-apps]: ../op-guide/v2-migration.md
[upgrade-cluster]: upgrade_3_0.md
[upgrade-3-1]: upgrade_3_1.md
[upgrade-3-2]: upgrade_3_2.md

View File

@ -67,13 +67,13 @@ You have successfully started an etcd and written a key to the store.
The [official etcd ports][iana-ports] are 2379 for client requests, and 2380 for peer communication. To maintain compatibility, some etcd configuration and documentation continues to refer to the legacy ports 4001 and 7001, but all new etcd use and discussion should adopt the IANA-assigned ports. The legacy ports 4001 and 7001 will be fully deprecated, and support for their use removed, in future etcd releases.
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
### Running local etcd cluster
First install [goreman](https://github.com/mattn/goreman), which manages Procfile-based applications.
Our [Procfile script](./Procfile) will set up a local example cluster. You can start it with:
Our [Procfile script](../../V2Procfile) will set up a local example cluster. You can start it with:
```sh
goreman start
@ -162,4 +162,4 @@ Currently only the amd64 architecture is officially supported by `etcd`.
### License
etcd is under the Apache 2.0 license. See the [LICENSE](LICENSE) file for details.
etcd is under the Apache 2.0 license. See the [LICENSE](../../LICENSE) file for details.

View File

@ -18,7 +18,7 @@ A keys lifetime spans a generation. Each key may have one or multiple generat
### Physical View
etcd stores the physical data as key-value pairs in a persistent [b+tree][b+tree]. Each revision of the stores state only contains the delta from its previous revision to be efficient. A single revision may correspond to multiple keys in the tree.
etcd stores the physical data as key-value pairs in a persistent [b+tree][b+tree]. Each revision of the stores state only contains the delta from its previous revision to be efficient. A single revision may correspond to multiple keys in the tree.
The key of key-value pair is a 3-tuple (major, sub, type). Major is the store revision holding the key. Sub differentiates among keys within the same revision. Type is an optional suffix for special value (e.g., `t` if the value contains a tombstone). The value of the key-value pair contains the modification from previous revision, thus one delta from previous revision. The b+tree is ordered by key in lexical byte-order. Ranged lookups over revision deltas are fast; this enables quickly finding modifications from one specific revision to another. Compaction removes out-of-date keys-value pairs.
@ -73,7 +73,7 @@ Any completed operations are durable. All accessible data is also durable data.
#### Linearizability
Linearizability (also known as Atomic Consistency or External Consistency) is a consistency level between strict consistency and sequential consistency.
Linearizability (also known as Atomic Consistency or External Consistency) is a consistency level between strict consistency and sequential consistency.
For linearizability, suppose each operation receives a timestamp from a loosely synchronized global clock. Operations are linearized if and only if they always complete as though they were executed in a sequential order and each operation appears to complete in the order specified by the program. Likewise, if an operations timestamp precedes another, that operation must also precede the other operation in the sequence.
@ -83,10 +83,10 @@ etcd does not ensure linearizability for watch operations. Users are expected to
etcd ensures linearizability for all other operations by default. Linearizability comes with a cost, however, because linearized requests must go through the Raft consensus process. To obtain lower latencies and higher throughput for read requests, clients can configure a requests consistency mode to `serializable`, which may access stale data with respect to quorum, but removes the performance penalty of linearized accesses' reliance on live consensus.
[persistent-ds]: [https://en.wikipedia.org/wiki/Persistent_data_structure]
[btree]: [https://en.wikipedia.org/wiki/B-tree]
[b+tree]: [https://en.wikipedia.org/wiki/B%2B_tree]
[seq_consistency]: [https://en.wikipedia.org/wiki/Consistency_model#Sequential_consistency]
[strict_consistency]: [https://en.wikipedia.org/wiki/Consistency_model#Strict_consistency]
[serializable_isolation]: [https://en.wikipedia.org/wiki/Isolation_(database_systems)#Serializable]
[Linearizability]: [#Linearizability]
[persistent-ds]: https://en.wikipedia.org/wiki/Persistent_data_structure
[btree]: https://en.wikipedia.org/wiki/B-tree
[b+tree]: https://en.wikipedia.org/wiki/B%2B_tree
[seq_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Sequential_consistency
[strict_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Strict_consistency
[serializable_isolation]: https://en.wikipedia.org/wiki/Isolation_(database_systems)#Serializable
[Linearizability]: #linearizability

View File

@ -32,7 +32,7 @@ The consistent flag for read operations is removed in etcd 2.0.0. The normal rea
The read consistency guarantees are:
The consistent read guarantees the sequential consistency within one client that talks to one etcd server. Read/Write from one client to one etcd member should be observed in order. If one client write a value to an etcd server successfully, it should be able to get the value out of the server immediately.
The consistent read guarantees the sequential consistency within one client that talks to one etcd server. Read/Write from one client to one etcd member should be observed in order. If one client write a value to an etcd server successfully, it should be able to get the value out of the server immediately.
Each etcd member will proxy the request to leader and only return the result to user after the result is applied on the local member. Thus after the write succeed, the user is guaranteed to see the value on the member it sent the request to.
@ -56,6 +56,7 @@ Proxy mode in 2.0 will provide similar functionality, and with improved control
## Discovery Service
A size key needs to be provided inside a [discovery token][discoverytoken].
[discoverytoken]: clustering.md#custom-etcd-discovery-service
## HTTP Admin API

View File

@ -49,4 +49,4 @@ Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send r
| 256 | 256 | all servers | 3061 | 119.3 |
[boom]: https://github.com/rakyll/boom
[hack-benchmark]: /hack/benchmark/
[hack-benchmark]: ../../../hack/benchmark/

View File

@ -24,7 +24,7 @@ Go OS/Arch: linux/amd64
## Testing
Bootstrap another machine, outside of the etcd cluster, and run the [`boom` HTTP benchmark tool](https://github.com/rakyll/boom) with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions](../../hack/benchmark/) for the patch and the steps to reproduce our procedures.
Bootstrap another machine, outside of the etcd cluster, and run the [`boom` HTTP benchmark tool][boom] with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions][hack] for the patch and the steps to reproduce our procedures.
The performance is calulated through results of 100 benchmark rounds.
@ -66,4 +66,7 @@ The performance is calulated through results of 100 benchmark rounds.
- Write QPS to cluster leaders seems to be increased by a small margin. This is because the main loop and entry apply loops were decoupled in the etcd raft logic, eliminating several blocks between them.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.
[boom]: https://github.com/rakyll/boom
[hack]: ../../../hack/benchmark/

View File

@ -69,4 +69,4 @@ Bootstrap another machine and use the [boom HTTP benchmark tool][boom] to send r
[boom]: https://github.com/rakyll/boom
[c7146bd5]: https://github.com/coreos/etcd/commits/c7146bd5f2c73716091262edc638401bb8229144
[etcd-2.1-benchmark]: etcd-2-1-0-alpha-benchmarks.md
[hack-benchmark]: /hack/benchmark/
[hack-benchmark]: ../../../hack/benchmark/

View File

@ -39,4 +39,4 @@ The performance is nearly the same as the one with empty server handler.
The performance with empty server handler is not affected by one put. So the
performance downgrade should be caused by storage package.
[etcd-v3-benchmark]: /tools/benchmark/
[etcd-v3-benchmark]: ../../../tools/benchmark/

View File

@ -423,7 +423,7 @@ To make understanding this feature easier, we changed the naming of some flags,
|-peers |none |Deprecated. The --initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
|-peers-file |none |Deprecated. The --initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
[client]: /client
[client]: ../../client
[client-discoverer]: https://godoc.org/github.com/coreos/etcd/client#Discoverer
[conf-adv-client]: configuration.md#-advertise-client-urls
[conf-listen-client]: configuration.md#-listen-client-urls

View File

@ -234,7 +234,7 @@ The security flags help to [build a secure etcd cluster][security].
+ env variable: ETCD_DEBUG
### --log-package-levels
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ default: none (INFO for all packages)
+ env variable: ETCD_LOG_PACKAGE_LEVELS
@ -272,7 +272,7 @@ Follow the instructions when using these flags.
[build-cluster]: clustering.md#static
[reconfig]: runtime-configuration.md
[discovery]: clustering.md#discovery
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
[proxy]: proxy.md
[reconfig]: runtime-configuration.md
[restore]: admin_guide.md#restoring-a-backup

View File

@ -112,7 +112,6 @@
- [mattn/etcdenv](https://github.com/mattn/etcdenv) - "env" shebang with etcd integration
- [kelseyhightower/confd](https://github.com/kelseyhightower/confd) - Manage local app config files using templates and data from etcd
- [configdb](https://git.autistici.org/ai/configdb/tree/master) - A REST relational abstraction on top of arbitrary database backends, aimed at storing configs and inventories.
- [scrz](https://github.com/scrz/scrz) - Container manager, stores configuration in etcd.
- [fleet](https://github.com/coreos/fleet) - Distributed init system
- [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes) - Container cluster manager introduced by Google.
- [mailgun/vulcand](https://github.com/mailgun/vulcand) - HTTP proxy that uses etcd as a configuration backend.

View File

@ -1,6 +1,6 @@
# Reporting Bugs
If you find bugs or documentation mistakes in the etcd project, please let us know by [opening an issue][issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
If you find bugs or documentation mistakes in the etcd project, please let us know by [opening an issue][etcd-issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
To make your bug report accurate and easy to understand, please try to create bug reports that are:

View File

@ -7,25 +7,25 @@ To prove out the design of the v3 API the team has also built [a number of examp
# Design
1. Flatten binary key-value space
2. Keep the event history until compaction
- access to old version of keys
- user controlled history compaction
3. Support range query
- Pagination support with limit argument
- Support consistency guarantee across multiple range queries
4. Replace TTL key with Lease
- more efficient/ low cost keep alive
- a logical group of TTL keys
5. Replace CAS/CAD with multi-object Txn
- MUCH MORE powerful and flexible
6. Support efficient watching with multiple ranges
7. RPC API supports the completed set of APIs.
7. RPC API supports the completed set of APIs.
- more efficient than JSON/HTTP
- additional txn/lease support
@ -56,7 +56,7 @@ the size in the future a little bit or make it configurable.
// A put is always successful
Put( PutRequest { key = foo, value = bar } )
PutResponse {
PutResponse {
cluster_id = 0x1000,
member_id = 0x1,
revision = 1,
@ -119,7 +119,7 @@ RangeResponse {
Txn(TxnRequest {
// mod_revision of foo0 is equal to 1, mod_revision of foo1 is greater than 1
compare = {
{compareType = equal, key = foo0, mod_revision = 1},
{compareType = equal, key = foo0, mod_revision = 1},
{compareType = greater, key = foo1, mod_revision = 1}}
},
// if the comparison succeeds, put foo2 = bar2
@ -156,7 +156,7 @@ Watch( WatchRequest{
end_revision = 10000,
// server decided notification frequency
progress_notification = true,
}
}
… // this can be a watch request stream
)
@ -176,7 +176,7 @@ WatchResponse {
},
}
// a notification at 2000
WatchResponse {
cluster_id = 0x1000,
@ -185,9 +185,9 @@ WatchResponse {
raft_term = 0x1,
// nil event as notification
}
// put (foo0=bar3000) event at 3000
WatchResponse {
cluster_id = 0x1000,
@ -204,8 +204,8 @@ WatchResponse {
},
}
```
[api-protobuf]: https://github.com/coreos/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto
[kv-protobuf]: https://github.com/coreos/etcd/blob/master/storage/storagepb/kv.proto
[api-protobuf]: https://github.com/coreos/etcd/blob/release-2.3/etcdserver/etcdserverpb/rpc.proto
[kv-protobuf]: https://github.com/coreos/etcd/blob/release-2.3/storage/storagepb/kv.proto

View File

@ -188,6 +188,6 @@ Make sure that you sign your certificates with a Subject Name your member's publ
If you need your certificate to be signed for your member's FQDN in its Subject Name then you could use Subject Alternative Names (short IP SANs) to add your IP address. The `etcd-ca` tool provides `--domain=` option for its `new-cert` command, and openssl can make [it][alt-name] too.
[cfssl]: https://github.com/cloudflare/cfssl
[tls-setup]: /hack/tls-setup
[tls-setup]: ../../hack/tls-setup
[tls-guide]: https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md
[alt-name]: http://wiki.cacert.org/FAQ/subjectAltName

516
Makefile Normal file
View File

@ -0,0 +1,516 @@
# run from repository root
# Example:
# make build
# make clean
# make docker-clean
# make docker-start
# make docker-kill
# make docker-remove
.PHONY: build
build:
GO_BUILD_FLAGS="-v" ./build
./bin/etcd --version
ETCDCTL_API=3 ./bin/etcdctl version
clean:
rm -f ./codecov
rm -rf ./agent-*
rm -rf ./covdir
rm -f ./*.log
rm -f ./bin/Dockerfile-release
rm -rf ./bin/*.etcd
rm -rf ./gopath
rm -rf ./gopath.proto
rm -rf ./release
rm -f ./integration/127.0.0.1:* ./integration/localhost:*
rm -f ./clientv3/integration/127.0.0.1:* ./clientv3/integration/localhost:*
rm -f ./clientv3/ordering/127.0.0.1:* ./clientv3/ordering/localhost:*
docker-clean:
docker images
docker image prune --force
docker-start:
service docker restart
docker-kill:
docker kill `docker ps -q` || true
docker-remove:
docker rm --force `docker ps -a -q` || true
docker rmi --force `docker images -q` || true
GO_VERSION ?= 1.10.1
ETCD_VERSION ?= $(shell git rev-parse --short HEAD || echo "GitNotFound")
TEST_SUFFIX = $(shell date +%s | base64 | head -c 15)
TEST_OPTS ?= PASSES='unit'
TMP_DIR_MOUNT_FLAG = --mount type=tmpfs,destination=/tmp
ifdef HOST_TMP_DIR
TMP_DIR_MOUNT_FLAG = --mount type=bind,source=$(HOST_TMP_DIR),destination=/tmp
endif
# Example:
# GO_VERSION=1.8.7 make build-docker-test
# GO_VERSION=1.9.5 make build-docker-test
# make build-docker-test
#
# gcloud docker -- login -u _json_key -p "$(cat /etc/gcp-key-etcd-development.json)" https://gcr.io
# GO_VERSION=1.8.7 make push-docker-test
# GO_VERSION=1.9.5 make push-docker-test
# make push-docker-test
#
# gsutil -m acl ch -u allUsers:R -r gs://artifacts.etcd-development.appspot.com
# GO_VERSION=1.9.5 make pull-docker-test
# make pull-docker-test
build-docker-test:
$(info GO_VERSION: $(GO_VERSION))
@sed -i.bak 's|REPLACE_ME_GO_VERSION|$(GO_VERSION)|g' ./tests/Dockerfile
docker build \
--tag gcr.io/etcd-development/etcd-test:go$(GO_VERSION) \
--file ./tests/Dockerfile .
@mv ./tests/Dockerfile.bak ./tests/Dockerfile
push-docker-test:
$(info GO_VERSION: $(GO_VERSION))
gcloud docker -- push gcr.io/etcd-development/etcd-test:go$(GO_VERSION)
pull-docker-test:
$(info GO_VERSION: $(GO_VERSION))
docker pull gcr.io/etcd-development/etcd-test:go$(GO_VERSION)
# Example:
# make build-docker-test
# make compile-with-docker-test
# make compile-setup-gopath-with-docker-test
compile-with-docker-test:
$(info GO_VERSION: $(GO_VERSION))
docker run \
--rm \
--mount type=bind,source=`pwd`,destination=/go/src/github.com/coreos/etcd \
gcr.io/etcd-development/etcd-test:go$(GO_VERSION) \
/bin/bash -c "GO_BUILD_FLAGS=-v ./build && ./bin/etcd --version"
compile-setup-gopath-with-docker-test:
$(info GO_VERSION: $(GO_VERSION))
docker run \
--rm \
--mount type=bind,source=`pwd`,destination=/etcd \
gcr.io/etcd-development/etcd-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && ETCD_SETUP_GOPATH=1 GO_BUILD_FLAGS=-v ./build && ./bin/etcd --version && rm -rf ./gopath"
# Example:
#
# Local machine:
# TEST_OPTS="PASSES='fmt'" make test
# TEST_OPTS="PASSES='fmt bom dep compile build unit'" make test
# TEST_OPTS="PASSES='build unit release integration_e2e functional'" make test
# TEST_OPTS="PASSES='build grpcproxy'" make test
#
# Example (test with docker):
# make pull-docker-test
# TEST_OPTS="PASSES='fmt'" make docker-test
# TEST_OPTS="VERBOSE=2 PASSES='unit'" make docker-test
#
# Travis CI (test with docker):
# TEST_OPTS="PASSES='fmt bom dep compile build unit'" make docker-test
#
# Semaphore CI (test with docker):
# TEST_OPTS="PASSES='build unit release integration_e2e functional'" make docker-test
# HOST_TMP_DIR=/tmp TEST_OPTS="PASSES='build unit release integration_e2e functional'" make docker-test
# TEST_OPTS="GOARCH=386 PASSES='build unit integration_e2e'" make docker-test
#
# grpc-proxy tests (test with docker):
# TEST_OPTS="PASSES='build grpcproxy'" make docker-test
# HOST_TMP_DIR=/tmp TEST_OPTS="PASSES='build grpcproxy'" make docker-test
.PHONY: test
test:
$(info TEST_OPTS: $(TEST_OPTS))
$(info log-file: test-$(TEST_SUFFIX).log)
$(TEST_OPTS) ./test 2>&1 | tee test-$(TEST_SUFFIX).log
! egrep "(--- FAIL:|panic: test timed out|appears to have leaked)" -B50 -A10 test-$(TEST_SUFFIX).log
docker-test:
$(info GO_VERSION: $(GO_VERSION))
$(info ETCD_VERSION: $(ETCD_VERSION))
$(info TEST_OPTS: $(TEST_OPTS))
$(info log-file: test-$(TEST_SUFFIX).log)
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`,destination=/go/src/github.com/coreos/etcd \
gcr.io/etcd-development/etcd-test:go$(GO_VERSION) \
/bin/bash -c "$(TEST_OPTS) ./test 2>&1 | tee test-$(TEST_SUFFIX).log"
! egrep "(--- FAIL:|panic: test timed out|appears to have leaked)" -B50 -A10 test-$(TEST_SUFFIX).log
docker-test-coverage:
$(info GO_VERSION: $(GO_VERSION))
$(info ETCD_VERSION: $(ETCD_VERSION))
$(info log-file: docker-test-coverage-$(TEST_SUFFIX).log)
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`,destination=/go/src/github.com/coreos/etcd \
gcr.io/etcd-development/etcd-test:go$(GO_VERSION) \
/bin/bash -c "COVERDIR=covdir PASSES='build build_cov cov' ./test 2>&1 | tee docker-test-coverage-$(TEST_SUFFIX).log && /codecov -t 6040de41-c073-4d6f-bbf8-d89256ef31e1"
! egrep "(--- FAIL:|panic: test timed out|appears to have leaked)" -B50 -A10 docker-test-coverage-$(TEST_SUFFIX).log
# Example:
# make compile-with-docker-test
# ETCD_VERSION=v3-test make build-docker-release-master
# ETCD_VERSION=v3-test make push-docker-release-master
# gsutil -m acl ch -u allUsers:R -r gs://artifacts.etcd-development.appspot.com
build-docker-release-master:
$(info ETCD_VERSION: $(ETCD_VERSION))
cp ./Dockerfile-release ./bin/Dockerfile-release
docker build \
--tag gcr.io/etcd-development/etcd:$(ETCD_VERSION) \
--file ./bin/Dockerfile-release \
./bin
rm -f ./bin/Dockerfile-release
docker run \
--rm \
gcr.io/etcd-development/etcd:$(ETCD_VERSION) \
/bin/sh -c "/usr/local/bin/etcd --version && ETCDCTL_API=3 /usr/local/bin/etcdctl version"
push-docker-release-master:
$(info ETCD_VERSION: $(ETCD_VERSION))
gcloud docker -- push gcr.io/etcd-development/etcd:$(ETCD_VERSION)
# Example:
# make build-docker-test
# make compile-with-docker-test
# make build-docker-static-ip-test
#
# gcloud docker -- login -u _json_key -p "$(cat /etc/gcp-key-etcd-development.json)" https://gcr.io
# make push-docker-static-ip-test
#
# gsutil -m acl ch -u allUsers:R -r gs://artifacts.etcd-development.appspot.com
# make pull-docker-static-ip-test
#
# make docker-static-ip-test-certs-run
# make docker-static-ip-test-certs-metrics-proxy-run
build-docker-static-ip-test:
$(info GO_VERSION: $(GO_VERSION))
@sed -i.bak 's|REPLACE_ME_GO_VERSION|$(GO_VERSION)|g' ./tests/docker-static-ip/Dockerfile
docker build \
--tag gcr.io/etcd-development/etcd-static-ip-test:go$(GO_VERSION) \
--file ./tests/docker-static-ip/Dockerfile \
./tests/docker-static-ip
@mv ./tests/docker-static-ip/Dockerfile.bak ./tests/docker-static-ip/Dockerfile
push-docker-static-ip-test:
$(info GO_VERSION: $(GO_VERSION))
gcloud docker -- push gcr.io/etcd-development/etcd-static-ip-test:go$(GO_VERSION)
pull-docker-static-ip-test:
$(info GO_VERSION: $(GO_VERSION))
docker pull gcr.io/etcd-development/etcd-static-ip-test:go$(GO_VERSION)
docker-static-ip-test-certs-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-static-ip/certs,destination=/certs \
gcr.io/etcd-development/etcd-static-ip-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs/run.sh && rm -rf m*.etcd"
docker-static-ip-test-certs-metrics-proxy-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-static-ip/certs-metrics-proxy,destination=/certs-metrics-proxy \
gcr.io/etcd-development/etcd-static-ip-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs-metrics-proxy/run.sh && rm -rf m*.etcd"
# Example:
# make build-docker-test
# make compile-with-docker-test
# make build-docker-dns-test
#
# gcloud docker -- login -u _json_key -p "$(cat /etc/gcp-key-etcd-development.json)" https://gcr.io
# make push-docker-dns-test
#
# gsutil -m acl ch -u allUsers:R -r gs://artifacts.etcd-development.appspot.com
# make pull-docker-dns-test
#
# make docker-dns-test-insecure-run
# make docker-dns-test-certs-run
# make docker-dns-test-certs-gateway-run
# make docker-dns-test-certs-wildcard-run
# make docker-dns-test-certs-common-name-auth-run
# make docker-dns-test-certs-common-name-multi-run
build-docker-dns-test:
$(info GO_VERSION: $(GO_VERSION))
@sed -i.bak 's|REPLACE_ME_GO_VERSION|$(GO_VERSION)|g' ./tests/docker-dns/Dockerfile
docker build \
--tag gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION) \
--file ./tests/docker-dns/Dockerfile \
./tests/docker-dns
@mv ./tests/docker-dns/Dockerfile.bak ./tests/docker-dns/Dockerfile
docker run \
--rm \
--dns 127.0.0.1 \
gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION) \
/bin/bash -c "/etc/init.d/bind9 start && cat /dev/null >/etc/hosts && dig etcd.local"
push-docker-dns-test:
$(info GO_VERSION: $(GO_VERSION))
gcloud docker -- push gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION)
pull-docker-dns-test:
$(info GO_VERSION: $(GO_VERSION))
docker pull gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION)
docker-dns-test-insecure-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns/insecure,destination=/insecure \
gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /insecure/run.sh && rm -rf m*.etcd"
docker-dns-test-certs-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns/certs,destination=/certs \
gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs/run.sh && rm -rf m*.etcd"
docker-dns-test-certs-gateway-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns/certs-gateway,destination=/certs-gateway \
gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs-gateway/run.sh && rm -rf m*.etcd"
docker-dns-test-certs-wildcard-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns/certs-wildcard,destination=/certs-wildcard \
gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs-wildcard/run.sh && rm -rf m*.etcd"
docker-dns-test-certs-common-name-auth-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns/certs-common-name-auth,destination=/certs-common-name-auth \
gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs-common-name-auth/run.sh && rm -rf m*.etcd"
docker-dns-test-certs-common-name-multi-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns/certs-common-name-multi,destination=/certs-common-name-multi \
gcr.io/etcd-development/etcd-dns-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs-common-name-multi/run.sh && rm -rf m*.etcd"
# Example:
# make build-docker-test
# make compile-with-docker-test
# make build-docker-dns-srv-test
# gcloud docker -- login -u _json_key -p "$(cat /etc/gcp-key-etcd-development.json)" https://gcr.io
# make push-docker-dns-srv-test
# gsutil -m acl ch -u allUsers:R -r gs://artifacts.etcd-development.appspot.com
# make pull-docker-dns-srv-test
# make docker-dns-srv-test-certs-run
# make docker-dns-srv-test-certs-gateway-run
# make docker-dns-srv-test-certs-wildcard-run
build-docker-dns-srv-test:
$(info GO_VERSION: $(GO_VERSION))
@sed -i.bak 's|REPLACE_ME_GO_VERSION|$(GO_VERSION)|g' ./tests/docker-dns-srv/Dockerfile
docker build \
--tag gcr.io/etcd-development/etcd-dns-srv-test:go$(GO_VERSION) \
--file ./tests/docker-dns-srv/Dockerfile \
./tests/docker-dns-srv
@mv ./tests/docker-dns-srv/Dockerfile.bak ./tests/docker-dns-srv/Dockerfile
docker run \
--rm \
--dns 127.0.0.1 \
gcr.io/etcd-development/etcd-dns-srv-test:go$(GO_VERSION) \
/bin/bash -c "/etc/init.d/bind9 start && cat /dev/null >/etc/hosts && dig +noall +answer SRV _etcd-client-ssl._tcp.etcd.local && dig +noall +answer SRV _etcd-server-ssl._tcp.etcd.local && dig +noall +answer m1.etcd.local m2.etcd.local m3.etcd.local"
push-docker-dns-srv-test:
$(info GO_VERSION: $(GO_VERSION))
gcloud docker -- push gcr.io/etcd-development/etcd-dns-srv-test:go$(GO_VERSION)
pull-docker-dns-srv-test:
$(info GO_VERSION: $(GO_VERSION))
docker pull gcr.io/etcd-development/etcd-dns-srv-test:go$(GO_VERSION)
docker-dns-srv-test-certs-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns-srv/certs,destination=/certs \
gcr.io/etcd-development/etcd-dns-srv-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs/run.sh && rm -rf m*.etcd"
docker-dns-srv-test-certs-gateway-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns-srv/certs-gateway,destination=/certs-gateway \
gcr.io/etcd-development/etcd-dns-srv-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs-gateway/run.sh && rm -rf m*.etcd"
docker-dns-srv-test-certs-wildcard-run:
$(info GO_VERSION: $(GO_VERSION))
$(info HOST_TMP_DIR: $(HOST_TMP_DIR))
$(info TMP_DIR_MOUNT_FLAG: $(TMP_DIR_MOUNT_FLAG))
docker run \
--rm \
--tty \
--dns 127.0.0.1 \
$(TMP_DIR_MOUNT_FLAG) \
--mount type=bind,source=`pwd`/bin,destination=/etcd \
--mount type=bind,source=`pwd`/tests/docker-dns-srv/certs-wildcard,destination=/certs-wildcard \
gcr.io/etcd-development/etcd-dns-srv-test:go$(GO_VERSION) \
/bin/bash -c "cd /etcd && /certs-wildcard/run.sh && rm -rf m*.etcd"
# Example:
# make build-functional
# make build-docker-functional
# make push-docker-functional
# make pull-docker-functional
build-functional:
$(info GO_VERSION: $(GO_VERSION))
$(info ETCD_VERSION: $(ETCD_VERSION))
./functional/build
./bin/etcd-agent -help || true && \
./bin/etcd-proxy -help || true && \
./bin/etcd-runner --help || true && \
./bin/etcd-tester -help || true
build-docker-functional:
$(info GO_VERSION: $(GO_VERSION))
$(info ETCD_VERSION: $(ETCD_VERSION))
@sed -i.bak 's|REPLACE_ME_GO_VERSION|$(GO_VERSION)|g' ./functional/Dockerfile
docker build \
--tag gcr.io/etcd-development/etcd-functional:go$(GO_VERSION) \
--file ./functional/Dockerfile \
.
@mv ./functional/Dockerfile.bak ./functional/Dockerfile
docker run \
--rm \
gcr.io/etcd-development/etcd-functional:go$(GO_VERSION) \
/bin/bash -c "./bin/etcd --version && \
./bin/etcd-failpoints --version && \
ETCDCTL_API=3 ./bin/etcdctl version && \
./bin/etcd-agent -help || true && \
./bin/etcd-proxy -help || true && \
./bin/etcd-runner --help || true && \
./bin/etcd-tester -help || true && \
./bin/benchmark --help || true"
push-docker-functional:
$(info GO_VERSION: $(GO_VERSION))
$(info ETCD_VERSION: $(ETCD_VERSION))
gcloud docker -- push gcr.io/etcd-development/etcd-functional:go$(GO_VERSION)
pull-docker-functional:
$(info GO_VERSION: $(GO_VERSION))
$(info ETCD_VERSION: $(ETCD_VERSION))
docker pull gcr.io/etcd-development/etcd-functional:go$(GO_VERSION)

2
NEWS
View File

@ -1,4 +1,4 @@
etcd v3.1.0 (2017-01-13)
etcd v3.1.0 (2017-01-20)
- faster linearizable reads (implements Raft read-index)
- automatic leadership transfer when leader steps down
- etcd uses default route IP if advertise URL is not given

View File

@ -37,13 +37,14 @@ See [etcdctl][etcdctl] for a simple command line client.
### Getting etcd
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, AppC (ACI), and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, [rkt][rkt], and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
For those wanting to try the very latest version, you can [build the latest version of etcd][dl-build] from the `master` branch.
You will first need [*Go*](https://golang.org/) installed on your machine (version 1.6+ is required).
You will first need [*Go*](https://golang.org/) installed on your machine (version 1.7+ is required).
All development occurs on `master`, including new features and bug fixes.
Bug fixes are first targeted at `master` and subsequently ported to release branches, as described in the [branch management][branch-management] guide.
[rkt]: https://github.com/coreos/rkt/releases/
[github-release]: https://github.com/coreos/etcd/releases/
[branch-management]: ./Documentation/branch_management.md
[dl-build]: ./Documentation/dl_build.md#build-the-latest-version
@ -77,7 +78,7 @@ That's it! etcd is now running and serving client requests. For more
The [official etcd ports][iana-ports] are 2379 for client requests, and 2380 for peer communication.
[iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd
[iana-ports]: http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt
### Running a local etcd cluster
@ -135,5 +136,3 @@ See [reporting bugs](Documentation/reporting_bugs.md) for details about reportin
### License
etcd is under the Apache 2.0 license. See the [LICENSE](LICENSE) file for details.

View File

@ -6,25 +6,17 @@ This document defines a high level roadmap for etcd development.
The dates below should not be considered authoritative, but rather indicative of the projected timeline of the project. The [milestones defined in GitHub](https://github.com/coreos/etcd/milestones) represent the most up-to-date and issue-for-issue plans.
etcd 3.0 is our current stable branch. The roadmap below outlines new features that will be added to etcd, and while subject to change, define what future stable will look like.
etcd 3.1 is our current stable branch. The roadmap below outlines new features that will be added to etcd, and while subject to change, define what future stable will look like.
### etcd 3.1 (2016-Oct)
- Stable L4 gateway
- Experimental support for scalable proxy
- Automatic leadership transfer for the rolling upgrade
- V3 API improvements
- Get previous key-value pair
- Get only keys (ignore values)
- Get only key count
### etcd 3.2 (2017-Apr)
### etcd 3.2 (2017-May)
- Stable scalable proxy
- Proxy-as-client interface passthrough
- Lock service
- Namespacing proxy
- JWT token based authentication
- TLS Command Name and JWT token based authentication
- Read-modify-write V3 Put
- Improved watch performance
- Support non-blocking concurrent read
### etcd 3.3 (?)
- TBD

View File

@ -21,85 +21,92 @@ import (
"crypto/rand"
"math/big"
"strings"
"sync"
"time"
)
const (
letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
defaultSimpleTokenLength = 16
)
// var for testing purposes
var (
simpleTokenTTL = 5 * time.Minute
simpleTokenTTLResolution = 1 * time.Second
)
type simpleTokenTTLKeeper struct {
tokens map[string]time.Time
addSimpleTokenCh chan string
resetSimpleTokenCh chan string
deleteSimpleTokenCh chan string
stopCh chan chan struct{}
deleteTokenFunc func(string)
}
func NewSimpleTokenTTLKeeper(deletefunc func(string)) *simpleTokenTTLKeeper {
stk := &simpleTokenTTLKeeper{
tokens: make(map[string]time.Time),
addSimpleTokenCh: make(chan string, 1),
resetSimpleTokenCh: make(chan string, 1),
deleteSimpleTokenCh: make(chan string, 1),
stopCh: make(chan chan struct{}),
deleteTokenFunc: deletefunc,
}
go stk.run()
return stk
tokens map[string]time.Time
donec chan struct{}
stopc chan struct{}
deleteTokenFunc func(string)
mu *sync.Mutex
}
func (tm *simpleTokenTTLKeeper) stop() {
waitCh := make(chan struct{})
tm.stopCh <- waitCh
<-waitCh
close(tm.stopCh)
select {
case tm.stopc <- struct{}{}:
case <-tm.donec:
}
<-tm.donec
}
func (tm *simpleTokenTTLKeeper) addSimpleToken(token string) {
tm.addSimpleTokenCh <- token
tm.tokens[token] = time.Now().Add(simpleTokenTTL)
}
func (tm *simpleTokenTTLKeeper) resetSimpleToken(token string) {
tm.resetSimpleTokenCh <- token
if _, ok := tm.tokens[token]; ok {
tm.tokens[token] = time.Now().Add(simpleTokenTTL)
}
}
func (tm *simpleTokenTTLKeeper) deleteSimpleToken(token string) {
tm.deleteSimpleTokenCh <- token
delete(tm.tokens, token)
}
func (tm *simpleTokenTTLKeeper) run() {
tokenTicker := time.NewTicker(simpleTokenTTLResolution)
defer tokenTicker.Stop()
defer func() {
tokenTicker.Stop()
close(tm.donec)
}()
for {
select {
case t := <-tm.addSimpleTokenCh:
tm.tokens[t] = time.Now().Add(simpleTokenTTL)
case t := <-tm.resetSimpleTokenCh:
if _, ok := tm.tokens[t]; ok {
tm.tokens[t] = time.Now().Add(simpleTokenTTL)
}
case t := <-tm.deleteSimpleTokenCh:
delete(tm.tokens, t)
case <-tokenTicker.C:
nowtime := time.Now()
tm.mu.Lock()
for t, tokenendtime := range tm.tokens {
if nowtime.After(tokenendtime) {
tm.deleteTokenFunc(t)
delete(tm.tokens, t)
}
}
case waitCh := <-tm.stopCh:
tm.tokens = make(map[string]time.Time)
waitCh <- struct{}{}
tm.mu.Unlock()
case <-tm.stopc:
return
}
}
}
func (as *authStore) enable() {
delf := func(tk string) {
if username, ok := as.simpleTokens[tk]; ok {
plog.Infof("deleting token %s for user %s", tk, username)
delete(as.simpleTokens, tk)
}
}
as.simpleTokenKeeper = &simpleTokenTTLKeeper{
tokens: make(map[string]time.Time),
donec: make(chan struct{}),
stopc: make(chan struct{}),
deleteTokenFunc: delf,
mu: &as.simpleTokensMu,
}
go as.simpleTokenKeeper.run()
}
func (as *authStore) GenSimpleToken() (string, error) {
ret := make([]byte, defaultSimpleTokenLength)
@ -117,7 +124,6 @@ func (as *authStore) GenSimpleToken() (string, error) {
func (as *authStore) assignSimpleTokenToUser(username, token string) {
as.simpleTokensMu.Lock()
_, ok := as.simpleTokens[token]
if ok {
plog.Panicf("token %s is alredy used", token)
@ -129,13 +135,15 @@ func (as *authStore) assignSimpleTokenToUser(username, token string) {
}
func (as *authStore) invalidateUser(username string) {
if as.simpleTokenKeeper == nil {
return
}
as.simpleTokensMu.Lock()
defer as.simpleTokensMu.Unlock()
for token, name := range as.simpleTokens {
if strings.Compare(name, username) == 0 {
delete(as.simpleTokens, token)
as.simpleTokenKeeper.deleteSimpleToken(token)
}
}
as.simpleTokensMu.Unlock()
}

View File

@ -20,6 +20,7 @@ import (
"errors"
"fmt"
"sort"
"strconv"
"strings"
"sync"
@ -29,6 +30,7 @@ import (
"github.com/coreos/pkg/capnslog"
"golang.org/x/crypto/bcrypt"
"golang.org/x/net/context"
"google.golang.org/grpc/metadata"
)
var (
@ -57,6 +59,7 @@ var (
ErrPermissionNotGranted = errors.New("auth: permission is not granted to the role")
ErrAuthNotEnabled = errors.New("auth: authentication is not enabled")
ErrAuthOldRevision = errors.New("auth: revision in header is old")
ErrInvalidAuthToken = errors.New("auth: invalid auth token")
// BcryptCost is the algorithm cost / strength for hashing auth passwords
BcryptCost = bcrypt.DefaultCost
@ -153,6 +156,9 @@ type AuthStore interface {
// Close does cleanup of AuthStore
Close() error
// AuthInfoFromCtx gets AuthInfo from gRPC's context
AuthInfoFromCtx(ctx context.Context) (*AuthInfo, error)
}
type authStore struct {
@ -162,11 +168,24 @@ type authStore struct {
rangePermCache map[string]*unifiedRangePermissions // username -> unifiedRangePermissions
simpleTokensMu sync.RWMutex
simpleTokens map[string]string // token -> username
simpleTokenKeeper *simpleTokenTTLKeeper
revision uint64
// tokenSimple in v3.2+
indexWaiter func(uint64) <-chan struct{}
simpleTokenKeeper *simpleTokenTTLKeeper
simpleTokensMu sync.Mutex
simpleTokens map[string]string // token -> username
}
func newDeleterFunc(as *authStore) func(string) {
return func(t string) {
as.simpleTokensMu.Lock()
defer as.simpleTokensMu.Unlock()
if username, ok := as.simpleTokens[t]; ok {
plog.Infof("deleting token %s for user %s", t, username)
delete(as.simpleTokens, t)
}
}
}
func (as *authStore) AuthEnable() error {
@ -196,16 +215,7 @@ func (as *authStore) AuthEnable() error {
tx.UnsafePut(authBucketName, enableFlagKey, authEnabled)
as.enabled = true
tokenDeleteFunc := func(t string) {
as.simpleTokensMu.Lock()
defer as.simpleTokensMu.Unlock()
if username, ok := as.simpleTokens[t]; ok {
plog.Infof("deleting token %s for user %s", t, username)
delete(as.simpleTokens, t)
}
}
as.simpleTokenKeeper = NewSimpleTokenTTLKeeper(tokenDeleteFunc)
as.enable()
as.rangePermCache = make(map[string]*unifiedRangePermissions)
@ -233,11 +243,12 @@ func (as *authStore) AuthDisable() {
as.enabled = false
as.simpleTokensMu.Lock()
tk := as.simpleTokenKeeper
as.simpleTokenKeeper = nil
as.simpleTokens = make(map[string]string) // invalidate all tokens
as.simpleTokensMu.Unlock()
if as.simpleTokenKeeper != nil {
as.simpleTokenKeeper.stop()
as.simpleTokenKeeper = nil
if tk != nil {
tk.stop()
}
plog.Noticef("Authentication disabled")
@ -635,13 +646,14 @@ func (as *authStore) RoleAdd(r *pb.AuthRoleAddRequest) (*pb.AuthRoleAddResponse,
}
func (as *authStore) AuthInfoFromToken(token string) (*AuthInfo, bool) {
as.simpleTokensMu.RLock()
defer as.simpleTokensMu.RUnlock()
t, ok := as.simpleTokens[token]
if ok {
// same as '(t *tokenSimple) info' in v3.2+
as.simpleTokensMu.Lock()
username, ok := as.simpleTokens[token]
if ok && as.simpleTokenKeeper != nil {
as.simpleTokenKeeper.resetSimpleToken(token)
}
return &AuthInfo{Username: t, Revision: as.revision}, ok
as.simpleTokensMu.Unlock()
return &AuthInfo{Username: username, Revision: as.revision}, ok
}
type permSlice []*authpb.Permission
@ -753,6 +765,9 @@ func (as *authStore) IsAdminPermitted(authInfo *AuthInfo) error {
if !as.isAuthEnabled() {
return nil
}
if authInfo == nil {
return ErrUserEmpty
}
tx := as.be.BatchTx()
tx.Lock()
@ -871,7 +886,7 @@ func (as *authStore) isAuthEnabled() bool {
return as.enabled
}
func NewAuthStore(be backend.Backend) *authStore {
func NewAuthStore(be backend.Backend, indexWaiter func(uint64) <-chan struct{}) *authStore {
tx := be.BatchTx()
tx.Lock()
@ -879,13 +894,30 @@ func NewAuthStore(be backend.Backend) *authStore {
tx.UnsafeCreateBucket(authUsersBucketName)
tx.UnsafeCreateBucket(authRolesBucketName)
as := &authStore{
be: be,
simpleTokens: make(map[string]string),
revision: 0,
enabled := false
_, vs := tx.UnsafeRange(authBucketName, enableFlagKey, nil, 0)
if len(vs) == 1 {
if bytes.Equal(vs[0], authEnabled) {
enabled = true
}
}
as.commitRevision(tx)
as := &authStore{
be: be,
simpleTokens: make(map[string]string),
revision: getRevision(tx),
indexWaiter: indexWaiter,
enabled: enabled,
rangePermCache: make(map[string]*unifiedRangePermissions),
}
if enabled {
as.enable()
}
if as.revision == 0 {
as.commitRevision(tx)
}
tx.Unlock()
be.ForceCommit()
@ -912,7 +944,8 @@ func (as *authStore) commitRevision(tx backend.BatchTx) {
func getRevision(tx backend.BatchTx) uint64 {
_, vs := tx.UnsafeRange(authBucketName, []byte(revisionKey), nil, 0)
if len(vs) != 1 {
plog.Panicf("failed to get the key of auth store revision")
// this can happen in the initialization phase
return 0
}
return binary.BigEndian.Uint64(vs[0])
@ -921,3 +954,46 @@ func getRevision(tx backend.BatchTx) uint64 {
func (as *authStore) Revision() uint64 {
return as.revision
}
func (as *authStore) isValidSimpleToken(token string, ctx context.Context) bool {
splitted := strings.Split(token, ".")
if len(splitted) != 2 {
return false
}
index, err := strconv.Atoi(splitted[1])
if err != nil {
return false
}
select {
case <-as.indexWaiter(uint64(index)):
return true
case <-ctx.Done():
}
return false
}
func (as *authStore) AuthInfoFromCtx(ctx context.Context) (*AuthInfo, error) {
md, ok := metadata.FromContext(ctx)
if !ok {
return nil, nil
}
ts, tok := md["token"]
if !tok {
return nil, nil
}
token := ts[0]
if !as.isValidSimpleToken(token, ctx) {
return nil, ErrInvalidAuthToken
}
authInfo, uok := as.AuthInfoFromToken(token)
if !uok {
plog.Warningf("invalid auth token: %s", token)
return nil, ErrInvalidAuthToken
}
return authInfo, nil
}

View File

@ -26,31 +26,38 @@ import (
func init() { BcryptCost = bcrypt.MinCost }
func TestUserAdd(t *testing.T) {
b, tPath := backend.NewDefaultTmpBackend()
defer func() {
b.Close()
os.Remove(tPath)
func dummyIndexWaiter(index uint64) <-chan struct{} {
ch := make(chan struct{})
go func() {
ch <- struct{}{}
}()
return ch
}
as := NewAuthStore(b)
ua := &pb.AuthUserAddRequest{Name: "foo"}
_, err := as.UserAdd(ua) // add a non-existing user
// TestNewAuthStoreRevision ensures newly auth store
// keeps the old revision when there are no changes.
func TestNewAuthStoreRevision(t *testing.T) {
b, tPath := backend.NewDefaultTmpBackend()
defer os.Remove(tPath)
as := NewAuthStore(b, dummyIndexWaiter)
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
}
_, err = as.UserAdd(ua) // add an existing user
if err == nil {
t.Fatalf("expected %v, got %v", ErrUserAlreadyExist, err)
}
if err != ErrUserAlreadyExist {
t.Fatalf("expected %v, got %v", ErrUserAlreadyExist, err)
}
old := as.Revision()
b.Close()
as.Close()
ua = &pb.AuthUserAddRequest{Name: ""}
_, err = as.UserAdd(ua) // add a user with empty name
if err != ErrUserEmpty {
t.Fatal(err)
// no changes to commit
b2 := backend.NewDefaultBackend(tPath)
as = NewAuthStore(b2, dummyIndexWaiter)
new := as.Revision()
b2.Close()
as.Close()
if old != new {
t.Fatalf("expected revision %d, got %d", old, new)
}
}
@ -80,7 +87,7 @@ func TestCheckPassword(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
@ -125,7 +132,7 @@ func TestUserDelete(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
@ -162,7 +169,7 @@ func TestUserChangePassword(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
@ -208,7 +215,7 @@ func TestRoleAdd(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
@ -229,7 +236,7 @@ func TestUserGrant(t *testing.T) {
os.Remove(tPath)
}()
as := NewAuthStore(b)
as := NewAuthStore(b, dummyIndexWaiter)
defer as.Close()
err := enableAuthAndCreateRoot(as)
if err != nil {
@ -261,4 +268,93 @@ func TestUserGrant(t *testing.T) {
if err != ErrUserNotFound {
t.Fatalf("expected %v, got %v", ErrUserNotFound, err)
}
// non-admin user
err = as.IsAdminPermitted(&AuthInfo{Username: "foo", Revision: 1})
if err != ErrPermissionDenied {
t.Errorf("expected %v, got %v", ErrPermissionDenied, err)
}
// disabled auth should return nil
as.AuthDisable()
err = as.IsAdminPermitted(&AuthInfo{Username: "root", Revision: 1})
if err != nil {
t.Errorf("expected nil, got %v", err)
}
}
func TestRecoverFromSnapshot(t *testing.T) {
as, _ := setupAuthStore(t)
ua := &pb.AuthUserAddRequest{Name: "foo"}
_, err := as.UserAdd(ua) // add an existing user
if err == nil {
t.Fatalf("expected %v, got %v", ErrUserAlreadyExist, err)
}
if err != ErrUserAlreadyExist {
t.Fatalf("expected %v, got %v", ErrUserAlreadyExist, err)
}
ua = &pb.AuthUserAddRequest{Name: ""}
_, err = as.UserAdd(ua) // add a user with empty name
if err != ErrUserEmpty {
t.Fatal(err)
}
as.Close()
as2 := NewAuthStore(as.be, dummyIndexWaiter)
defer func(a *authStore) {
a.Close()
}(as2)
if !as2.isAuthEnabled() {
t.Fatal("recovering authStore from existing backend failed")
}
ul, err := as.UserList(&pb.AuthUserListRequest{})
if err != nil {
t.Fatal(err)
}
if !contains(ul.Users, "root") {
t.Errorf("expected %v in %v", "root", ul.Users)
}
}
func contains(array []string, str string) bool {
for _, s := range array {
if s == str {
return true
}
}
return false
}
func setupAuthStore(t *testing.T) (store *authStore, teardownfunc func(t *testing.T)) {
b, tPath := backend.NewDefaultTmpBackend()
as := NewAuthStore(b, dummyIndexWaiter)
err := enableAuthAndCreateRoot(as)
if err != nil {
t.Fatal(err)
}
// adds a new role
_, err = as.RoleAdd(&pb.AuthRoleAddRequest{Name: "role-test"})
if err != nil {
t.Fatal(err)
}
ua := &pb.AuthUserAddRequest{Name: "foo", Password: "bar"}
_, err = as.UserAdd(ua) // add a non-existing user
if err != nil {
t.Fatal(err)
}
tearDown := func(t *testing.T) {
b.Close()
os.Remove(tPath)
as.Close()
}
return as, tearDown
}

View File

@ -34,7 +34,7 @@ import (
func TestV2NoRetryEOF(t *testing.T) {
defer testutil.AfterTest(t)
// generate an EOF response; specify address so appears first in sorted ep list
lEOF := integration.NewListenerWithAddr(t, fmt.Sprintf("eof:123.%d.sock", os.Getpid()))
lEOF := integration.NewListenerWithAddr(t, fmt.Sprintf("127.0.0.1:%05d", os.Getpid()))
defer lEOF.Close()
tries := uint32(0)
go func() {
@ -65,8 +65,7 @@ func TestV2NoRetryEOF(t *testing.T) {
// TestV2NoRetryNoLeader tests destructive api calls won't retry if given an error code.
func TestV2NoRetryNoLeader(t *testing.T) {
defer testutil.AfterTest(t)
lHttp := integration.NewListenerWithAddr(t, fmt.Sprintf("errHttp:123.%d.sock", os.Getpid()))
lHttp := integration.NewListenerWithAddr(t, fmt.Sprintf("127.0.0.1:%05d", os.Getpid()))
eh := &errHandler{errCode: http.StatusServiceUnavailable}
srv := httptest.NewUnstartedServer(eh)
defer lHttp.Close()

View File

@ -221,7 +221,8 @@ func (c *Client) dialSetupOpts(endpoint string, dopts ...grpc.DialOption) (opts
return nil, c.ctx.Err()
default:
}
return net.DialTimeout(proto, host, t)
dialer := &net.Dialer{Timeout: t}
return dialer.DialContext(c.ctx, proto, host)
}
opts = append(opts, grpc.WithDialer(f))
@ -281,8 +282,16 @@ func (c *Client) dial(endpoint string, dopts ...grpc.DialOption) (*grpc.ClientCo
tokenMu: &sync.RWMutex{},
}
err := c.getToken(context.TODO())
if err != nil {
ctx := c.ctx
if c.cfg.DialTimeout > 0 {
cctx, cancel := context.WithTimeout(ctx, c.cfg.DialTimeout)
defer cancel()
ctx = cctx
}
if err := c.getToken(ctx); err != nil {
if err == ctx.Err() && ctx.Err() != c.ctx.Err() {
err = grpc.ErrClientConnTimeout
}
return nil, err
}
@ -334,6 +343,8 @@ func newClient(cfg *Config) (*Client, error) {
client.balancer = newSimpleBalancer(cfg.Endpoints)
conn, err := client.dial(cfg.Endpoints[0], grpc.WithBalancer(client.balancer))
if err != nil {
client.cancel()
client.balancer.Close()
return nil, err
}
client.conn = conn
@ -352,6 +363,7 @@ func newClient(cfg *Config) (*Client, error) {
}
if !hasConn {
client.cancel()
client.balancer.Close()
conn.Close()
return nil, grpc.ErrClientConnTimeout
}

View File

@ -16,6 +16,7 @@ package clientv3
import (
"fmt"
"net"
"testing"
"time"
@ -25,36 +26,89 @@ import (
"google.golang.org/grpc"
)
func TestDialTimeout(t *testing.T) {
func TestDialCancel(t *testing.T) {
defer testutil.AfterTest(t)
donec := make(chan error)
go func() {
// without timeout, grpc keeps redialing if connection refused
cfg := Config{
Endpoints: []string{"localhost:12345"},
DialTimeout: 2 * time.Second}
c, err := New(cfg)
if c != nil || err == nil {
t.Errorf("new client should fail")
}
donec <- err
}()
time.Sleep(10 * time.Millisecond)
select {
case err := <-donec:
t.Errorf("dial didn't wait (%v)", err)
default:
// accept first connection so client is created with dial timeout
ln, err := net.Listen("unix", "dialcancel:12345")
if err != nil {
t.Fatal(err)
}
defer ln.Close()
ep := "unix://dialcancel:12345"
cfg := Config{
Endpoints: []string{ep},
DialTimeout: 30 * time.Second}
c, err := New(cfg)
if err != nil {
t.Fatal(err)
}
// connect to ipv4 blackhole so dial blocks
c.SetEndpoints("http://254.0.0.1:12345")
// issue Get to force redial attempts
go c.Get(context.TODO(), "abc")
// wait a little bit so client close is after dial starts
time.Sleep(100 * time.Millisecond)
donec := make(chan struct{})
go func() {
defer close(donec)
c.Close()
}()
select {
case <-time.After(5 * time.Second):
t.Errorf("failed to timeout dial on time")
case err := <-donec:
if err != grpc.ErrClientConnTimeout {
t.Errorf("unexpected error %v, want %v", err, grpc.ErrClientConnTimeout)
t.Fatalf("failed to close")
case <-donec:
}
}
func TestDialTimeout(t *testing.T) {
defer testutil.AfterTest(t)
testCfgs := []Config{
{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: 2 * time.Second,
},
{
Endpoints: []string{"http://254.0.0.1:12345"},
DialTimeout: time.Second,
Username: "abc",
Password: "def",
},
}
for i, cfg := range testCfgs {
donec := make(chan error)
go func() {
// without timeout, dial continues forever on ipv4 blackhole
c, err := New(cfg)
if c != nil || err == nil {
t.Errorf("#%d: new client should fail", i)
}
donec <- err
}()
time.Sleep(10 * time.Millisecond)
select {
case err := <-donec:
t.Errorf("#%d: dial didn't wait (%v)", i, err)
default:
}
select {
case <-time.After(5 * time.Second):
t.Errorf("#%d: failed to timeout dial on time", i)
case err := <-donec:
if err != grpc.ErrClientConnTimeout {
t.Errorf("#%d: unexpected error %v, want %v", i, err, grpc.ErrClientConnTimeout)
}
}
}
}

View File

@ -249,11 +249,10 @@ func (s *stmReadCommitted) commit() *v3.TxnResponse {
}
func isKeyCurrent(k string, r *v3.GetResponse) v3.Cmp {
rev := r.Header.Revision + 1
if len(r.Kvs) != 0 {
rev = r.Kvs[0].ModRevision + 1
return v3.Compare(v3.ModRevision(k), "=", r.Kvs[0].ModRevision)
}
return v3.Compare(v3.ModRevision(k), "<", rev)
return v3.Compare(v3.ModRevision(k), "=", 0)
}
func respToValue(resp *v3.GetResponse) string {

View File

@ -156,6 +156,30 @@ func TestLeaseKeepAlive(t *testing.T) {
}
}
func TestLeaseKeepAliveOneSecond(t *testing.T) {
defer testutil.AfterTest(t)
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer clus.Terminate(t)
cli := clus.Client(0)
resp, err := cli.Grant(context.Background(), 1)
if err != nil {
t.Errorf("failed to create lease %v", err)
}
rc, kerr := cli.KeepAlive(context.Background(), resp.ID)
if kerr != nil {
t.Errorf("failed to keepalive lease %v", kerr)
}
for i := 0; i < 3; i++ {
if _, ok := <-rc; !ok {
t.Errorf("chan is closed, want not closed")
}
}
}
// TODO: add a client that can connect to all the members of cluster via unix sock.
// TODO: test handle more complicated failures.
func TestLeaseKeepAliveHandleFailure(t *testing.T) {

View File

@ -17,5 +17,5 @@ package integration
import "github.com/coreos/pkg/capnslog"
func init() {
capnslog.SetGlobalLogLevel(capnslog.INFO)
capnslog.SetGlobalLogLevel(capnslog.CRITICAL)
}

View File

@ -347,7 +347,57 @@ func putAndWatch(t *testing.T, wctx *watchctx, key, val string) {
}
}
// TestWatchResumeComapcted checks that the watcher gracefully closes in case
func TestWatchResumeInitRev(t *testing.T) {
defer testutil.AfterTest(t)
clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 1})
defer clus.Terminate(t)
cli := clus.Client(0)
if _, err := cli.Put(context.TODO(), "b", "2"); err != nil {
t.Fatal(err)
}
if _, err := cli.Put(context.TODO(), "a", "3"); err != nil {
t.Fatal(err)
}
// if resume is broken, it'll pick up this key first instead of a=3
if _, err := cli.Put(context.TODO(), "a", "4"); err != nil {
t.Fatal(err)
}
wch := clus.Client(0).Watch(context.Background(), "a", clientv3.WithRev(1), clientv3.WithCreatedNotify())
if resp, ok := <-wch; !ok || resp.Header.Revision != 4 {
t.Fatalf("got (%v, %v), expected create notification rev=4", resp, ok)
}
// pause wch
clus.Members[0].DropConnections()
clus.Members[0].PauseConnections()
select {
case resp, ok := <-wch:
t.Skipf("wch should block, got (%+v, %v); drop not fast enough", resp, ok)
case <-time.After(100 * time.Millisecond):
}
// resume wch
clus.Members[0].UnpauseConnections()
select {
case resp, ok := <-wch:
if !ok {
t.Fatal("unexpected watch close")
}
if len(resp.Events) == 0 {
t.Fatal("expected event on watch")
}
if string(resp.Events[0].Kv.Value) != "3" {
t.Fatalf("expected value=3, got event %+v", resp.Events[0])
}
case <-time.After(5 * time.Second):
t.Fatal("watch timed out")
}
}
// TestWatchResumeCompacted checks that the watcher gracefully closes in case
// that it tries to resume to a revision that's been compacted out of the store.
// Since the watcher's server restarts with stale data, the watcher will receive
// either a compaction error or all keys by staying in sync before the compaction

View File

@ -416,7 +416,7 @@ func (l *lessor) recvKeepAlive(resp *pb.LeaseKeepAliveResponse) {
}
// send update to all channels
nextKeepAlive := time.Now().Add(1 + time.Duration(karesp.TTL/3)*time.Second)
nextKeepAlive := time.Now().Add((time.Duration(karesp.TTL) * time.Second) / 3.0)
ka.deadline = time.Now().Add(time.Duration(karesp.TTL) * time.Second)
for _, ch := range ka.chs {
select {

View File

@ -132,6 +132,8 @@ type watchGrpcStream struct {
errc chan error
// closingc gets the watcherStream of closing watchers
closingc chan *watcherStream
// wg is Done when all substream goroutines have exited
wg sync.WaitGroup
// resumec closes to signal that all substreams should begin resuming
resumec chan struct{}
@ -406,7 +408,7 @@ func (w *watchGrpcStream) run() {
for range closing {
w.closeSubstream(<-w.closingc)
}
w.wg.Wait()
w.owner.closeStream(w)
}()
@ -431,6 +433,7 @@ func (w *watchGrpcStream) run() {
}
ws.donec = make(chan struct{})
w.wg.Add(1)
go w.serveSubstream(ws, w.resumec)
// queue up for watcher creation/resume
@ -576,6 +579,7 @@ func (w *watchGrpcStream) serveSubstream(ws *watcherStream, resumec chan struct{
if !resuming {
w.closingc <- ws
}
w.wg.Done()
}()
emptyWr := &WatchResponse{}
@ -612,10 +616,24 @@ func (w *watchGrpcStream) serveSubstream(ws *watcherStream, resumec chan struct{
if ws.initReq.createdNotify {
ws.outc <- *wr
}
// once the watch channel is returned, a current revision
// watch must resume at the store revision. This is necessary
// for the following case to work as expected:
// wch := m1.Watch("a")
// m2.Put("a", "b")
// <-wch
// If the revision is only bound on the first observed event,
// if wch is disconnected before the Put is issued, then reconnects
// after it is committed, it'll miss the Put.
if ws.initReq.rev == 0 {
nextRev = wr.Header.Revision
}
}
} else {
// current progress of watch; <= store revision
nextRev = wr.Header.Revision
}
nextRev = wr.Header.Revision
if len(wr.Events) > 0 {
nextRev = wr.Events[len(wr.Events)-1].Kv.ModRevision + 1
}
@ -674,6 +692,7 @@ func (w *watchGrpcStream) newWatchClient() (pb.Watch_WatchClient, error) {
continue
}
ws.donec = make(chan struct{})
w.wg.Add(1)
go w.serveSubstream(ws, w.resumec)
}
@ -694,6 +713,10 @@ func (w *watchGrpcStream) waitCancelSubstreams(stopc <-chan struct{}) <-chan str
go func(ws *watcherStream) {
defer wg.Done()
if ws.closing {
if ws.initReq.ctx.Err() != nil && ws.outc != nil {
close(ws.outc)
ws.outc = nil
}
return
}
select {

View File

@ -5,3 +5,6 @@ const maxMapSize = 0x7FFFFFFF // 2GB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0xFFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned = false

View File

@ -5,3 +5,6 @@ const maxMapSize = 0xFFFFFFFFFFFF // 256TB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0x7FFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned = false

28
cmd/vendor/github.com/coreos/bbolt/bolt_arm.go generated vendored Normal file
View File

@ -0,0 +1,28 @@
package bolt
import "unsafe"
// maxMapSize represents the largest mmap size supported by Bolt.
const maxMapSize = 0x7FFFFFFF // 2GB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0xFFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned bool
func init() {
// Simple check to see whether this arch handles unaligned load/stores
// correctly.
// ARM9 and older devices require load/stores to be from/to aligned
// addresses. If not, the lower 2 bits are cleared and that address is
// read in a jumbled up order.
// See http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html
raw := [6]byte{0xfe, 0xef, 0x11, 0x22, 0x22, 0x11}
val := *(*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(&raw)) + 2))
brokenUnaligned = val != 0x11222211
}

View File

@ -7,3 +7,6 @@ const maxMapSize = 0xFFFFFFFFFFFF // 256TB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0x7FFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned = false

12
cmd/vendor/github.com/coreos/bbolt/bolt_mips64x.go generated vendored Normal file
View File

@ -0,0 +1,12 @@
// +build mips64 mips64le
package bolt
// maxMapSize represents the largest mmap size supported by Bolt.
const maxMapSize = 0x8000000000 // 512GB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0x7FFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned = false

View File

@ -1,7 +1,12 @@
// +build mips mipsle
package bolt
// maxMapSize represents the largest mmap size supported by Bolt.
const maxMapSize = 0x7FFFFFFF // 2GB
const maxMapSize = 0x40000000 // 1GB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0xFFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned = false

View File

@ -7,3 +7,6 @@ const maxMapSize = 0xFFFFFFFFFFFF // 256TB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0x7FFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned = false

View File

@ -7,3 +7,6 @@ const maxMapSize = 0xFFFFFFFFFFFF // 256TB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0x7FFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned = false

View File

@ -7,3 +7,6 @@ const maxMapSize = 0xFFFFFFFFFFFF // 256TB
// maxAllocSize is the size used when creating array pointers.
const maxAllocSize = 0x7FFFFFFF
// Are unaligned load/stores broken on this arch?
var brokenUnaligned = false

View File

@ -13,29 +13,32 @@ import (
// flock acquires an advisory lock on a file descriptor.
func flock(db *DB, mode os.FileMode, exclusive bool, timeout time.Duration) error {
var t time.Time
if timeout != 0 {
t = time.Now()
}
fd := db.file.Fd()
flag := syscall.LOCK_NB
if exclusive {
flag |= syscall.LOCK_EX
} else {
flag |= syscall.LOCK_SH
}
for {
// If we're beyond our timeout then return an error.
// This can only occur after we've attempted a flock once.
if t.IsZero() {
t = time.Now()
} else if timeout > 0 && time.Since(t) > timeout {
return ErrTimeout
}
flag := syscall.LOCK_SH
if exclusive {
flag = syscall.LOCK_EX
}
// Otherwise attempt to obtain an exclusive lock.
err := syscall.Flock(int(db.file.Fd()), flag|syscall.LOCK_NB)
// Attempt to obtain an exclusive lock.
err := syscall.Flock(int(fd), flag)
if err == nil {
return nil
} else if err != syscall.EWOULDBLOCK {
return err
}
// If we timed out then return an error.
if timeout != 0 && time.Since(t) > timeout-flockRetryTimeout {
return ErrTimeout
}
// Wait for a bit and try again.
time.Sleep(50 * time.Millisecond)
time.Sleep(flockRetryTimeout)
}
}

View File

@ -13,34 +13,33 @@ import (
// flock acquires an advisory lock on a file descriptor.
func flock(db *DB, mode os.FileMode, exclusive bool, timeout time.Duration) error {
var t time.Time
if timeout != 0 {
t = time.Now()
}
fd := db.file.Fd()
var lockType int16
if exclusive {
lockType = syscall.F_WRLCK
} else {
lockType = syscall.F_RDLCK
}
for {
// If we're beyond our timeout then return an error.
// This can only occur after we've attempted a flock once.
if t.IsZero() {
t = time.Now()
} else if timeout > 0 && time.Since(t) > timeout {
return ErrTimeout
}
var lock syscall.Flock_t
lock.Start = 0
lock.Len = 0
lock.Pid = 0
lock.Whence = 0
lock.Pid = 0
if exclusive {
lock.Type = syscall.F_WRLCK
} else {
lock.Type = syscall.F_RDLCK
}
err := syscall.FcntlFlock(db.file.Fd(), syscall.F_SETLK, &lock)
// Attempt to obtain an exclusive lock.
lock := syscall.Flock_t{Type: lockType}
err := syscall.FcntlFlock(fd, syscall.F_SETLK, &lock)
if err == nil {
return nil
} else if err != syscall.EAGAIN {
return err
}
// If we timed out then return an error.
if timeout != 0 && time.Since(t) > timeout-flockRetryTimeout {
return ErrTimeout
}
// Wait for a bit and try again.
time.Sleep(50 * time.Millisecond)
time.Sleep(flockRetryTimeout)
}
}

View File

@ -59,29 +59,30 @@ func flock(db *DB, mode os.FileMode, exclusive bool, timeout time.Duration) erro
db.lockfile = f
var t time.Time
if timeout != 0 {
t = time.Now()
}
fd := f.Fd()
var flag uint32 = flagLockFailImmediately
if exclusive {
flag |= flagLockExclusive
}
for {
// If we're beyond our timeout then return an error.
// This can only occur after we've attempted a flock once.
if t.IsZero() {
t = time.Now()
} else if timeout > 0 && time.Since(t) > timeout {
return ErrTimeout
}
var flag uint32 = flagLockFailImmediately
if exclusive {
flag |= flagLockExclusive
}
err := lockFileEx(syscall.Handle(db.lockfile.Fd()), flag, 0, 1, 0, &syscall.Overlapped{})
// Attempt to obtain an exclusive lock.
err := lockFileEx(syscall.Handle(fd), flag, 0, 1, 0, &syscall.Overlapped{})
if err == nil {
return nil
} else if err != errLockViolation {
return err
}
// If we timed oumercit then return an error.
if timeout != 0 && time.Since(t) > timeout-flockRetryTimeout {
return ErrTimeout
}
// Wait for a bit and try again.
time.Sleep(50 * time.Millisecond)
time.Sleep(flockRetryTimeout)
}
}
@ -89,7 +90,7 @@ func flock(db *DB, mode os.FileMode, exclusive bool, timeout time.Duration) erro
func funlock(db *DB) error {
err := unlockFileEx(syscall.Handle(db.lockfile.Fd()), 0, 1, 0, &syscall.Overlapped{})
db.lockfile.Close()
os.Remove(db.path+lockExt)
os.Remove(db.path + lockExt)
return err
}

View File

@ -14,13 +14,6 @@ const (
MaxValueSize = (1 << 31) - 2
)
const (
maxUint = ^uint(0)
minUint = 0
maxInt = int(^uint(0) >> 1)
minInt = -maxInt - 1
)
const bucketHeaderSize = int(unsafe.Sizeof(bucket{}))
const (
@ -130,9 +123,17 @@ func (b *Bucket) Bucket(name []byte) *Bucket {
func (b *Bucket) openBucket(value []byte) *Bucket {
var child = newBucket(b.tx)
// If unaligned load/stores are broken on this arch and value is
// unaligned simply clone to an aligned byte array.
unaligned := brokenUnaligned && uintptr(unsafe.Pointer(&value[0]))&3 != 0
if unaligned {
value = cloneBytes(value)
}
// If this is a writable transaction then we need to copy the bucket entry.
// Read-only transactions can point directly at the mmap entry.
if b.tx.writable {
if b.tx.writable && !unaligned {
child.bucket = &bucket{}
*child.bucket = *(*bucket)(unsafe.Pointer(&value[0]))
} else {
@ -167,9 +168,8 @@ func (b *Bucket) CreateBucket(key []byte) (*Bucket, error) {
if bytes.Equal(key, k) {
if (flags & bucketLeafFlag) != 0 {
return nil, ErrBucketExists
} else {
return nil, ErrIncompatibleValue
}
return nil, ErrIncompatibleValue
}
// Create empty, inline bucket.
@ -316,7 +316,12 @@ func (b *Bucket) Delete(key []byte) error {
// Move cursor to correct position.
c := b.Cursor()
_, _, flags := c.seek(key)
k, _, flags := c.seek(key)
// Return nil if the key doesn't exist.
if !bytes.Equal(key, k) {
return nil
}
// Return an error if there is already existing bucket value.
if (flags & bucketLeafFlag) != 0 {
@ -329,6 +334,28 @@ func (b *Bucket) Delete(key []byte) error {
return nil
}
// Sequence returns the current integer for the bucket without incrementing it.
func (b *Bucket) Sequence() uint64 { return b.bucket.sequence }
// SetSequence updates the sequence number for the bucket.
func (b *Bucket) SetSequence(v uint64) error {
if b.tx.db == nil {
return ErrTxClosed
} else if !b.Writable() {
return ErrTxNotWritable
}
// Materialize the root node if it hasn't been already so that the
// bucket will be saved during commit.
if b.rootNode == nil {
_ = b.node(b.root, nil)
}
// Increment and return the sequence.
b.bucket.sequence = v
return nil
}
// NextSequence returns an autoincrementing integer for the bucket.
func (b *Bucket) NextSequence() (uint64, error) {
if b.tx.db == nil {

View File

@ -7,8 +7,7 @@ import (
"log"
"os"
"runtime"
"runtime/debug"
"strings"
"sort"
"sync"
"time"
"unsafe"
@ -23,6 +22,8 @@ const version = 2
// Represents a marker value to indicate that a file is a Bolt DB.
const magic uint32 = 0xED0CDAED
const pgidNoFreelist pgid = 0xffffffffffffffff
// IgnoreNoSync specifies whether the NoSync field of a DB is ignored when
// syncing changes to a file. This is required as some operating systems,
// such as OpenBSD, do not have a unified buffer cache (UBC) and writes
@ -39,6 +40,9 @@ const (
// default page size for db is set to the OS page size.
var defaultPageSize = os.Getpagesize()
// The time elapsed between consecutive file locking attempts.
const flockRetryTimeout = 50 * time.Millisecond
// DB represents a collection of buckets persisted to a file on disk.
// All data access is performed through transactions which can be obtained through the DB.
// All the functions on DB will return a ErrDatabaseNotOpen if accessed before Open() is called.
@ -61,6 +65,11 @@ type DB struct {
// THIS IS UNSAFE. PLEASE USE WITH CAUTION.
NoSync bool
// When true, skips syncing freelist to disk. This improves the database
// write performance under normal operation, but requires a full database
// re-sync during recovery.
NoFreelistSync bool
// When true, skips the truncate call when growing the database.
// Setting this to true is only safe on non-ext3/ext4 systems.
// Skipping truncation avoids preallocation of hard drive space and
@ -107,9 +116,11 @@ type DB struct {
opened bool
rwtx *Tx
txs []*Tx
freelist *freelist
stats Stats
freelist *freelist
freelistLoad sync.Once
pagePool sync.Pool
batchMu sync.Mutex
@ -148,14 +159,17 @@ func (db *DB) String() string {
// If the file does not exist then it will be created automatically.
// Passing in nil options will cause Bolt to open the database with the default options.
func Open(path string, mode os.FileMode, options *Options) (*DB, error) {
var db = &DB{opened: true}
db := &DB{
opened: true,
}
// Set default options if no options are provided.
if options == nil {
options = DefaultOptions
}
db.NoSync = options.NoSync
db.NoGrowSync = options.NoGrowSync
db.MmapFlags = options.MmapFlags
db.NoFreelistSync = options.NoFreelistSync
// Set default values for later DB operations.
db.MaxBatchSize = DefaultMaxBatchSize
@ -184,6 +198,7 @@ func Open(path string, mode os.FileMode, options *Options) (*DB, error) {
// The database file is locked using the shared lock (more than one process may
// hold a lock at the same time) otherwise (options.ReadOnly is set).
if err := flock(db, mode, !db.readOnly, options.Timeout); err != nil {
db.lockfile = nil // make 'unused' happy. TODO: rework locks
_ = db.close()
return nil, err
}
@ -191,6 +206,11 @@ func Open(path string, mode os.FileMode, options *Options) (*DB, error) {
// Default values for test hooks
db.ops.writeAt = db.file.WriteAt
if db.pageSize = options.PageSize; db.pageSize == 0 {
// Set the default page size to the OS page size.
db.pageSize = defaultPageSize
}
// Initialize the database if it doesn't exist.
if info, err := db.file.Stat(); err != nil {
return nil, err
@ -202,20 +222,21 @@ func Open(path string, mode os.FileMode, options *Options) (*DB, error) {
} else {
// Read the first meta page to determine the page size.
var buf [0x1000]byte
if _, err := db.file.ReadAt(buf[:], 0); err == nil {
m := db.pageInBuffer(buf[:], 0).meta()
if err := m.validate(); err != nil {
// If we can't read the page size, we can assume it's the same
// as the OS -- since that's how the page size was chosen in the
// first place.
//
// If the first page is invalid and this OS uses a different
// page size than what the database was created with then we
// are out of luck and cannot access the database.
db.pageSize = os.Getpagesize()
} else {
// If we can't read the page size, but can read a page, assume
// it's the same as the OS or one given -- since that's how the
// page size was chosen in the first place.
//
// If the first page is invalid and this OS uses a different
// page size than what the database was created with then we
// are out of luck and cannot access the database.
//
// TODO: scan for next page
if bw, err := db.file.ReadAt(buf[:], 0); err == nil && bw == len(buf) {
if m := db.pageInBuffer(buf[:], 0).meta(); m.validate() == nil {
db.pageSize = int(m.pageSize)
}
} else {
return nil, ErrInvalid
}
}
@ -232,14 +253,50 @@ func Open(path string, mode os.FileMode, options *Options) (*DB, error) {
return nil, err
}
// Read in the freelist.
db.freelist = newFreelist()
db.freelist.read(db.page(db.meta().freelist))
if db.readOnly {
return db, nil
}
db.loadFreelist()
// Flush freelist when transitioning from no sync to sync so
// NoFreelistSync unaware boltdb can open the db later.
if !db.NoFreelistSync && !db.hasSyncedFreelist() {
tx, err := db.Begin(true)
if tx != nil {
err = tx.Commit()
}
if err != nil {
_ = db.close()
return nil, err
}
}
// Mark the database as opened and return.
return db, nil
}
// loadFreelist reads the freelist if it is synced, or reconstructs it
// by scanning the DB if it is not synced. It assumes there are no
// concurrent accesses being made to the freelist.
func (db *DB) loadFreelist() {
db.freelistLoad.Do(func() {
db.freelist = newFreelist()
if !db.hasSyncedFreelist() {
// Reconstruct free list by scanning the DB.
db.freelist.readIDs(db.freepages())
} else {
// Read free list from freelist page.
db.freelist.read(db.page(db.meta().freelist))
}
db.stats.FreePageN = len(db.freelist.ids)
})
}
func (db *DB) hasSyncedFreelist() bool {
return db.meta().freelist != pgidNoFreelist
}
// mmap opens the underlying memory-mapped file and initializes the meta references.
// minsz is the minimum size that the new mmap can be.
func (db *DB) mmap(minsz int) error {
@ -341,9 +398,6 @@ func (db *DB) mmapSize(size int) (int, error) {
// init creates a new database file and initializes its meta pages.
func (db *DB) init() error {
// Set the page size to the OS page size.
db.pageSize = os.Getpagesize()
// Create two meta pages on a buffer.
buf := make([]byte, db.pageSize*4)
for i := 0; i < 2; i++ {
@ -526,21 +580,36 @@ func (db *DB) beginRWTx() (*Tx, error) {
t := &Tx{writable: true}
t.init(db)
db.rwtx = t
db.freePages()
return t, nil
}
// Free any pages associated with closed read-only transactions.
var minid txid = 0xFFFFFFFFFFFFFFFF
for _, t := range db.txs {
if t.meta.txid < minid {
minid = t.meta.txid
}
// freePages releases any pages associated with closed read-only transactions.
func (db *DB) freePages() {
// Free all pending pages prior to earliest open transaction.
sort.Sort(txsById(db.txs))
minid := txid(0xFFFFFFFFFFFFFFFF)
if len(db.txs) > 0 {
minid = db.txs[0].meta.txid
}
if minid > 0 {
db.freelist.release(minid - 1)
}
return t, nil
// Release unused txid extents.
for _, t := range db.txs {
db.freelist.releaseRange(minid, t.meta.txid-1)
minid = t.meta.txid + 1
}
db.freelist.releaseRange(minid, txid(0xFFFFFFFFFFFFFFFF))
// Any page both allocated and freed in an extent is safe to release.
}
type txsById []*Tx
func (t txsById) Len() int { return len(t) }
func (t txsById) Swap(i, j int) { t[i], t[j] = t[j], t[i] }
func (t txsById) Less(i, j int) bool { return t[i].meta.txid < t[j].meta.txid }
// removeTx removes a transaction from the database.
func (db *DB) removeTx(tx *Tx) {
// Release the read lock on the mmap.
@ -552,7 +621,10 @@ func (db *DB) removeTx(tx *Tx) {
// Remove the transaction.
for i, t := range db.txs {
if t == tx {
db.txs = append(db.txs[:i], db.txs[i+1:]...)
last := len(db.txs) - 1
db.txs[i] = db.txs[last]
db.txs[last] = nil
db.txs = db.txs[:last]
break
}
}
@ -630,11 +702,7 @@ func (db *DB) View(fn func(*Tx) error) error {
return err
}
if err := t.Rollback(); err != nil {
return err
}
return nil
return t.Rollback()
}
// Batch calls fn as part of a batch. It behaves similar to Update,
@ -823,7 +891,7 @@ func (db *DB) meta() *meta {
}
// allocate returns a contiguous block of memory starting at a given page.
func (db *DB) allocate(count int) (*page, error) {
func (db *DB) allocate(txid txid, count int) (*page, error) {
// Allocate a temporary buffer for the page.
var buf []byte
if count == 1 {
@ -835,7 +903,7 @@ func (db *DB) allocate(count int) (*page, error) {
p.overflow = uint32(count - 1)
// Use pages from the freelist if they are available.
if p.id = db.freelist.allocate(count); p.id != 0 {
if p.id = db.freelist.allocate(txid, count); p.id != 0 {
return p, nil
}
@ -890,6 +958,38 @@ func (db *DB) IsReadOnly() bool {
return db.readOnly
}
func (db *DB) freepages() []pgid {
tx, err := db.beginTx()
defer func() {
err = tx.Rollback()
if err != nil {
panic("freepages: failed to rollback tx")
}
}()
if err != nil {
panic("freepages: failed to open read only tx")
}
reachable := make(map[pgid]*page)
nofreed := make(map[pgid]bool)
ech := make(chan error)
go func() {
for e := range ech {
panic(fmt.Sprintf("freepages: failed to get all reachable pages (%v)", e))
}
}()
tx.checkBucket(&tx.root, reachable, nofreed, ech)
close(ech)
var fids []pgid
for i := pgid(2); i < db.meta().pgid; i++ {
if _, ok := reachable[i]; !ok {
fids = append(fids, i)
}
}
return fids
}
// Options represents the options that can be set when opening a database.
type Options struct {
// Timeout is the amount of time to wait to obtain a file lock.
@ -900,6 +1000,10 @@ type Options struct {
// Sets the DB.NoGrowSync flag before memory mapping the file.
NoGrowSync bool
// Do not sync freelist to disk. This improves the database write performance
// under normal operation, but requires a full database re-sync during recovery.
NoFreelistSync bool
// Open database in read-only mode. Uses flock(..., LOCK_SH |LOCK_NB) to
// grab a shared lock (UNIX).
ReadOnly bool
@ -916,6 +1020,14 @@ type Options struct {
// If initialMmapSize is smaller than the previous database size,
// it takes no effect.
InitialMmapSize int
// PageSize overrides the default OS page size.
PageSize int
// NoSync sets the initial value of DB.NoSync. Normally this can just be
// set directly on the DB itself when returned from Open(), but this option
// is useful in APIs which expose Options but not the underlying DB.
NoSync bool
}
// DefaultOptions represent the options used if nil options are passed into Open().
@ -952,15 +1064,11 @@ func (s *Stats) Sub(other *Stats) Stats {
diff.PendingPageN = s.PendingPageN
diff.FreeAlloc = s.FreeAlloc
diff.FreelistInuse = s.FreelistInuse
diff.TxN = other.TxN - s.TxN
diff.TxN = s.TxN - other.TxN
diff.TxStats = s.TxStats.Sub(&other.TxStats)
return diff
}
func (s *Stats) add(other *Stats) {
s.TxStats.add(&other.TxStats)
}
type Info struct {
Data uintptr
PageSize int
@ -999,7 +1107,8 @@ func (m *meta) copy(dest *meta) {
func (m *meta) write(p *page) {
if m.root.root >= m.pgid {
panic(fmt.Sprintf("root bucket pgid (%d) above high water mark (%d)", m.root.root, m.pgid))
} else if m.freelist >= m.pgid {
} else if m.freelist >= m.pgid && m.freelist != pgidNoFreelist {
// TODO: reject pgidNoFreeList if !NoFreelistSync
panic(fmt.Sprintf("freelist pgid (%d) above high water mark (%d)", m.freelist, m.pgid))
}
@ -1026,11 +1135,3 @@ func _assert(condition bool, msg string, v ...interface{}) {
panic(fmt.Sprintf("assertion failed: "+msg, v...))
}
}
func warn(v ...interface{}) { fmt.Fprintln(os.Stderr, v...) }
func warnf(msg string, v ...interface{}) { fmt.Fprintf(os.Stderr, msg+"\n", v...) }
func printstack() {
stack := strings.Join(strings.Split(string(debug.Stack()), "\n")[2:], "\n")
fmt.Fprintln(os.Stderr, stack)
}

View File

@ -6,25 +6,40 @@ import (
"unsafe"
)
// txPending holds a list of pgids and corresponding allocation txns
// that are pending to be freed.
type txPending struct {
ids []pgid
alloctx []txid // txids allocating the ids
lastReleaseBegin txid // beginning txid of last matching releaseRange
}
// freelist represents a list of all pages that are available for allocation.
// It also tracks pages that have been freed but are still in use by open transactions.
type freelist struct {
ids []pgid // all free and available free page ids.
pending map[txid][]pgid // mapping of soon-to-be free page ids by tx.
cache map[pgid]bool // fast lookup of all free and pending page ids.
ids []pgid // all free and available free page ids.
allocs map[pgid]txid // mapping of txid that allocated a pgid.
pending map[txid]*txPending // mapping of soon-to-be free page ids by tx.
cache map[pgid]bool // fast lookup of all free and pending page ids.
}
// newFreelist returns an empty, initialized freelist.
func newFreelist() *freelist {
return &freelist{
pending: make(map[txid][]pgid),
allocs: make(map[pgid]txid),
pending: make(map[txid]*txPending),
cache: make(map[pgid]bool),
}
}
// size returns the size of the page after serialization.
func (f *freelist) size() int {
return pageHeaderSize + (int(unsafe.Sizeof(pgid(0))) * f.count())
n := f.count()
if n >= 0xFFFF {
// The first element will be used to store the count. See freelist.write.
n++
}
return pageHeaderSize + (int(unsafe.Sizeof(pgid(0))) * n)
}
// count returns count of pages on the freelist
@ -40,27 +55,26 @@ func (f *freelist) free_count() int {
// pending_count returns count of pending pages
func (f *freelist) pending_count() int {
var count int
for _, list := range f.pending {
count += len(list)
for _, txp := range f.pending {
count += len(txp.ids)
}
return count
}
// all returns a list of all free ids and all pending ids in one sorted list.
func (f *freelist) all() []pgid {
m := make(pgids, 0)
for _, list := range f.pending {
m = append(m, list...)
// copyall copies into dst a list of all free ids and all pending ids in one sorted list.
// f.count returns the minimum length required for dst.
func (f *freelist) copyall(dst []pgid) {
m := make(pgids, 0, f.pending_count())
for _, txp := range f.pending {
m = append(m, txp.ids...)
}
sort.Sort(m)
return pgids(f.ids).merge(m)
mergepgids(dst, f.ids, m)
}
// allocate returns the starting page id of a contiguous list of pages of a given size.
// If a contiguous block cannot be found then 0 is returned.
func (f *freelist) allocate(n int) pgid {
func (f *freelist) allocate(txid txid, n int) pgid {
if len(f.ids) == 0 {
return 0
}
@ -93,7 +107,7 @@ func (f *freelist) allocate(n int) pgid {
for i := pgid(0); i < pgid(n); i++ {
delete(f.cache, initial+i)
}
f.allocs[initial] = txid
return initial
}
@ -110,28 +124,73 @@ func (f *freelist) free(txid txid, p *page) {
}
// Free page and all its overflow pages.
var ids = f.pending[txid]
txp := f.pending[txid]
if txp == nil {
txp = &txPending{}
f.pending[txid] = txp
}
allocTxid, ok := f.allocs[p.id]
if ok {
delete(f.allocs, p.id)
} else if (p.flags & freelistPageFlag) != 0 {
// Freelist is always allocated by prior tx.
allocTxid = txid - 1
}
for id := p.id; id <= p.id+pgid(p.overflow); id++ {
// Verify that page is not already free.
if f.cache[id] {
panic(fmt.Sprintf("page %d already freed", id))
}
// Add to the freelist and cache.
ids = append(ids, id)
txp.ids = append(txp.ids, id)
txp.alloctx = append(txp.alloctx, allocTxid)
f.cache[id] = true
}
f.pending[txid] = ids
}
// release moves all page ids for a transaction id (or older) to the freelist.
func (f *freelist) release(txid txid) {
m := make(pgids, 0)
for tid, ids := range f.pending {
for tid, txp := range f.pending {
if tid <= txid {
// Move transaction's pending pages to the available freelist.
// Don't remove from the cache since the page is still free.
m = append(m, ids...)
m = append(m, txp.ids...)
delete(f.pending, tid)
}
}
sort.Sort(m)
f.ids = pgids(f.ids).merge(m)
}
// releaseRange moves pending pages allocated within an extent [begin,end] to the free list.
func (f *freelist) releaseRange(begin, end txid) {
if begin > end {
return
}
var m pgids
for tid, txp := range f.pending {
if tid < begin || tid > end {
continue
}
// Don't recompute freed pages if ranges haven't updated.
if txp.lastReleaseBegin == begin {
continue
}
for i := 0; i < len(txp.ids); i++ {
if atx := txp.alloctx[i]; atx < begin || atx > end {
continue
}
m = append(m, txp.ids[i])
txp.ids[i] = txp.ids[len(txp.ids)-1]
txp.ids = txp.ids[:len(txp.ids)-1]
txp.alloctx[i] = txp.alloctx[len(txp.alloctx)-1]
txp.alloctx = txp.alloctx[:len(txp.alloctx)-1]
i--
}
txp.lastReleaseBegin = begin
if len(txp.ids) == 0 {
delete(f.pending, tid)
}
}
@ -142,12 +201,29 @@ func (f *freelist) release(txid txid) {
// rollback removes the pages from a given pending tx.
func (f *freelist) rollback(txid txid) {
// Remove page ids from cache.
for _, id := range f.pending[txid] {
delete(f.cache, id)
txp := f.pending[txid]
if txp == nil {
return
}
// Remove pages from pending list.
var m pgids
for i, pgid := range txp.ids {
delete(f.cache, pgid)
tx := txp.alloctx[i]
if tx == 0 {
continue
}
if tx != txid {
// Pending free aborted; restore page back to alloc list.
f.allocs[pgid] = tx
} else {
// Freed page was allocated by this txn; OK to throw away.
m = append(m, pgid)
}
}
// Remove pages from pending list and mark as free if allocated by txid.
delete(f.pending, txid)
sort.Sort(m)
f.ids = pgids(f.ids).merge(m)
}
// freed returns whether a given page is in the free list.
@ -157,6 +233,9 @@ func (f *freelist) freed(pgid pgid) bool {
// read initializes the freelist from a freelist page.
func (f *freelist) read(p *page) {
if (p.flags & freelistPageFlag) == 0 {
panic(fmt.Sprintf("invalid freelist page: %d, page type is %s", p.id, p.typ()))
}
// If the page.count is at the max uint16 value (64k) then it's considered
// an overflow and the size of the freelist is stored as the first element.
idx, count := 0, int(p.count)
@ -169,7 +248,7 @@ func (f *freelist) read(p *page) {
if count == 0 {
f.ids = nil
} else {
ids := ((*[maxAllocSize]pgid)(unsafe.Pointer(&p.ptr)))[idx:count]
ids := ((*[maxAllocSize]pgid)(unsafe.Pointer(&p.ptr)))[idx : idx+count]
f.ids = make([]pgid, len(ids))
copy(f.ids, ids)
@ -181,27 +260,33 @@ func (f *freelist) read(p *page) {
f.reindex()
}
// read initializes the freelist from a given list of ids.
func (f *freelist) readIDs(ids []pgid) {
f.ids = ids
f.reindex()
}
// write writes the page ids onto a freelist page. All free and pending ids are
// saved to disk since in the event of a program crash, all pending ids will
// become free.
func (f *freelist) write(p *page) error {
// Combine the old free pgids and pgids waiting on an open transaction.
ids := f.all()
// Update the header flag.
p.flags |= freelistPageFlag
// The page.count can only hold up to 64k elements so if we overflow that
// number then we handle it by putting the size in the first element.
if len(ids) == 0 {
p.count = uint16(len(ids))
} else if len(ids) < 0xFFFF {
p.count = uint16(len(ids))
copy(((*[maxAllocSize]pgid)(unsafe.Pointer(&p.ptr)))[:], ids)
lenids := f.count()
if lenids == 0 {
p.count = uint16(lenids)
} else if lenids < 0xFFFF {
p.count = uint16(lenids)
f.copyall(((*[maxAllocSize]pgid)(unsafe.Pointer(&p.ptr)))[:])
} else {
p.count = 0xFFFF
((*[maxAllocSize]pgid)(unsafe.Pointer(&p.ptr)))[0] = pgid(len(ids))
copy(((*[maxAllocSize]pgid)(unsafe.Pointer(&p.ptr)))[1:], ids)
((*[maxAllocSize]pgid)(unsafe.Pointer(&p.ptr)))[0] = pgid(lenids)
f.copyall(((*[maxAllocSize]pgid)(unsafe.Pointer(&p.ptr)))[1:])
}
return nil
@ -213,8 +298,8 @@ func (f *freelist) reload(p *page) {
// Build a cache of only pending pages.
pcache := make(map[pgid]bool)
for _, pendingIDs := range f.pending {
for _, pendingID := range pendingIDs {
for _, txp := range f.pending {
for _, pendingID := range txp.ids {
pcache[pendingID] = true
}
}
@ -236,12 +321,12 @@ func (f *freelist) reload(p *page) {
// reindex rebuilds the free cache based on available and pending free lists.
func (f *freelist) reindex() {
f.cache = make(map[pgid]bool)
f.cache = make(map[pgid]bool, len(f.ids))
for _, id := range f.ids {
f.cache[id] = true
}
for _, pendingIDs := range f.pending {
for _, pendingID := range pendingIDs {
for _, txp := range f.pending {
for _, pendingID := range txp.ids {
f.cache[pendingID] = true
}
}

View File

@ -365,7 +365,7 @@ func (n *node) spill() error {
}
// Allocate contiguous space for the node.
p, err := tx.allocate((node.size() / tx.db.pageSize) + 1)
p, err := tx.allocate((node.size() + tx.db.pageSize - 1) / tx.db.pageSize)
if err != nil {
return err
}

View File

@ -145,12 +145,33 @@ func (a pgids) merge(b pgids) pgids {
// Return the opposite slice if one is nil.
if len(a) == 0 {
return b
} else if len(b) == 0 {
}
if len(b) == 0 {
return a
}
merged := make(pgids, len(a)+len(b))
mergepgids(merged, a, b)
return merged
}
// Create a list to hold all elements from both lists.
merged := make(pgids, 0, len(a)+len(b))
// mergepgids copies the sorted union of a and b into dst.
// If dst is too small, it panics.
func mergepgids(dst, a, b pgids) {
if len(dst) < len(a)+len(b) {
panic(fmt.Errorf("mergepgids bad len %d < %d + %d", len(dst), len(a), len(b)))
}
// Copy in the opposite slice if one is nil.
if len(a) == 0 {
copy(dst, b)
return
}
if len(b) == 0 {
copy(dst, a)
return
}
// Merged will hold all elements from both lists.
merged := dst[:0]
// Assign lead to the slice with a lower starting value, follow to the higher value.
lead, follow := a, b
@ -172,7 +193,5 @@ func (a pgids) merge(b pgids) pgids {
}
// Append what's left in follow.
merged = append(merged, follow...)
return merged
_ = append(merged, follow...)
}

View File

@ -126,10 +126,7 @@ func (tx *Tx) DeleteBucket(name []byte) error {
// the error is returned to the caller.
func (tx *Tx) ForEach(fn func(name []byte, b *Bucket) error) error {
return tx.root.ForEach(func(k, v []byte) error {
if err := fn(k, tx.root.Bucket(k)); err != nil {
return err
}
return nil
return fn(k, tx.root.Bucket(k))
})
}
@ -169,28 +166,18 @@ func (tx *Tx) Commit() error {
// Free the old root bucket.
tx.meta.root.root = tx.root.root
opgid := tx.meta.pgid
// Free the freelist and allocate new pages for it. This will overestimate
// the size of the freelist but not underestimate the size (which would be bad).
tx.db.freelist.free(tx.meta.txid, tx.db.page(tx.meta.freelist))
p, err := tx.allocate((tx.db.freelist.size() / tx.db.pageSize) + 1)
if err != nil {
tx.rollback()
return err
// Free the old freelist because commit writes out a fresh freelist.
if tx.meta.freelist != pgidNoFreelist {
tx.db.freelist.free(tx.meta.txid, tx.db.page(tx.meta.freelist))
}
if err := tx.db.freelist.write(p); err != nil {
tx.rollback()
return err
}
tx.meta.freelist = p.id
// If the high water mark has moved up then attempt to grow the database.
if tx.meta.pgid > opgid {
if err := tx.db.grow(int(tx.meta.pgid+1) * tx.db.pageSize); err != nil {
tx.rollback()
if !tx.db.NoFreelistSync {
err := tx.commitFreelist()
if err != nil {
return err
}
} else {
tx.meta.freelist = pgidNoFreelist
}
// Write dirty pages to disk.
@ -235,6 +222,31 @@ func (tx *Tx) Commit() error {
return nil
}
func (tx *Tx) commitFreelist() error {
// Allocate new pages for the new free list. This will overestimate
// the size of the freelist but not underestimate the size (which would be bad).
opgid := tx.meta.pgid
p, err := tx.allocate((tx.db.freelist.size() / tx.db.pageSize) + 1)
if err != nil {
tx.rollback()
return err
}
if err := tx.db.freelist.write(p); err != nil {
tx.rollback()
return err
}
tx.meta.freelist = p.id
// If the high water mark has moved up then attempt to grow the database.
if tx.meta.pgid > opgid {
if err := tx.db.grow(int(tx.meta.pgid+1) * tx.db.pageSize); err != nil {
tx.rollback()
return err
}
}
return nil
}
// Rollback closes the transaction and ignores all previous updates. Read-only
// transactions must be rolled back and not committed.
func (tx *Tx) Rollback() error {
@ -305,7 +317,11 @@ func (tx *Tx) WriteTo(w io.Writer) (n int64, err error) {
if err != nil {
return 0, err
}
defer func() { _ = f.Close() }()
defer func() {
if cerr := f.Close(); err == nil {
err = cerr
}
}()
// Generate a meta page. We use the same page data for both meta pages.
buf := make([]byte, tx.db.pageSize)
@ -333,7 +349,7 @@ func (tx *Tx) WriteTo(w io.Writer) (n int64, err error) {
}
// Move past the meta pages in the file.
if _, err := f.Seek(int64(tx.db.pageSize*2), os.SEEK_SET); err != nil {
if _, err := f.Seek(int64(tx.db.pageSize*2), io.SeekStart); err != nil {
return n, fmt.Errorf("seek: %s", err)
}
@ -344,7 +360,7 @@ func (tx *Tx) WriteTo(w io.Writer) (n int64, err error) {
return n, err
}
return n, f.Close()
return n, nil
}
// CopyFile copies the entire database to file at the given path.
@ -379,9 +395,14 @@ func (tx *Tx) Check() <-chan error {
}
func (tx *Tx) check(ch chan error) {
// Force loading free list if opened in ReadOnly mode.
tx.db.loadFreelist()
// Check if any pages are double freed.
freed := make(map[pgid]bool)
for _, id := range tx.db.freelist.all() {
all := make([]pgid, tx.db.freelist.count())
tx.db.freelist.copyall(all)
for _, id := range all {
if freed[id] {
ch <- fmt.Errorf("page %d: already freed", id)
}
@ -392,8 +413,10 @@ func (tx *Tx) check(ch chan error) {
reachable := make(map[pgid]*page)
reachable[0] = tx.page(0) // meta0
reachable[1] = tx.page(1) // meta1
for i := uint32(0); i <= tx.page(tx.meta.freelist).overflow; i++ {
reachable[tx.meta.freelist+pgid(i)] = tx.page(tx.meta.freelist)
if tx.meta.freelist != pgidNoFreelist {
for i := uint32(0); i <= tx.page(tx.meta.freelist).overflow; i++ {
reachable[tx.meta.freelist+pgid(i)] = tx.page(tx.meta.freelist)
}
}
// Recursively check buckets.
@ -451,7 +474,7 @@ func (tx *Tx) checkBucket(b *Bucket, reachable map[pgid]*page, freed map[pgid]bo
// allocate returns a contiguous block of memory starting at a given page.
func (tx *Tx) allocate(count int) (*page, error) {
p, err := tx.db.allocate(count)
p, err := tx.db.allocate(tx.meta.txid, count)
if err != nil {
return nil, err
}

View File

@ -36,12 +36,7 @@
// Contexts.
package context // import "golang.org/x/net/context"
import (
"errors"
"fmt"
"sync"
"time"
)
import "time"
// A Context carries a deadline, a cancelation signal, and other values across
// API boundaries.
@ -66,7 +61,7 @@ type Context interface {
//
// // Stream generates values with DoSomething and sends them to out
// // until DoSomething returns an error or ctx.Done is closed.
// func Stream(ctx context.Context, out <-chan Value) error {
// func Stream(ctx context.Context, out chan<- Value) error {
// for {
// v, err := DoSomething(ctx)
// if err != nil {
@ -138,48 +133,6 @@ type Context interface {
Value(key interface{}) interface{}
}
// Canceled is the error returned by Context.Err when the context is canceled.
var Canceled = errors.New("context canceled")
// DeadlineExceeded is the error returned by Context.Err when the context's
// deadline passes.
var DeadlineExceeded = errors.New("context deadline exceeded")
// An emptyCtx is never canceled, has no values, and has no deadline. It is not
// struct{}, since vars of this type must have distinct addresses.
type emptyCtx int
func (*emptyCtx) Deadline() (deadline time.Time, ok bool) {
return
}
func (*emptyCtx) Done() <-chan struct{} {
return nil
}
func (*emptyCtx) Err() error {
return nil
}
func (*emptyCtx) Value(key interface{}) interface{} {
return nil
}
func (e *emptyCtx) String() string {
switch e {
case background:
return "context.Background"
case todo:
return "context.TODO"
}
return "unknown empty Context"
}
var (
background = new(emptyCtx)
todo = new(emptyCtx)
)
// Background returns a non-nil, empty Context. It is never canceled, has no
// values, and has no deadline. It is typically used by the main function,
// initialization, and tests, and as the top-level Context for incoming
@ -201,247 +154,3 @@ func TODO() Context {
// A CancelFunc does not wait for the work to stop.
// After the first call, subsequent calls to a CancelFunc do nothing.
type CancelFunc func()
// WithCancel returns a copy of parent with a new Done channel. The returned
// context's Done channel is closed when the returned cancel function is called
// or when the parent context's Done channel is closed, whichever happens first.
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete.
func WithCancel(parent Context) (ctx Context, cancel CancelFunc) {
c := newCancelCtx(parent)
propagateCancel(parent, &c)
return &c, func() { c.cancel(true, Canceled) }
}
// newCancelCtx returns an initialized cancelCtx.
func newCancelCtx(parent Context) cancelCtx {
return cancelCtx{
Context: parent,
done: make(chan struct{}),
}
}
// propagateCancel arranges for child to be canceled when parent is.
func propagateCancel(parent Context, child canceler) {
if parent.Done() == nil {
return // parent is never canceled
}
if p, ok := parentCancelCtx(parent); ok {
p.mu.Lock()
if p.err != nil {
// parent has already been canceled
child.cancel(false, p.err)
} else {
if p.children == nil {
p.children = make(map[canceler]bool)
}
p.children[child] = true
}
p.mu.Unlock()
} else {
go func() {
select {
case <-parent.Done():
child.cancel(false, parent.Err())
case <-child.Done():
}
}()
}
}
// parentCancelCtx follows a chain of parent references until it finds a
// *cancelCtx. This function understands how each of the concrete types in this
// package represents its parent.
func parentCancelCtx(parent Context) (*cancelCtx, bool) {
for {
switch c := parent.(type) {
case *cancelCtx:
return c, true
case *timerCtx:
return &c.cancelCtx, true
case *valueCtx:
parent = c.Context
default:
return nil, false
}
}
}
// removeChild removes a context from its parent.
func removeChild(parent Context, child canceler) {
p, ok := parentCancelCtx(parent)
if !ok {
return
}
p.mu.Lock()
if p.children != nil {
delete(p.children, child)
}
p.mu.Unlock()
}
// A canceler is a context type that can be canceled directly. The
// implementations are *cancelCtx and *timerCtx.
type canceler interface {
cancel(removeFromParent bool, err error)
Done() <-chan struct{}
}
// A cancelCtx can be canceled. When canceled, it also cancels any children
// that implement canceler.
type cancelCtx struct {
Context
done chan struct{} // closed by the first cancel call.
mu sync.Mutex
children map[canceler]bool // set to nil by the first cancel call
err error // set to non-nil by the first cancel call
}
func (c *cancelCtx) Done() <-chan struct{} {
return c.done
}
func (c *cancelCtx) Err() error {
c.mu.Lock()
defer c.mu.Unlock()
return c.err
}
func (c *cancelCtx) String() string {
return fmt.Sprintf("%v.WithCancel", c.Context)
}
// cancel closes c.done, cancels each of c's children, and, if
// removeFromParent is true, removes c from its parent's children.
func (c *cancelCtx) cancel(removeFromParent bool, err error) {
if err == nil {
panic("context: internal error: missing cancel error")
}
c.mu.Lock()
if c.err != nil {
c.mu.Unlock()
return // already canceled
}
c.err = err
close(c.done)
for child := range c.children {
// NOTE: acquiring the child's lock while holding parent's lock.
child.cancel(false, err)
}
c.children = nil
c.mu.Unlock()
if removeFromParent {
removeChild(c.Context, c)
}
}
// WithDeadline returns a copy of the parent context with the deadline adjusted
// to be no later than d. If the parent's deadline is already earlier than d,
// WithDeadline(parent, d) is semantically equivalent to parent. The returned
// context's Done channel is closed when the deadline expires, when the returned
// cancel function is called, or when the parent context's Done channel is
// closed, whichever happens first.
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete.
func WithDeadline(parent Context, deadline time.Time) (Context, CancelFunc) {
if cur, ok := parent.Deadline(); ok && cur.Before(deadline) {
// The current deadline is already sooner than the new one.
return WithCancel(parent)
}
c := &timerCtx{
cancelCtx: newCancelCtx(parent),
deadline: deadline,
}
propagateCancel(parent, c)
d := deadline.Sub(time.Now())
if d <= 0 {
c.cancel(true, DeadlineExceeded) // deadline has already passed
return c, func() { c.cancel(true, Canceled) }
}
c.mu.Lock()
defer c.mu.Unlock()
if c.err == nil {
c.timer = time.AfterFunc(d, func() {
c.cancel(true, DeadlineExceeded)
})
}
return c, func() { c.cancel(true, Canceled) }
}
// A timerCtx carries a timer and a deadline. It embeds a cancelCtx to
// implement Done and Err. It implements cancel by stopping its timer then
// delegating to cancelCtx.cancel.
type timerCtx struct {
cancelCtx
timer *time.Timer // Under cancelCtx.mu.
deadline time.Time
}
func (c *timerCtx) Deadline() (deadline time.Time, ok bool) {
return c.deadline, true
}
func (c *timerCtx) String() string {
return fmt.Sprintf("%v.WithDeadline(%s [%s])", c.cancelCtx.Context, c.deadline, c.deadline.Sub(time.Now()))
}
func (c *timerCtx) cancel(removeFromParent bool, err error) {
c.cancelCtx.cancel(false, err)
if removeFromParent {
// Remove this timerCtx from its parent cancelCtx's children.
removeChild(c.cancelCtx.Context, c)
}
c.mu.Lock()
if c.timer != nil {
c.timer.Stop()
c.timer = nil
}
c.mu.Unlock()
}
// WithTimeout returns WithDeadline(parent, time.Now().Add(timeout)).
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete:
//
// func slowOperationWithTimeout(ctx context.Context) (Result, error) {
// ctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
// defer cancel() // releases resources if slowOperation completes before timeout elapses
// return slowOperation(ctx)
// }
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
return WithDeadline(parent, time.Now().Add(timeout))
}
// WithValue returns a copy of parent in which the value associated with key is
// val.
//
// Use context Values only for request-scoped data that transits processes and
// APIs, not for passing optional parameters to functions.
func WithValue(parent Context, key interface{}, val interface{}) Context {
return &valueCtx{parent, key, val}
}
// A valueCtx carries a key-value pair. It implements Value for that key and
// delegates all other calls to the embedded Context.
type valueCtx struct {
Context
key, val interface{}
}
func (c *valueCtx) String() string {
return fmt.Sprintf("%v.WithValue(%#v, %#v)", c.Context, c.key, c.val)
}
func (c *valueCtx) Value(key interface{}) interface{} {
if c.key == key {
return c.val
}
return c.Context.Value(key)
}

72
cmd/vendor/golang.org/x/net/context/go17.go generated vendored Normal file
View File

@ -0,0 +1,72 @@
// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build go1.7
package context
import (
"context" // standard library's context, as of Go 1.7
"time"
)
var (
todo = context.TODO()
background = context.Background()
)
// Canceled is the error returned by Context.Err when the context is canceled.
var Canceled = context.Canceled
// DeadlineExceeded is the error returned by Context.Err when the context's
// deadline passes.
var DeadlineExceeded = context.DeadlineExceeded
// WithCancel returns a copy of parent with a new Done channel. The returned
// context's Done channel is closed when the returned cancel function is called
// or when the parent context's Done channel is closed, whichever happens first.
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete.
func WithCancel(parent Context) (ctx Context, cancel CancelFunc) {
ctx, f := context.WithCancel(parent)
return ctx, CancelFunc(f)
}
// WithDeadline returns a copy of the parent context with the deadline adjusted
// to be no later than d. If the parent's deadline is already earlier than d,
// WithDeadline(parent, d) is semantically equivalent to parent. The returned
// context's Done channel is closed when the deadline expires, when the returned
// cancel function is called, or when the parent context's Done channel is
// closed, whichever happens first.
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete.
func WithDeadline(parent Context, deadline time.Time) (Context, CancelFunc) {
ctx, f := context.WithDeadline(parent, deadline)
return ctx, CancelFunc(f)
}
// WithTimeout returns WithDeadline(parent, time.Now().Add(timeout)).
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete:
//
// func slowOperationWithTimeout(ctx context.Context) (Result, error) {
// ctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
// defer cancel() // releases resources if slowOperation completes before timeout elapses
// return slowOperation(ctx)
// }
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
return WithDeadline(parent, time.Now().Add(timeout))
}
// WithValue returns a copy of parent in which the value associated with key is
// val.
//
// Use context Values only for request-scoped data that transits processes and
// APIs, not for passing optional parameters to functions.
func WithValue(parent Context, key interface{}, val interface{}) Context {
return context.WithValue(parent, key, val)
}

300
cmd/vendor/golang.org/x/net/context/pre_go17.go generated vendored Normal file
View File

@ -0,0 +1,300 @@
// Copyright 2014 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build !go1.7
package context
import (
"errors"
"fmt"
"sync"
"time"
)
// An emptyCtx is never canceled, has no values, and has no deadline. It is not
// struct{}, since vars of this type must have distinct addresses.
type emptyCtx int
func (*emptyCtx) Deadline() (deadline time.Time, ok bool) {
return
}
func (*emptyCtx) Done() <-chan struct{} {
return nil
}
func (*emptyCtx) Err() error {
return nil
}
func (*emptyCtx) Value(key interface{}) interface{} {
return nil
}
func (e *emptyCtx) String() string {
switch e {
case background:
return "context.Background"
case todo:
return "context.TODO"
}
return "unknown empty Context"
}
var (
background = new(emptyCtx)
todo = new(emptyCtx)
)
// Canceled is the error returned by Context.Err when the context is canceled.
var Canceled = errors.New("context canceled")
// DeadlineExceeded is the error returned by Context.Err when the context's
// deadline passes.
var DeadlineExceeded = errors.New("context deadline exceeded")
// WithCancel returns a copy of parent with a new Done channel. The returned
// context's Done channel is closed when the returned cancel function is called
// or when the parent context's Done channel is closed, whichever happens first.
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete.
func WithCancel(parent Context) (ctx Context, cancel CancelFunc) {
c := newCancelCtx(parent)
propagateCancel(parent, c)
return c, func() { c.cancel(true, Canceled) }
}
// newCancelCtx returns an initialized cancelCtx.
func newCancelCtx(parent Context) *cancelCtx {
return &cancelCtx{
Context: parent,
done: make(chan struct{}),
}
}
// propagateCancel arranges for child to be canceled when parent is.
func propagateCancel(parent Context, child canceler) {
if parent.Done() == nil {
return // parent is never canceled
}
if p, ok := parentCancelCtx(parent); ok {
p.mu.Lock()
if p.err != nil {
// parent has already been canceled
child.cancel(false, p.err)
} else {
if p.children == nil {
p.children = make(map[canceler]bool)
}
p.children[child] = true
}
p.mu.Unlock()
} else {
go func() {
select {
case <-parent.Done():
child.cancel(false, parent.Err())
case <-child.Done():
}
}()
}
}
// parentCancelCtx follows a chain of parent references until it finds a
// *cancelCtx. This function understands how each of the concrete types in this
// package represents its parent.
func parentCancelCtx(parent Context) (*cancelCtx, bool) {
for {
switch c := parent.(type) {
case *cancelCtx:
return c, true
case *timerCtx:
return c.cancelCtx, true
case *valueCtx:
parent = c.Context
default:
return nil, false
}
}
}
// removeChild removes a context from its parent.
func removeChild(parent Context, child canceler) {
p, ok := parentCancelCtx(parent)
if !ok {
return
}
p.mu.Lock()
if p.children != nil {
delete(p.children, child)
}
p.mu.Unlock()
}
// A canceler is a context type that can be canceled directly. The
// implementations are *cancelCtx and *timerCtx.
type canceler interface {
cancel(removeFromParent bool, err error)
Done() <-chan struct{}
}
// A cancelCtx can be canceled. When canceled, it also cancels any children
// that implement canceler.
type cancelCtx struct {
Context
done chan struct{} // closed by the first cancel call.
mu sync.Mutex
children map[canceler]bool // set to nil by the first cancel call
err error // set to non-nil by the first cancel call
}
func (c *cancelCtx) Done() <-chan struct{} {
return c.done
}
func (c *cancelCtx) Err() error {
c.mu.Lock()
defer c.mu.Unlock()
return c.err
}
func (c *cancelCtx) String() string {
return fmt.Sprintf("%v.WithCancel", c.Context)
}
// cancel closes c.done, cancels each of c's children, and, if
// removeFromParent is true, removes c from its parent's children.
func (c *cancelCtx) cancel(removeFromParent bool, err error) {
if err == nil {
panic("context: internal error: missing cancel error")
}
c.mu.Lock()
if c.err != nil {
c.mu.Unlock()
return // already canceled
}
c.err = err
close(c.done)
for child := range c.children {
// NOTE: acquiring the child's lock while holding parent's lock.
child.cancel(false, err)
}
c.children = nil
c.mu.Unlock()
if removeFromParent {
removeChild(c.Context, c)
}
}
// WithDeadline returns a copy of the parent context with the deadline adjusted
// to be no later than d. If the parent's deadline is already earlier than d,
// WithDeadline(parent, d) is semantically equivalent to parent. The returned
// context's Done channel is closed when the deadline expires, when the returned
// cancel function is called, or when the parent context's Done channel is
// closed, whichever happens first.
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete.
func WithDeadline(parent Context, deadline time.Time) (Context, CancelFunc) {
if cur, ok := parent.Deadline(); ok && cur.Before(deadline) {
// The current deadline is already sooner than the new one.
return WithCancel(parent)
}
c := &timerCtx{
cancelCtx: newCancelCtx(parent),
deadline: deadline,
}
propagateCancel(parent, c)
d := deadline.Sub(time.Now())
if d <= 0 {
c.cancel(true, DeadlineExceeded) // deadline has already passed
return c, func() { c.cancel(true, Canceled) }
}
c.mu.Lock()
defer c.mu.Unlock()
if c.err == nil {
c.timer = time.AfterFunc(d, func() {
c.cancel(true, DeadlineExceeded)
})
}
return c, func() { c.cancel(true, Canceled) }
}
// A timerCtx carries a timer and a deadline. It embeds a cancelCtx to
// implement Done and Err. It implements cancel by stopping its timer then
// delegating to cancelCtx.cancel.
type timerCtx struct {
*cancelCtx
timer *time.Timer // Under cancelCtx.mu.
deadline time.Time
}
func (c *timerCtx) Deadline() (deadline time.Time, ok bool) {
return c.deadline, true
}
func (c *timerCtx) String() string {
return fmt.Sprintf("%v.WithDeadline(%s [%s])", c.cancelCtx.Context, c.deadline, c.deadline.Sub(time.Now()))
}
func (c *timerCtx) cancel(removeFromParent bool, err error) {
c.cancelCtx.cancel(false, err)
if removeFromParent {
// Remove this timerCtx from its parent cancelCtx's children.
removeChild(c.cancelCtx.Context, c)
}
c.mu.Lock()
if c.timer != nil {
c.timer.Stop()
c.timer = nil
}
c.mu.Unlock()
}
// WithTimeout returns WithDeadline(parent, time.Now().Add(timeout)).
//
// Canceling this context releases resources associated with it, so code should
// call cancel as soon as the operations running in this Context complete:
//
// func slowOperationWithTimeout(ctx context.Context) (Result, error) {
// ctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
// defer cancel() // releases resources if slowOperation completes before timeout elapses
// return slowOperation(ctx)
// }
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
return WithDeadline(parent, time.Now().Add(timeout))
}
// WithValue returns a copy of parent in which the value associated with key is
// val.
//
// Use context Values only for request-scoped data that transits processes and
// APIs, not for passing optional parameters to functions.
func WithValue(parent Context, key interface{}, val interface{}) Context {
return &valueCtx{parent, key, val}
}
// A valueCtx carries a key-value pair. It implements Value for that key and
// delegates all other calls to the embedded Context.
type valueCtx struct {
Context
key, val interface{}
}
func (c *valueCtx) String() string {
return fmt.Sprintf("%v.WithValue(%#v, %#v)", c.Context, c.key, c.val)
}
func (c *valueCtx) Value(key interface{}) interface{} {
if c.key == key {
return c.val
}
return c.Context.Value(key)
}

View File

@ -18,6 +18,18 @@ type ClientConnPool interface {
MarkDead(*ClientConn)
}
// clientConnPoolIdleCloser is the interface implemented by ClientConnPool
// implementations which can close their idle connections.
type clientConnPoolIdleCloser interface {
ClientConnPool
closeIdleConnections()
}
var (
_ clientConnPoolIdleCloser = (*clientConnPool)(nil)
_ clientConnPoolIdleCloser = noDialClientConnPool{}
)
// TODO: use singleflight for dialing and addConnCalls?
type clientConnPool struct {
t *Transport
@ -40,7 +52,16 @@ const (
noDialOnMiss = false
)
func (p *clientConnPool) getClientConn(_ *http.Request, addr string, dialOnMiss bool) (*ClientConn, error) {
func (p *clientConnPool) getClientConn(req *http.Request, addr string, dialOnMiss bool) (*ClientConn, error) {
if isConnectionCloseRequest(req) && dialOnMiss {
// It gets its own connection.
const singleUse = true
cc, err := p.t.dialClientConn(addr, singleUse)
if err != nil {
return nil, err
}
return cc, nil
}
p.mu.Lock()
for _, cc := range p.conns[addr] {
if cc.CanTakeNewRequest() {
@ -83,7 +104,8 @@ func (p *clientConnPool) getStartDialLocked(addr string) *dialCall {
// run in its own goroutine.
func (c *dialCall) dial(addr string) {
c.res, c.err = c.p.t.dialClientConn(addr)
const singleUse = false // shared conn
c.res, c.err = c.p.t.dialClientConn(addr, singleUse)
close(c.done)
c.p.mu.Lock()
@ -223,3 +245,12 @@ func filterOutClientConn(in []*ClientConn, exclude *ClientConn) []*ClientConn {
}
return out
}
// noDialClientConnPool is an implementation of http2.ClientConnPool
// which never dials. We let the HTTP/1.1 client dial and use its TLS
// connection instead.
type noDialClientConnPool struct{ *clientConnPool }
func (p noDialClientConnPool) GetClientConn(req *http.Request, addr string) (*ClientConn, error) {
return p.getClientConn(req, addr, noDialOnMiss)
}

View File

@ -32,7 +32,7 @@ func configureTransport(t1 *http.Transport) (*Transport, error) {
t1.TLSClientConfig.NextProtos = append(t1.TLSClientConfig.NextProtos, "http/1.1")
}
upgradeFn := func(authority string, c *tls.Conn) http.RoundTripper {
addr := authorityAddr(authority)
addr := authorityAddr("https", authority)
if used, err := connPool.addConnIfNeeded(addr, t2, c); err != nil {
go c.Close()
return erringRoundTripper{err}
@ -67,15 +67,6 @@ func registerHTTPSProtocol(t *http.Transport, rt http.RoundTripper) (err error)
return nil
}
// noDialClientConnPool is an implementation of http2.ClientConnPool
// which never dials. We let the HTTP/1.1 client dial and use its TLS
// connection instead.
type noDialClientConnPool struct{ *clientConnPool }
func (p noDialClientConnPool) GetClientConn(req *http.Request, addr string) (*ClientConn, error) {
return p.getClientConn(req, addr, noDialOnMiss)
}
// noDialH2RoundTripper is a RoundTripper which only tries to complete the request
// if there's already has a cached connection to the host.
type noDialH2RoundTripper struct{ t *Transport }

View File

@ -64,9 +64,17 @@ func (e ConnectionError) Error() string { return fmt.Sprintf("connection error:
type StreamError struct {
StreamID uint32
Code ErrCode
Cause error // optional additional detail
}
func streamError(id uint32, code ErrCode) StreamError {
return StreamError{StreamID: id, Code: code}
}
func (e StreamError) Error() string {
if e.Cause != nil {
return fmt.Sprintf("stream error: stream ID %d; %v; %v", e.StreamID, e.Code, e.Cause)
}
return fmt.Sprintf("stream error: stream ID %d; %v", e.StreamID, e.Code)
}

View File

@ -15,6 +15,7 @@ import (
"sync"
"golang.org/x/net/http2/hpack"
"golang.org/x/net/lex/httplex"
)
const frameHeaderLen = 9
@ -316,10 +317,12 @@ type Framer struct {
// non-Continuation or Continuation on a different stream is
// attempted to be written.
logReads bool
logReads, logWrites bool
debugFramer *Framer // only use for logging written writes
debugFramerBuf *bytes.Buffer
debugFramer *Framer // only use for logging written writes
debugFramerBuf *bytes.Buffer
debugReadLoggerf func(string, ...interface{})
debugWriteLoggerf func(string, ...interface{})
}
func (fr *Framer) maxHeaderListSize() uint32 {
@ -354,7 +357,7 @@ func (f *Framer) endWrite() error {
byte(length>>16),
byte(length>>8),
byte(length))
if logFrameWrites {
if f.logWrites {
f.logWrite()
}
@ -377,10 +380,10 @@ func (f *Framer) logWrite() {
f.debugFramerBuf.Write(f.wbuf)
fr, err := f.debugFramer.ReadFrame()
if err != nil {
log.Printf("http2: Framer %p: failed to decode just-written frame", f)
f.debugWriteLoggerf("http2: Framer %p: failed to decode just-written frame", f)
return
}
log.Printf("http2: Framer %p: wrote %v", f, summarizeFrame(fr))
f.debugWriteLoggerf("http2: Framer %p: wrote %v", f, summarizeFrame(fr))
}
func (f *Framer) writeByte(v byte) { f.wbuf = append(f.wbuf, v) }
@ -398,9 +401,12 @@ const (
// NewFramer returns a Framer that writes frames to w and reads them from r.
func NewFramer(w io.Writer, r io.Reader) *Framer {
fr := &Framer{
w: w,
r: r,
logReads: logFrameReads,
w: w,
r: r,
logReads: logFrameReads,
logWrites: logFrameWrites,
debugReadLoggerf: log.Printf,
debugWriteLoggerf: log.Printf,
}
fr.getReadBuf = func(size uint32) []byte {
if cap(fr.readBuf) >= int(size) {
@ -453,7 +459,7 @@ func terminalReadFrameError(err error) bool {
//
// If the frame is larger than previously set with SetMaxReadFrameSize, the
// returned error is ErrFrameTooLarge. Other errors may be of type
// ConnectionError, StreamError, or anything else from from the underlying
// ConnectionError, StreamError, or anything else from the underlying
// reader.
func (fr *Framer) ReadFrame() (Frame, error) {
fr.errDetail = nil
@ -482,7 +488,7 @@ func (fr *Framer) ReadFrame() (Frame, error) {
return nil, err
}
if fr.logReads {
log.Printf("http2: Framer %p: read %v", fr, summarizeFrame(f))
fr.debugReadLoggerf("http2: Framer %p: read %v", fr, summarizeFrame(f))
}
if fh.Type == FrameHeaders && fr.ReadMetaHeaders != nil {
return fr.readMetaFrame(f.(*HeadersFrame))
@ -590,7 +596,15 @@ func parseDataFrame(fh FrameHeader, payload []byte) (Frame, error) {
return f, nil
}
var errStreamID = errors.New("invalid streamid")
var (
errStreamID = errors.New("invalid stream ID")
errDepStreamID = errors.New("invalid dependent stream ID")
errPadLength = errors.New("pad length too large")
)
func validStreamIDOrZero(streamID uint32) bool {
return streamID&(1<<31) == 0
}
func validStreamID(streamID uint32) bool {
return streamID != 0 && streamID&(1<<31) == 0
@ -599,18 +613,40 @@ func validStreamID(streamID uint32) bool {
// WriteData writes a DATA frame.
//
// It will perform exactly one Write to the underlying Writer.
// It is the caller's responsibility to not call other Write methods concurrently.
// It is the caller's responsibility not to violate the maximum frame size
// and to not call other Write methods concurrently.
func (f *Framer) WriteData(streamID uint32, endStream bool, data []byte) error {
// TODO: ignoring padding for now. will add when somebody cares.
return f.WriteDataPadded(streamID, endStream, data, nil)
}
// WriteData writes a DATA frame with optional padding.
//
// If pad is nil, the padding bit is not sent.
// The length of pad must not exceed 255 bytes.
//
// It will perform exactly one Write to the underlying Writer.
// It is the caller's responsibility not to violate the maximum frame size
// and to not call other Write methods concurrently.
func (f *Framer) WriteDataPadded(streamID uint32, endStream bool, data, pad []byte) error {
if !validStreamID(streamID) && !f.AllowIllegalWrites {
return errStreamID
}
if len(pad) > 255 {
return errPadLength
}
var flags Flags
if endStream {
flags |= FlagDataEndStream
}
if pad != nil {
flags |= FlagDataPadded
}
f.startWrite(FrameData, flags, streamID)
if pad != nil {
f.wbuf = append(f.wbuf, byte(len(pad)))
}
f.wbuf = append(f.wbuf, data...)
f.wbuf = append(f.wbuf, pad...)
return f.endWrite()
}
@ -706,7 +742,7 @@ func (f *Framer) WriteSettings(settings ...Setting) error {
return f.endWrite()
}
// WriteSettings writes an empty SETTINGS frame with the ACK bit set.
// WriteSettingsAck writes an empty SETTINGS frame with the ACK bit set.
//
// It will perform exactly one Write to the underlying Writer.
// It is the caller's responsibility to not call other Write methods concurrently.
@ -832,7 +868,7 @@ func parseWindowUpdateFrame(fh FrameHeader, p []byte) (Frame, error) {
if fh.StreamID == 0 {
return nil, ConnectionError(ErrCodeProtocol)
}
return nil, StreamError{fh.StreamID, ErrCodeProtocol}
return nil, streamError(fh.StreamID, ErrCodeProtocol)
}
return &WindowUpdateFrame{
FrameHeader: fh,
@ -913,7 +949,7 @@ func parseHeadersFrame(fh FrameHeader, p []byte) (_ Frame, err error) {
}
}
if len(p)-int(padLength) <= 0 {
return nil, StreamError{fh.StreamID, ErrCodeProtocol}
return nil, streamError(fh.StreamID, ErrCodeProtocol)
}
hf.headerFragBuf = p[:len(p)-int(padLength)]
return hf, nil
@ -977,8 +1013,8 @@ func (f *Framer) WriteHeaders(p HeadersFrameParam) error {
}
if !p.Priority.IsZero() {
v := p.Priority.StreamDep
if !validStreamID(v) && !f.AllowIllegalWrites {
return errors.New("invalid dependent stream id")
if !validStreamIDOrZero(v) && !f.AllowIllegalWrites {
return errDepStreamID
}
if p.Priority.Exclusive {
v |= 1 << 31
@ -1046,6 +1082,9 @@ func (f *Framer) WritePriority(streamID uint32, p PriorityParam) error {
if !validStreamID(streamID) && !f.AllowIllegalWrites {
return errStreamID
}
if !validStreamIDOrZero(p.StreamDep) {
return errDepStreamID
}
f.startWrite(FramePriority, 0, streamID)
v := p.StreamDep
if p.Exclusive {
@ -1385,7 +1424,10 @@ func (fr *Framer) readMetaFrame(hf *HeadersFrame) (*MetaHeadersFrame, error) {
hdec.SetEmitEnabled(true)
hdec.SetMaxStringLength(fr.maxHeaderStringLen())
hdec.SetEmitFunc(func(hf hpack.HeaderField) {
if !validHeaderFieldValue(hf.Value) {
if VerboseLogs && fr.logReads {
fr.debugReadLoggerf("http2: decoded hpack field %+v", hf)
}
if !httplex.ValidHeaderFieldValue(hf.Value) {
invalid = headerFieldValueError(hf.Value)
}
isPseudo := strings.HasPrefix(hf.Name, ":")
@ -1395,7 +1437,7 @@ func (fr *Framer) readMetaFrame(hf *HeadersFrame) (*MetaHeadersFrame, error) {
}
} else {
sawRegular = true
if !validHeaderFieldName(hf.Name) {
if !validWireHeaderFieldName(hf.Name) {
invalid = headerFieldNameError(hf.Name)
}
}
@ -1443,11 +1485,17 @@ func (fr *Framer) readMetaFrame(hf *HeadersFrame) (*MetaHeadersFrame, error) {
}
if invalid != nil {
fr.errDetail = invalid
return nil, StreamError{mh.StreamID, ErrCodeProtocol}
if VerboseLogs {
log.Printf("http2: invalid header: %v", invalid)
}
return nil, StreamError{mh.StreamID, ErrCodeProtocol, invalid}
}
if err := mh.checkPseudos(); err != nil {
fr.errDetail = err
return nil, StreamError{mh.StreamID, ErrCodeProtocol}
if VerboseLogs {
log.Printf("http2: invalid pseudo headers: %v", err)
}
return nil, StreamError{mh.StreamID, ErrCodeProtocol, err}
}
return mh, nil
}

View File

@ -1,11 +0,0 @@
// Copyright 2015 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build go1.5
package http2
import "net/http"
func requestCancel(req *http.Request) <-chan struct{} { return req.Cancel }

43
cmd/vendor/golang.org/x/net/http2/go16.go generated vendored Normal file
View File

@ -0,0 +1,43 @@
// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build go1.6
package http2
import (
"crypto/tls"
"net/http"
"time"
)
func transportExpectContinueTimeout(t1 *http.Transport) time.Duration {
return t1.ExpectContinueTimeout
}
// isBadCipher reports whether the cipher is blacklisted by the HTTP/2 spec.
func isBadCipher(cipher uint16) bool {
switch cipher {
case tls.TLS_RSA_WITH_RC4_128_SHA,
tls.TLS_RSA_WITH_3DES_EDE_CBC_SHA,
tls.TLS_RSA_WITH_AES_128_CBC_SHA,
tls.TLS_RSA_WITH_AES_256_CBC_SHA,
tls.TLS_RSA_WITH_AES_128_GCM_SHA256,
tls.TLS_RSA_WITH_AES_256_GCM_SHA384,
tls.TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,
tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,
tls.TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,
tls.TLS_ECDHE_RSA_WITH_RC4_128_SHA,
tls.TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,
tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA:
// Reject cipher suites from Appendix A.
// "This list includes those cipher suites that do not
// offer an ephemeral key exchange and those that are
// based on the TLS null, stream or block cipher type"
return true
default:
return false
}
}

106
cmd/vendor/golang.org/x/net/http2/go17.go generated vendored Normal file
View File

@ -0,0 +1,106 @@
// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build go1.7
package http2
import (
"context"
"net"
"net/http"
"net/http/httptrace"
"time"
)
type contextContext interface {
context.Context
}
func serverConnBaseContext(c net.Conn, opts *ServeConnOpts) (ctx contextContext, cancel func()) {
ctx, cancel = context.WithCancel(context.Background())
ctx = context.WithValue(ctx, http.LocalAddrContextKey, c.LocalAddr())
if hs := opts.baseConfig(); hs != nil {
ctx = context.WithValue(ctx, http.ServerContextKey, hs)
}
return
}
func contextWithCancel(ctx contextContext) (_ contextContext, cancel func()) {
return context.WithCancel(ctx)
}
func requestWithContext(req *http.Request, ctx contextContext) *http.Request {
return req.WithContext(ctx)
}
type clientTrace httptrace.ClientTrace
func reqContext(r *http.Request) context.Context { return r.Context() }
func (t *Transport) idleConnTimeout() time.Duration {
if t.t1 != nil {
return t.t1.IdleConnTimeout
}
return 0
}
func setResponseUncompressed(res *http.Response) { res.Uncompressed = true }
func traceGotConn(req *http.Request, cc *ClientConn) {
trace := httptrace.ContextClientTrace(req.Context())
if trace == nil || trace.GotConn == nil {
return
}
ci := httptrace.GotConnInfo{Conn: cc.tconn}
cc.mu.Lock()
ci.Reused = cc.nextStreamID > 1
ci.WasIdle = len(cc.streams) == 0 && ci.Reused
if ci.WasIdle && !cc.lastActive.IsZero() {
ci.IdleTime = time.Now().Sub(cc.lastActive)
}
cc.mu.Unlock()
trace.GotConn(ci)
}
func traceWroteHeaders(trace *clientTrace) {
if trace != nil && trace.WroteHeaders != nil {
trace.WroteHeaders()
}
}
func traceGot100Continue(trace *clientTrace) {
if trace != nil && trace.Got100Continue != nil {
trace.Got100Continue()
}
}
func traceWait100Continue(trace *clientTrace) {
if trace != nil && trace.Wait100Continue != nil {
trace.Wait100Continue()
}
}
func traceWroteRequest(trace *clientTrace, err error) {
if trace != nil && trace.WroteRequest != nil {
trace.WroteRequest(httptrace.WroteRequestInfo{Err: err})
}
}
func traceFirstResponseByte(trace *clientTrace) {
if trace != nil && trace.GotFirstResponseByte != nil {
trace.GotFirstResponseByte()
}
}
func requestTrace(req *http.Request) *clientTrace {
trace := httptrace.ContextClientTrace(req.Context())
return (*clientTrace)(trace)
}
// Ping sends a PING frame to the server and waits for the ack.
func (cc *ClientConn) Ping(ctx context.Context) error {
return cc.ping(ctx)
}

36
cmd/vendor/golang.org/x/net/http2/go17_not18.go generated vendored Normal file
View File

@ -0,0 +1,36 @@
// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build go1.7,!go1.8
package http2
import "crypto/tls"
// temporary copy of Go 1.7's private tls.Config.clone:
func cloneTLSConfig(c *tls.Config) *tls.Config {
return &tls.Config{
Rand: c.Rand,
Time: c.Time,
Certificates: c.Certificates,
NameToCertificate: c.NameToCertificate,
GetCertificate: c.GetCertificate,
RootCAs: c.RootCAs,
NextProtos: c.NextProtos,
ServerName: c.ServerName,
ClientAuth: c.ClientAuth,
ClientCAs: c.ClientCAs,
InsecureSkipVerify: c.InsecureSkipVerify,
CipherSuites: c.CipherSuites,
PreferServerCipherSuites: c.PreferServerCipherSuites,
SessionTicketsDisabled: c.SessionTicketsDisabled,
SessionTicketKey: c.SessionTicketKey,
ClientSessionCache: c.ClientSessionCache,
MinVersion: c.MinVersion,
MaxVersion: c.MaxVersion,
CurvePreferences: c.CurvePreferences,
DynamicRecordSizingDisabled: c.DynamicRecordSizingDisabled,
Renegotiation: c.Renegotiation,
}
}

50
cmd/vendor/golang.org/x/net/http2/go18.go generated vendored Normal file
View File

@ -0,0 +1,50 @@
// Copyright 2015 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build go1.8
package http2
import (
"crypto/tls"
"io"
"net/http"
)
func cloneTLSConfig(c *tls.Config) *tls.Config { return c.Clone() }
var _ http.Pusher = (*responseWriter)(nil)
// Push implements http.Pusher.
func (w *responseWriter) Push(target string, opts *http.PushOptions) error {
internalOpts := pushOptions{}
if opts != nil {
internalOpts.Method = opts.Method
internalOpts.Header = opts.Header
}
return w.push(target, internalOpts)
}
func configureServer18(h1 *http.Server, h2 *Server) error {
if h2.IdleTimeout == 0 {
if h1.IdleTimeout != 0 {
h2.IdleTimeout = h1.IdleTimeout
} else {
h2.IdleTimeout = h1.ReadTimeout
}
}
return nil
}
func shouldLogPanic(panicValue interface{}) bool {
return panicValue != nil && panicValue != http.ErrAbortHandler
}
func reqGetBody(req *http.Request) func() (io.ReadCloser, error) {
return req.GetBody
}
func reqBodyIsNoBody(body io.ReadCloser) bool {
return body == http.NoBody
}

View File

@ -43,7 +43,7 @@ type HeaderField struct {
// IsPseudo reports whether the header field is an http2 pseudo header.
// That is, it reports whether it starts with a colon.
// It is not otherwise guaranteed to be a valid psuedo header field,
// It is not otherwise guaranteed to be a valid pseudo header field,
// though.
func (hf HeaderField) IsPseudo() bool {
return len(hf.Name) != 0 && hf.Name[0] == ':'
@ -57,7 +57,7 @@ func (hf HeaderField) String() string {
return fmt.Sprintf("header field %q = %q%s", hf.Name, hf.Value, suffix)
}
// Size returns the size of an entry per RFC 7540 section 5.2.
// Size returns the size of an entry per RFC 7541 section 4.1.
func (hf HeaderField) Size() uint32 {
// http://http2.github.io/http2-spec/compression.html#rfc.section.4.1
// "The size of the dynamic table is the sum of the size of

View File

@ -48,12 +48,16 @@ var ErrInvalidHuffman = errors.New("hpack: invalid Huffman-encoded data")
// maxLen bytes will return ErrStringLength.
func huffmanDecode(buf *bytes.Buffer, maxLen int, v []byte) error {
n := rootHuffmanNode
cur, nbits := uint(0), uint8(0)
// cur is the bit buffer that has not been fed into n.
// cbits is the number of low order bits in cur that are valid.
// sbits is the number of bits of the symbol prefix being decoded.
cur, cbits, sbits := uint(0), uint8(0), uint8(0)
for _, b := range v {
cur = cur<<8 | uint(b)
nbits += 8
for nbits >= 8 {
idx := byte(cur >> (nbits - 8))
cbits += 8
sbits += 8
for cbits >= 8 {
idx := byte(cur >> (cbits - 8))
n = n.children[idx]
if n == nil {
return ErrInvalidHuffman
@ -63,22 +67,40 @@ func huffmanDecode(buf *bytes.Buffer, maxLen int, v []byte) error {
return ErrStringLength
}
buf.WriteByte(n.sym)
nbits -= n.codeLen
cbits -= n.codeLen
n = rootHuffmanNode
sbits = cbits
} else {
nbits -= 8
cbits -= 8
}
}
}
for nbits > 0 {
n = n.children[byte(cur<<(8-nbits))]
if n.children != nil || n.codeLen > nbits {
for cbits > 0 {
n = n.children[byte(cur<<(8-cbits))]
if n == nil {
return ErrInvalidHuffman
}
if n.children != nil || n.codeLen > cbits {
break
}
if maxLen != 0 && buf.Len() == maxLen {
return ErrStringLength
}
buf.WriteByte(n.sym)
nbits -= n.codeLen
cbits -= n.codeLen
n = rootHuffmanNode
sbits = cbits
}
if sbits > 7 {
// Either there was an incomplete symbol, or overlong padding.
// Both are decoding errors per RFC 7541 section 5.2.
return ErrInvalidHuffman
}
if mask := uint(1<<cbits - 1); cur&mask != mask {
// Trailing bits must be a prefix of EOS per RFC 7541 section 5.2.
return ErrInvalidHuffman
}
return nil
}

View File

@ -13,7 +13,8 @@
// See https://http2.github.io/ for more information on HTTP/2.
//
// See https://http2.golang.org/ for a test server running this code.
package http2
//
package http2 // import "golang.org/x/net/http2"
import (
"bufio"
@ -23,15 +24,19 @@ import (
"io"
"net/http"
"os"
"sort"
"strconv"
"strings"
"sync"
"golang.org/x/net/lex/httplex"
)
var (
VerboseLogs bool
logFrameWrites bool
logFrameReads bool
inTests bool
)
func init() {
@ -73,13 +78,23 @@ var (
type streamState int
// HTTP/2 stream states.
//
// See http://tools.ietf.org/html/rfc7540#section-5.1.
//
// For simplicity, the server code merges "reserved (local)" into
// "half-closed (remote)". This is one less state transition to track.
// The only downside is that we send PUSH_PROMISEs slightly less
// liberally than allowable. More discussion here:
// https://lists.w3.org/Archives/Public/ietf-http-wg/2016JulSep/0599.html
//
// "reserved (remote)" is omitted since the client code does not
// support server push.
const (
stateIdle streamState = iota
stateOpen
stateHalfClosedLocal
stateHalfClosedRemote
stateResvLocal
stateResvRemote
stateClosed
)
@ -88,8 +103,6 @@ var stateName = [...]string{
stateOpen: "Open",
stateHalfClosedLocal: "HalfClosedLocal",
stateHalfClosedRemote: "HalfClosedRemote",
stateResvLocal: "ResvLocal",
stateResvRemote: "ResvRemote",
stateClosed: "Closed",
}
@ -165,57 +178,23 @@ var (
errInvalidHeaderFieldValue = errors.New("http2: invalid header field value")
)
// validHeaderFieldName reports whether v is a valid header field name (key).
// RFC 7230 says:
// header-field = field-name ":" OWS field-value OWS
// field-name = token
// tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
// "^" / "_" / "
// validWireHeaderFieldName reports whether v is a valid header field
// name (key). See httplex.ValidHeaderName for the base rules.
//
// Further, http2 says:
// "Just as in HTTP/1.x, header field names are strings of ASCII
// characters that are compared in a case-insensitive
// fashion. However, header field names MUST be converted to
// lowercase prior to their encoding in HTTP/2. "
func validHeaderFieldName(v string) bool {
func validWireHeaderFieldName(v string) bool {
if len(v) == 0 {
return false
}
for _, r := range v {
if int(r) >= len(isTokenTable) || ('A' <= r && r <= 'Z') {
if !httplex.IsTokenRune(r) {
return false
}
if !isTokenTable[byte(r)] {
return false
}
}
return true
}
// validHeaderFieldValue reports whether v is a valid header field value.
//
// RFC 7230 says:
// field-value = *( field-content / obs-fold )
// obj-fold = N/A to http2, and deprecated
// field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]
// field-vchar = VCHAR / obs-text
// obs-text = %x80-FF
// VCHAR = "any visible [USASCII] character"
//
// http2 further says: "Similarly, HTTP/2 allows header field values
// that are not valid. While most of the values that can be encoded
// will not alter header field parsing, carriage return (CR, ASCII
// 0xd), line feed (LF, ASCII 0xa), and the zero character (NUL, ASCII
// 0x0) might be exploited by an attacker if they are translated
// verbatim. Any request or response that contains a character not
// permitted in a header field value MUST be treated as malformed
// (Section 8.1.2.6). Valid characters are defined by the
// field-content ABNF rule in Section 3.2 of [RFC7230]."
//
// This function does not (yet?) properly handle the rejection of
// strings that begin or end with SP or HTAB.
func validHeaderFieldValue(v string) bool {
for i := 0; i < len(v); i++ {
if b := v[i]; b < ' ' && b != '\t' || b == 0x7f {
if 'A' <= r && r <= 'Z' {
return false
}
}
@ -283,14 +262,27 @@ func newBufferedWriter(w io.Writer) *bufferedWriter {
return &bufferedWriter{w: w}
}
// bufWriterPoolBufferSize is the size of bufio.Writer's
// buffers created using bufWriterPool.
//
// TODO: pick a less arbitrary value? this is a bit under
// (3 x typical 1500 byte MTU) at least. Other than that,
// not much thought went into it.
const bufWriterPoolBufferSize = 4 << 10
var bufWriterPool = sync.Pool{
New: func() interface{} {
// TODO: pick something better? this is a bit under
// (3 x typical 1500 byte MTU) at least.
return bufio.NewWriterSize(nil, 4<<10)
return bufio.NewWriterSize(nil, bufWriterPoolBufferSize)
},
}
func (w *bufferedWriter) Available() int {
if w.bw == nil {
return bufWriterPoolBufferSize
}
return w.bw.Available()
}
func (w *bufferedWriter) Write(p []byte) (n int, err error) {
if w.bw == nil {
bw := bufWriterPool.Get().(*bufio.Writer)
@ -320,7 +312,7 @@ func mustUint31(v int32) uint32 {
}
// bodyAllowedForStatus reports whether a given response status code
// permits a body. See RFC2616, section 4.4.
// permits a body. See RFC 2616, section 4.4.
func bodyAllowedForStatus(status int) bool {
switch {
case status >= 100 && status <= 199:
@ -344,86 +336,52 @@ func (e *httpError) Temporary() bool { return true }
var errTimeout error = &httpError{msg: "http2: timeout awaiting response headers", timeout: true}
var isTokenTable = [127]bool{
'!': true,
'#': true,
'$': true,
'%': true,
'&': true,
'\'': true,
'*': true,
'+': true,
'-': true,
'.': true,
'0': true,
'1': true,
'2': true,
'3': true,
'4': true,
'5': true,
'6': true,
'7': true,
'8': true,
'9': true,
'A': true,
'B': true,
'C': true,
'D': true,
'E': true,
'F': true,
'G': true,
'H': true,
'I': true,
'J': true,
'K': true,
'L': true,
'M': true,
'N': true,
'O': true,
'P': true,
'Q': true,
'R': true,
'S': true,
'T': true,
'U': true,
'W': true,
'V': true,
'X': true,
'Y': true,
'Z': true,
'^': true,
'_': true,
'`': true,
'a': true,
'b': true,
'c': true,
'd': true,
'e': true,
'f': true,
'g': true,
'h': true,
'i': true,
'j': true,
'k': true,
'l': true,
'm': true,
'n': true,
'o': true,
'p': true,
'q': true,
'r': true,
's': true,
't': true,
'u': true,
'v': true,
'w': true,
'x': true,
'y': true,
'z': true,
'|': true,
'~': true,
}
type connectionStater interface {
ConnectionState() tls.ConnectionState
}
var sorterPool = sync.Pool{New: func() interface{} { return new(sorter) }}
type sorter struct {
v []string // owned by sorter
}
func (s *sorter) Len() int { return len(s.v) }
func (s *sorter) Swap(i, j int) { s.v[i], s.v[j] = s.v[j], s.v[i] }
func (s *sorter) Less(i, j int) bool { return s.v[i] < s.v[j] }
// Keys returns the sorted keys of h.
//
// The returned slice is only valid until s used again or returned to
// its pool.
func (s *sorter) Keys(h http.Header) []string {
keys := s.v[:0]
for k := range h {
keys = append(keys, k)
}
s.v = keys
sort.Sort(s)
return keys
}
func (s *sorter) SortStrings(ss []string) {
// Our sorter works on s.v, which sorter owns, so
// stash it away while we sort the user's buffer.
save := s.v
s.v = ss
sort.Sort(s)
s.v = save
}
// validPseudoPath reports whether v is a valid :path pseudo-header
// value. It must be either:
//
// *) a non-empty string starting with '/', but not with with "//",
// *) the string '*', for OPTIONS requests.
//
// For now this is only used a quick check for deciding when to clean
// up Opaque URLs before sending requests from the Transport.
// See golang.org/issue/16847
func validPseudoPath(v string) bool {
return (len(v) > 0 && v[0] == '/' && (len(v) == 1 || v[1] != '/')) || v == "*"
}

View File

@ -1,11 +0,0 @@
// Copyright 2015 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build !go1.5
package http2
import "net/http"
func requestCancel(req *http.Request) <-chan struct{} { return nil }

View File

@ -6,8 +6,41 @@
package http2
import "net/http"
import (
"crypto/tls"
"net/http"
"time"
)
func configureTransport(t1 *http.Transport) (*Transport, error) {
return nil, errTransportVersion
}
func transportExpectContinueTimeout(t1 *http.Transport) time.Duration {
return 0
}
// isBadCipher reports whether the cipher is blacklisted by the HTTP/2 spec.
func isBadCipher(cipher uint16) bool {
switch cipher {
case tls.TLS_RSA_WITH_RC4_128_SHA,
tls.TLS_RSA_WITH_3DES_EDE_CBC_SHA,
tls.TLS_RSA_WITH_AES_128_CBC_SHA,
tls.TLS_RSA_WITH_AES_256_CBC_SHA,
tls.TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,
tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,
tls.TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,
tls.TLS_ECDHE_RSA_WITH_RC4_128_SHA,
tls.TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,
tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA:
// Reject cipher suites from Appendix A.
// "This list includes those cipher suites that do not
// offer an ephemeral key exchange and those that are
// based on the TLS null, stream or block cipher type"
return true
default:
return false
}
}

87
cmd/vendor/golang.org/x/net/http2/not_go17.go generated vendored Normal file
View File

@ -0,0 +1,87 @@
// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build !go1.7
package http2
import (
"crypto/tls"
"net"
"net/http"
"time"
)
type contextContext interface {
Done() <-chan struct{}
Err() error
}
type fakeContext struct{}
func (fakeContext) Done() <-chan struct{} { return nil }
func (fakeContext) Err() error { panic("should not be called") }
func reqContext(r *http.Request) fakeContext {
return fakeContext{}
}
func setResponseUncompressed(res *http.Response) {
// Nothing.
}
type clientTrace struct{}
func requestTrace(*http.Request) *clientTrace { return nil }
func traceGotConn(*http.Request, *ClientConn) {}
func traceFirstResponseByte(*clientTrace) {}
func traceWroteHeaders(*clientTrace) {}
func traceWroteRequest(*clientTrace, error) {}
func traceGot100Continue(trace *clientTrace) {}
func traceWait100Continue(trace *clientTrace) {}
func nop() {}
func serverConnBaseContext(c net.Conn, opts *ServeConnOpts) (ctx contextContext, cancel func()) {
return nil, nop
}
func contextWithCancel(ctx contextContext) (_ contextContext, cancel func()) {
return ctx, nop
}
func requestWithContext(req *http.Request, ctx contextContext) *http.Request {
return req
}
// temporary copy of Go 1.6's private tls.Config.clone:
func cloneTLSConfig(c *tls.Config) *tls.Config {
return &tls.Config{
Rand: c.Rand,
Time: c.Time,
Certificates: c.Certificates,
NameToCertificate: c.NameToCertificate,
GetCertificate: c.GetCertificate,
RootCAs: c.RootCAs,
NextProtos: c.NextProtos,
ServerName: c.ServerName,
ClientAuth: c.ClientAuth,
ClientCAs: c.ClientCAs,
InsecureSkipVerify: c.InsecureSkipVerify,
CipherSuites: c.CipherSuites,
PreferServerCipherSuites: c.PreferServerCipherSuites,
SessionTicketsDisabled: c.SessionTicketsDisabled,
SessionTicketKey: c.SessionTicketKey,
ClientSessionCache: c.ClientSessionCache,
MinVersion: c.MinVersion,
MaxVersion: c.MaxVersion,
CurvePreferences: c.CurvePreferences,
}
}
func (cc *ClientConn) Ping(ctx contextContext) error {
return cc.ping(ctx)
}
func (t *Transport) idleConnTimeout() time.Duration { return 0 }

27
cmd/vendor/golang.org/x/net/http2/not_go18.go generated vendored Normal file
View File

@ -0,0 +1,27 @@
// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build !go1.8
package http2
import (
"io"
"net/http"
)
func configureServer18(h1 *http.Server, h2 *Server) error {
// No IdleTimeout to sync prior to Go 1.8.
return nil
}
func shouldLogPanic(panicValue interface{}) bool {
return panicValue != nil
}
func reqGetBody(req *http.Request) func() (io.ReadCloser, error) {
return nil
}
func reqBodyIsNoBody(io.ReadCloser) bool { return false }

View File

@ -29,6 +29,12 @@ type pipeBuffer interface {
io.Reader
}
func (p *pipe) Len() int {
p.mu.Lock()
defer p.mu.Unlock()
return p.b.Len()
}
// Read waits until data is available and copies bytes
// from the buffer into p.
func (p *pipe) Read(d []byte) (n int, err error) {

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

Some files were not shown because too many files have changed in this diff Show More