Commit Graph

224 Commits

Author SHA1 Message Date
3221454cab etcdserver: remove possibly compacted entry look-up
Fix https://github.com/coreos/etcd/issues/7470.

This patch removes an unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger a panic
if the raft entry at etcdProgress.appliedi has been compacted
by subsequent 'MsgSnap' messages. If a follower is slow
(in this case, because of network latency spikes), it
can receive subsequent 'MsgSnap' requests from the leader.

The etcd server-side 'applyAll' routine and raft's Ready
processing routine become asynchronous after raft
entries are persisted. Given that the raft Ready routine
takes less time to finish, a second 'MsgSnap' can be
handled while the slow 'applyAll' is still processing
the first (old) 'MsgSnap'. The raft Ready routine can then
compact log entries at indexes ahead of 'applyAll'.
That is how 'createMergedSnapshotMessage' ended up
looking up the raft term with an outdated etcdProgress.appliedi.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-18 07:56:18 -07:00
6a0a0a7ea1 etcdserver: make snaptest fail fast 2016-11-03 14:44:08 -07:00
136c02da71 Merge pull request #6738 from gyuho/raft-cleanup
etcdserver: move 'EtcdServer.send' to raft.go
2016-10-31 15:15:08 -07:00
5bd00ab1f6 *: fix minor typos 2016-10-31 09:47:15 -07:00
6ec03d3f7c etcdserver: move 'EtcdServer.send' to raft.go
Clear 'TODO'
2016-10-26 16:26:00 -07:00
0c61d8804a etcdserver: make WaitGroup.Add sync with Wait 2016-10-12 13:11:35 -07:00
289e3c0c63 etcdserver: use stream recorder for TestPublishRetry
Fixes #6546
2016-09-30 15:43:32 -07:00
3866e78c26 etcdserver: tighten up goroutine management
All outstanding goroutines now go into the etcdserver waitgroup. Goroutines are
shut down with a "stopping" channel, which is closed when the run() goroutine
shuts down. The done channel closes only once the waitgroup is completely cleared.
2016-09-19 12:10:41 -07:00
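
The shutdown scheme described in 3866e78c26 can be sketched roughly as follows. This is a minimal illustration, not the etcdserver code: the `server` type, its field names, and the `goAttach` and `runWorkers` helpers are hypothetical. It assumes every goroutine is attached before `Stop` is called, so `WaitGroup.Add` never races with `Wait`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// server is a hypothetical stand-in for EtcdServer; all names are illustrative.
type server struct {
	wg       sync.WaitGroup
	stop     chan struct{} // closed by Stop to ask run() to exit
	stopping chan struct{} // closed when run() shuts down
	done     chan struct{} // closed once the waitgroup is fully cleared
}

func newServer() *server {
	return &server{
		stop:     make(chan struct{}),
		stopping: make(chan struct{}),
		done:     make(chan struct{}),
	}
}

// goAttach starts f in a goroutine tracked by the server waitgroup.
func (s *server) goAttach(f func()) {
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		f()
	}()
}

// run blocks until Stop is called, then signals workers and waits for them.
func (s *server) run() {
	defer func() {
		close(s.stopping) // tell every attached goroutine to exit
		s.wg.Wait()       // wait until all of them have returned
		close(s.done)     // only now report that shutdown is complete
	}()
	<-s.stop
}

// Stop requests shutdown and blocks until all goroutines are gone.
func (s *server) Stop() {
	close(s.stop)
	<-s.done
}

// runWorkers attaches n workers, stops the server, and returns how many
// workers had exited by the time Stop returned.
func runWorkers(n int) int32 {
	s := newServer()
	var exited int32
	for i := 0; i < n; i++ {
		s.goAttach(func() {
			<-s.stopping
			atomic.AddInt32(&exited, 1)
		})
	}
	go s.run()
	s.Stop()
	return atomic.LoadInt32(&exited)
}

func main() {
	fmt.Println("workers exited before done closed:", runWorkers(3))
}
```

Because `done` closes only after `wg.Wait()` returns, a caller that returns from `Stop` can assume no attached goroutine is still running.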
a56cb82180 etcdserver: add TransferLeadership for raft.Node 2016-08-10 16:26:11 -07:00
cfe09d34b8 etcdserver: don't race when waiting for store in TestSnapshot 2016-07-27 15:37:27 -07:00
1c5754f02d raft: fix readindex 2016-07-19 15:00:58 -07:00
3451623c71 etcdserver: fix TestSnap 2016-07-06 10:30:15 -07:00
6f2e7875aa etcdctl: add migrate command
Migrate command accepts a datadir and an optional user-provided
transformer function that transforms v2 keys to v3 keys.

Migrate command then builds a v3 backend state based on the existing
v2 keys and the output of the transformer function.
2016-05-19 12:17:15 -07:00
abb4cd5646 etcdserver: update LICENSE header 2016-05-12 20:49:40 -07:00
9c103dd0de *: cancel required leader streams when member lost its leader 2016-05-12 19:42:21 -07:00
434f2c356d etcdserver: do not serve requests before finishing the first internal proposal 2016-04-27 15:46:31 -07:00
b7ac758969 *: rename storage package to mvcc 2016-04-25 15:25:51 -07:00
41382bc3f0 etcdserver: split out v2 raft apply interface 2016-04-20 10:29:22 -07:00
641a1a66e1 *: fix govet -shadow in go tip 2016-04-15 07:39:52 -07:00
b13b77f362 membership: update attr in membership pkg 2016-04-07 21:25:32 -07:00
bf2289ae00 etcdserver: move membership related code to membership pkg 2016-04-07 14:21:37 -07:00
b0cc0e443c *: clean up if, bool comparison 2016-04-02 12:55:11 -07:00
70a9391378 *: enable v3 by default 2016-03-23 17:01:36 -07:00
bd832e5b0a *: migrate Godeps to vendor/ 2016-03-22 17:10:28 -07:00
dcaf5ef586 move store recorder to 'mock/mockstore' 2016-03-15 15:41:07 -07:00
a524d5bdb7 etcdserver: fix race in TestTriggerSnap
Fixes #4584
2016-02-21 22:03:35 -08:00
82778ed478 Add refresh parameter to allow TTL refreshes without firing watch/wait responses 2016-02-08 10:37:37 -07:00
082a6c304e etcdserver/test: use recorderstream in TestApplyRepeat
was racing when waiting for the node commit

fixes #4333
2016-01-28 17:19:06 -08:00
64596f0c49 etcdserver/test: synchronously wait on TestApplySnapshotAndCommittedEntries
Replaces the RecorderBuffered with a RecorderStream so Wait will block
waiting for updates to the etcdserver store.

Fixes #4296
2016-01-26 21:03:03 -08:00
bd02d668c8 etcdserver: don't try to apply empty message list
If all messages have been applied, don't apply an empty messages list;
otherwise appliedi will update to 0 and etcd will panic.

Fixes #4278
2016-01-26 11:56:37 -08:00
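
The guard described in bd02d668c8 can be illustrated with a small sketch. The `entry` type and `nextApplied` function are hypothetical and only model the idea: when there is nothing left to apply, the applied index must stay where it is instead of being recomputed from an empty list.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a raft log entry.
type entry struct {
	Index uint64
}

// nextApplied returns the applied index after applying ents.
// With an empty list it returns appliedi unchanged rather than
// deriving a new index (which would otherwise fall back to 0).
func nextApplied(appliedi uint64, ents []entry) uint64 {
	if len(ents) == 0 {
		return appliedi // nothing to apply; keep the current index
	}
	return ents[len(ents)-1].Index
}

func main() {
	fmt.Println(nextApplied(7, nil))
	fmt.Println(nextApplied(7, []entry{{Index: 8}, {Index: 9}}))
}
```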
f5753f2f51 *: support lease Attach
Now we can attach keys to leases. And revoking the lease removes all
the attached keys of that lease.
2016-01-09 11:01:58 -08:00
1714290f4e storage: support recovering from backend
We want the KV to support recovering from the backend to avoid an
additional pointer swap. Otherwise we would have to coordinate between
etcdserver and the API layer, since the API layer might hold the
kv pointer and use a closed kv.
2016-01-06 21:16:55 -08:00
5dd3f91903 *: make backend outside kv
KV and lease will share the same backend. Thus we need to make
backend outside KV.
2016-01-05 19:55:29 -08:00
838328b057 etcdserver: fix racy WaitSchedule() tests to wait for recorder actions
Fixes #4119
2016-01-05 09:39:18 -08:00
384cc76299 pkg/testutil: make Recorder an interface
Provides two implementations of Recorder-- one that is non-blocking
like the original version and one that provides a blocking channel
to avoid busy waiting or racing in tests when no other synchronization
is available.
2016-01-05 09:39:18 -08:00
e1bf726bc1 *: split out etcdserver's test mockup objects to live in interfaces' packages 2016-01-05 09:39:13 -08:00
4cd86ae1ef etcdserver: serialize snapshot merger with applier
Avoids inconsistent snapshotting by only attempting to
create a snapshot after an apply completes.

Fixes #4061
2015-12-29 18:38:39 -08:00
d7ad721ede etcdserver: stop if removed along with multiple conf changes
shouldstop would get clobbered when several conf changes are in an apply
2015-12-23 16:29:21 -08:00
23bd60ccce *: rewrite snapshot sending 2015-12-08 18:21:21 -08:00
0708a5e50d etcdserver: refactor a for loop in recvSnap test 2015-12-02 15:41:03 -08:00
3ec3ffbef0 etcdserver: get rid of unreliable WaitSchedule
In this case, we know we are waiting for an action to happen on
storage, so we can busy-wait instead of calling waitSchedule.

The test previously failed on CI with no observed actions.
2015-12-02 13:18:11 -08:00
7d757bbc8a etcdserver: extend wait timeout in TestPublishRetry
It fixes the failure in semaphore CI:
```
--- FAIL: TestPublishRetry (0.00s)
		server_test.go:1108: len(action) = 1, want >= 2
```
2015-10-28 12:07:00 -07:00
cacc0d6432 etcdserver: restore KV snapshot when receiving snapshot
When a slow follower receives the snapshot sent from the leader, it
should rename the snapshot file to the default KV file path, and
restore KV snapshot.

Have tested it manually and it works pretty well.
2015-10-23 08:43:26 -07:00
ab5df57ecf etcdserver: fix raft state machine may block
When the snapshot store requests a raft snapshot from the etcdserver apply loop,
it may block on the channel for some time, or wait some time for KV to
snapshot. This is unexpected because the raft state machine should never block.

Even worse, this block may lead to deadlock:
1. raft state machine waits to get a snapshot from raft memory storage
2. raft memory storage waits for the snapshot store to get a snapshot
3. snapshot store requests a raft snapshot from the apply loop
4. apply loop is applying entries, and waits for the raftNode loop to finish
sending messages
5. raftNode loop waits for the peer loop in Transport to send out messages
6. peer loop in Transport waits for the raft state machine to process messages

Fix it by changing getSnap to create the snapshot asynchronously.
2015-10-20 09:19:34 -07:00
1f21ccf166 rafthttp: support sending v3 snapshot message
Use snapshotSender to send v3 snapshot message. It puts raft snapshot
message and v3 snapshot into request body, then sends it to the target peer.
When it receives http.StatusNoContent, it knows the message has been
received and processed successfully.

As the receiver, snapHandler saves the v3 snapshot, processes the raft snapshot
message, and then responds with http.StatusNoContent.
2015-10-13 23:11:28 -07:00
207c92b627 rafthttp: build transport inside pkg instead of passed-in
rafthttp has different requirements for connections created by the
transport for different usage, and this is hard to achieve when giving
one http.RoundTripper. Pass into pkg the data needed to build transport
now, and let rafthttp build its own transports.
2015-10-11 21:42:37 -07:00
233e717e2f rafthttp: expose struct to set configuration
The transport constructor takes too many arguments, which makes it hard to
read. Set the fields in the transport struct directly instead.
2015-10-11 09:02:16 -07:00
f74ff9b867 Merge pull request #3644 from mitake/test-race
etcdserver, test: don't access testing.T in time.AfterFunc()'s own go…
2015-10-07 08:34:58 -07:00
68dd3ee621 etcdserver, test: don't access testing.T in time.AfterFunc()'s own goroutine
time.AfterFunc() creates its own goroutine and calls the callback
function in that goroutine. This can cause a data race like the one
fixed in commit de1a16e0f1. This commit also fixes potential data
races in the tests in etcdserver/server_test.go.
2015-10-06 11:37:08 +09:00
bfe9502f4f etcdserver: support to create raft snapshot at apply loop
and snapStore could trigger it to create the latest raft snapshot.
2015-10-02 13:17:56 -07:00