Commit Graph

207 Commits

Author SHA1 Message Date
003b97a60f raft: public progress struct in raft 2015-01-20 10:26:22 -08:00
2e1c36cdd9 raft: introduce MsgHeartbeatResp.
Now that heartbeats are distinct from MsgApp{,Resp}, the retries
currently performed in stepLeader's MsgAppResp section are only
performed on an actual MsgAppResp (or a new MsgProp). This means
that it may take a long time to recover from a dropped MsgAppResp
in a quiet cluster.

This commit adds a dedicated heartbeat response message. This message
does not convey the follower's current log position because the
MsgHeartbeat does not include the leaders term and index. Upon receipt
of a heartbeat response, the leader may retry the latest MsgApp if it
believes the follower to be behind.
2015-01-14 17:34:10 -05:00
9972e62d94 raft: Use <= instead of < for heartbeat ticks.
In code outside the raft package, we cannot call raft.bcastHeartbeat
directly. Instead, to control heartbeats we set heartbeatInterval to 1
and call Tick().
2015-01-14 15:27:32 -05:00
35b907ac58 raft: add lastIndex as rejectHint
Add the lastindex of the raft log as reject hint, so the leader can
bypass the greater index probing and decrease the next index directly
to last + 1.
2015-01-01 19:04:07 -08:00
fc96a9e4a7 raft: remove unnecessary funcs in raft.go 2014-12-25 17:04:33 -08:00
88767d913d raft: leader waits for the reply of previous message when follower is not in good path.
It is reasonable for the leader to wait for the reply before sending out the next
msgApp or msgSnap for the follower in bad path. Or the leader will send out useless
messages if the previous message is rejected or the previous message is a snapshot.
Especially for the snapshot case, the leader will be 100% to send out duplicate message
including the snapshot, which is a huge waste.

This commit implement a timeout based wait mechanism. The timeout for normal msgApp is a
heartbeatTimeout and the timeout for snapshot is electionTimeout(snapshot is larger). We
can implement a piggyback mechanism(application notifies the msg lost) in the future
if necessary.
2014-12-18 15:01:50 -08:00
044e35b814 raft: use newRaft 2014-12-15 11:25:35 -08:00
c586d5012c raft: log term as %d 2014-12-14 10:06:45 -08:00
2c2e032155 Merge pull request #1908 from bdarnell/error-fixes
raft: remove panic when we see a proposal with no leader.
2014-12-11 13:58:51 -08:00
b26856b603 raft: add detail to "no leader" log message 2014-12-11 15:07:32 -05:00
89cba625d6 Merge pull request #1897 from xiang90/raft
raft: get rid of the using of defer in critical path
2014-12-10 21:24:38 -08:00
e89cc25c50 Merge pull request #1901 from yichengq/260
rafthttp: batch MsgProp
2014-12-10 21:16:07 -08:00
3867c72c8a raft: support to do multiple proposals in one message 2014-12-10 20:00:59 -08:00
fa247d09cc raft: remove panic when we see a proposal with no leader.
This panic can never be reached when using raft.Node, because we only
read from propc when there is a leader. However, it is possible to see
this error when using raft the raft object directly (as in MultiNode),
and in this case it is better to simply drop the proposal (as if we had
sent it to a leader that immediately vanished).

Add an error return to MemoryStorage.Append for consistency.
2014-12-10 17:34:40 -05:00
96de9776b7 raft: get rid of allocation 2014-12-10 13:41:04 -08:00
a5efbf826d raft: drop nodes in softState 2014-12-09 11:43:52 -08:00
ba45637ba3 raft: group step funcs 2014-12-08 15:29:54 -08:00
099f4f10ea raft: one line 2014-12-08 15:28:48 -08:00
8ead428e76 raft: group getter funcs 2014-12-08 15:24:34 -08:00
f73d059d80 raft: group configuration related funcs 2014-12-08 15:23:21 -08:00
25313b1210 raft: move poll close to campaign 2014-12-08 15:21:57 -08:00
d52c66ad42 raft: removed unused func 2014-12-08 15:20:43 -08:00
62ed1de10d raft: refactoring logging 2014-12-08 15:16:02 -08:00
6cb7f2d9e9 raft: print out log when creating a newraft 2014-12-08 14:37:39 -08:00
ea4d645a83 raft: Ignore redundant addNode calls.
This avoids clobbering any state when bootstrapping entries are
applied twice.
2014-12-05 17:15:50 -05:00
149389cbfa raft: add msgHeartbeat type 2014-12-04 08:29:31 -08:00
2caf4f5f22 raft: fix log format in sendAppend 2014-12-03 16:11:44 -08:00
06a5892a18 raft: more logging 2014-12-03 14:46:24 -08:00
63ed202db6 raft: print out term in decimal format 2014-12-03 11:33:51 -08:00
23b32a6cbe Merge pull request #1716 from yichengq/225
raft: panic if loaded commit is out of range
2014-12-02 22:14:12 -08:00
38768e5396 raft: panic if loaded commit is out of range 2014-12-02 22:09:34 -08:00
b3841afcc3 raft: do not restore snapshot if local raft has longer matching history
Raft should not restore the snapshot if it has longer matching history.
Or restoring snapshot might remove the matched entries.
2014-12-02 21:34:14 -08:00
3cadaca1a3 Merge pull request #1830 from xiang90/raft_snap_log
raft: log snapshot events
2014-12-02 12:06:15 -08:00
411063e14f raft: log snapshot events 2014-12-02 11:57:10 -08:00
788d1e59a2 raft: use index in entry 2014-12-02 10:25:27 -08:00
51de095d2c raft: logging state change events and events on bad path 2014-12-02 10:08:19 -08:00
746c66b466 raft: fix start term 2014-11-26 21:21:13 -08:00
e23f9e76d1 raft: do not applysnapshot in raft 2014-11-26 10:59:13 -08:00
8de98d4903 raft: clean up 2014-11-25 16:21:50 -08:00
7e6e305c4f Merge branch 'log_interface'
Conflicts:
	raft/raft.go
2014-11-25 14:22:11 -08:00
4b43824be9 raft: not compact log if the compact index < first index of the log
It should ignore the compact operation instead of panic because the case that
the log is restored from snapshot before executing compact is reasonable.
2014-11-25 11:51:20 -08:00
8ee1bf31d6 raft: use IsEmptySnap to check the empty snapshot 2014-11-25 00:37:21 -08:00
9455119968 raft: always check leader changes in node run loop 2014-11-24 19:07:10 -08:00
9ddd8ee539 Rename Storage.HardState back to InitialState and include ConfState.
This fixes integration/migration_test.go (and highlights the fact that
we need some more raft-level testing of restoring from snapshots).
2014-11-21 17:22:20 -05:00
0d680d0e6b Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  rafthttp: fix import
  raft: should not decrease match and next when handling out of order msgAppResp
  Fix migration to allow snapshots to have the right IDs
  add snapshotted integration test
  fix test import loop
  fix import loop, add set to types, and fix comments
  etcdserver: autodetect v0.4 WALs and upgrade them to v0.5 automatically
  wal: add a bench for write entry
  rafthttp: add streaming server and client
  dep: use vendored imports in codegangsta/cli
  dep: bump golang.org/x/net/context

Conflicts:
	etcdserver/server.go
	etcdserver/server_test.go
	migrate/snapshot.go
2014-11-21 15:40:11 -05:00
063c5c77a0 raft: should not decrease match and next when handling out of order msgAppResp 2014-11-20 17:58:23 -08:00
b29240baf0 Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  scripts: build-docker tag and use ENTRYPOINT
  scripts: build-release add etcd-migrate
  create .godir
  raft: optimistically increase the next if the follower is already matched
  raft: add handleHeartbeat handleHeartbeat commits to the commit index in the message. It never decreases the commit index of the raft state machine.
  rafthttp: send takes raft message instead of bytes
  *: add rafthttp pkg into test list
  raft: include commitIndex in heartbeat
  rafthttp: move server stats in raftHandler to etcdserver
  *: etcdhttp.raftHandler -> rafthttp.RaftHandler
  etcdserver: rename sender.go -> sendhub.go
  *: etcdserver.sender -> rafthttp.Sender

Conflicts:
	raft/log.go
	raft/raft_paper_test.go
2014-11-19 17:05:16 -05:00
355ee4f393 raft: Integrate snapshots into the raft.Storage interface.
Compaction is now treated as an implementation detail of Storage
implementations; Node.Compact() and related functionality have been
removed. Ready.Snapshot is now used only for incoming snapshots.

A return value has been added to ApplyConfChange to allow applications
to track the node information that must be stored in the snapshot.

raftpb.Snapshot has been split into Snapshot and SnapshotMetadata, to
allow the full snapshot data to be read from disk only when needed.

raft.Storage has new methods Snapshot, ApplySnapshot, HardState, and
SetHardState. The Snapshot and HardState parameters have been removed
from RestartNode() and will now be loaded from Storage instead.
The only remaining difference between StartNode and RestartNode is that
the former bootstraps an initial list of Peers.
2014-11-19 16:40:26 -05:00
b50f331558 Merge pull request #1744 from xiang90/next
raft: optimistically increase the next if the follower is already matched
2014-11-19 13:21:11 -08:00
4c1fd07311 raft: optimistically increase the next if the follower is already matched
This is useful since we want to pipeline the appendEntry requests. Without
enabling optimistic increasing, the second pipelining appendEntry request
will include the entries the first one has already sent out. We decrease
the next directly to match if the leader receives a rejection for a matched
follower. This happens if one pipelining request get lost and following ones
arrives at the follower.
2014-11-18 13:41:38 -08:00