62560f9959
fix(server): add user facing remove API
...
This was accidentally removed when we refactored the standby code. Re-add the
user-facing remove endpoint, which matches the config endpoints.
2014-05-20 20:01:10 -07:00
1e7a7b11dd
Merge pull request #799 from xiangli-cmu/deny_unknow_peer
...
hack(server): notify removed peers when they try to become candidates
2014-05-20 13:37:14 -07:00
934c28d498
fix(peer_server): set store and registry when setting raft server
...
New raft server needs new store and registry.
2014-05-20 13:12:12 -07:00
189fece683
hack(server): notify removed peers when they try to become candidates
...
A peer might be removed during a network partition. When it comes back, it
will not have received any of the log entries that would have notified
it of its removal, and it will go on to propose a vote. This disrupts the
cluster, so the cluster should tell the machine that it is no
longer a member.
The term of a denied vote is MaxUint64. The notification of the removal
is a raft event. These two modifications are quick hacks.
In reaction to this notification the machine should shut down. In this
case the shutdown just moves it towards becoming a standby server.
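The denial mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the goraft code: the types `VoteRequest` and `VoteResponse`, the functions `HandleVote` and `IsRemovalNotice`, and the member map are all hypothetical names introduced here. Only the use of MaxUint64 as the term of a denied vote comes from the commit message.

```go
package main

import (
	"fmt"
	"math"
)

// VoteRequest and VoteResponse are simplified, hypothetical stand-ins for
// raft vote messages. They are not the goraft types.
type VoteRequest struct {
	Term          uint64
	CandidateName string
}

type VoteResponse struct {
	Term        uint64
	VoteGranted bool
}

// RemovedTerm is the sentinel term that marks a denial as a removal notice.
const RemovedTerm uint64 = math.MaxUint64

// HandleVote denies votes from candidates that are not in the member set
// and marks the denial with the sentinel term.
func HandleVote(members map[string]bool, currentTerm uint64, req VoteRequest) VoteResponse {
	if !members[req.CandidateName] {
		return VoteResponse{Term: RemovedTerm, VoteGranted: false}
	}
	// Simplified assumption: grant the vote whenever the term is not stale.
	return VoteResponse{Term: currentTerm, VoteGranted: req.Term >= currentTerm}
}

// IsRemovalNotice reports whether a response tells the candidate that it
// is no longer a member of the cluster.
func IsRemovalNotice(resp VoteResponse) bool {
	return !resp.VoteGranted && resp.Term == RemovedTerm
}

func main() {
	members := map[string]bool{"node1": true, "node2": true}
	// node3 was removed while partitioned and now asks for votes.
	resp := HandleVote(members, 7, VoteRequest{Term: 8, CandidateName: "node3"})
	if IsRemovalNotice(resp) {
		fmt.Println("node3 was removed; it should shut down and fall back to standby")
	}
}
```

A sentinel term keeps the message format unchanged, which is presumably why the commit calls it a quick hack rather than a protocol change.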
2014-05-20 10:17:32 -07:00
c0027bfc78
feat(cluster_config): change field from int to float64
...
This is modified for better flexibility, especially for testing.
2014-05-12 22:42:18 -04:00
5367c1c998
chore(standby): minor changes based on comments
2014-05-09 15:38:03 -07:00
c6b1a738c3
feat(option): add cluster config option
...
It will be used when creating a brand-new cluster.
2014-05-09 15:22:11 -07:00
6d4f018887
chore(cluster_config): rename SyncClusterInterval to SyncInterval
...
for better naming
2014-05-09 13:28:21 -07:00
765cd5d8b3
refactor(find_cluster): make it simpler
2014-05-09 02:27:04 -07:00
baadf63912
feat: implement standby mode
...
Change log:
1. PeerServer
- estimate initial mode from its log through removedInLog variable
- refactor FindCluster to return the estimation
- refactor Start to call FindCluster explicitly
- move raftServer start and cluster init from FindCluster to Start
- remove stopNotify from PeerServer because it is not used anymore
2. Etcd
- refactor Run logic to fit the specification
3. ClusterConfig
- rename promoteDelay to removeDelay for better naming
- add SyncClusterInterval field to ClusterConfig
- commit command to set default cluster config when cluster is created
- store cluster config info into key space for consistency
- reload cluster config on reboot
4. add StandbyServer
5. Error
- remove unused EcodePromoteError
2014-05-09 01:56:55 -07:00
f1c13e2d9d
Merge pull request #774 from unihorn/83
...
feat(join): check cluster conditions before join
2014-05-08 14:08:38 -07:00
6c950eaf97
Merge pull request #772 from unihorn/81
...
feat(peer_server): stop service when removed
2014-05-08 14:02:09 -07:00
5c7a963cf0
chore(peer_server): adjust code to make it more clear
2014-05-08 13:20:46 -07:00
c92231c91a
Merge branch 'master' of github.com:coreos/etcd
...
Conflicts:
server/peer_server_handlers.go
2014-05-08 13:17:51 -07:00
e960a0e03c
chore(client): minor changes based on comments
...
The changes are made on error handling, comments and constant.
2014-05-08 13:15:10 -07:00
0558b546ff
fix(registry): fetch peers from store instead of cache
...
The current cache implementation may contain removed machines, so we
fetch peers from the store for correctness.
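The idea can be illustrated with a small sketch, assuming a registry that keeps a lookup cache in front of a store. `Store`, `Registry`, `URL`, and `Names` are hypothetical names for this example and do not reflect the actual etcd registry API.

```go
package main

import (
	"fmt"
	"sort"
)

// Store is a hypothetical stand-in for the replicated key space,
// mapping peer name to peer URL.
type Store map[string]string

// Registry caches URL lookups. The cache may lag behind the store.
type Registry struct {
	store Store
	cache map[string]string
}

func NewRegistry(s Store) *Registry {
	return &Registry{store: s, cache: make(map[string]string)}
}

// URL resolves a peer through the cache, filling it on a miss.
func (r *Registry) URL(name string) (string, bool) {
	if u, ok := r.cache[name]; ok {
		return u, true
	}
	u, ok := r.store[name]
	if ok {
		r.cache[name] = u
	}
	return u, ok
}

// Names lists the current peers from the store, never from the cache,
// so removed machines cannot show up in the result.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.store))
	for name := range r.store {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	store := Store{"node1": "http://10.0.0.1:7001", "node2": "http://10.0.0.2:7001"}
	reg := NewRegistry(store)
	reg.URL("node2")       // node2 is now cached
	delete(store, "node2") // node2 is removed from the cluster
	fmt.Println(reg.Names())
}
```

In this sketch the stale cache entry for node2 survives the removal, but the membership listing stays correct because it only consults the store.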
2014-05-08 08:44:32 -07:00
5465201292
chore(peer_server): more explanation for asyncRemove
2014-05-07 16:31:17 -07:00
c9ce14c857
chore(peer_server): set client transporter separately
...
It also moves the hack on timeout from raft transporter to
client transporter.
2014-05-07 13:26:05 -07:00
bed20b7837
chore(peer_server): add more function description
2014-05-07 12:51:41 -07:00
206881bfec
fix(peer_server): check running status before start/stop
...
This makes peer server more robust.
2014-05-07 12:44:48 -07:00
001b1fcd46
feat(join): check cluster conditions before join
2014-05-07 11:46:21 -07:00
4e14604e5c
refactor(server): add Client struct
...
This is used to send requests to the web API.
Standby mode will do this a lot, so the struct is abstracted
first.
2014-05-07 11:46:15 -07:00
ba36a16bc5
feat(peer_server): stop service when removed
...
It doesn't modify the exit logic, but lets external code know
when removal happens so that it can decide what to do.
2014-05-07 10:00:27 -07:00
997e7d3bf4
Merge pull request #771 from unihorn/80
...
refactor(peer_server): remove standby mode in peer server
2014-05-07 09:57:02 -07:00
17e299995c
refactor(peer_server): remove standby mode in peer server
2014-05-07 09:10:09 -07:00
d78116c35b
Merge pull request #675 from unihorn/56
...
fix(peer_server): exit all server goroutines in Stop()
2014-05-07 08:09:14 -07:00
6516cf854c
chore(server): rename daemon to startRoutine
...
For better understanding.
2014-05-07 07:51:44 -07:00
e55512f60b
fix(peer_server): graceful stop for peer server run
...
The peer server is designed to be started and stopped repeatedly.
This step ensures that a stop doesn't affect the next start.
The patch covers stopping goroutines and removing the timer trigger.
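A start/stop cycle of this kind might look like the sketch below. It is an assumption-laden illustration rather than the peer server code: `Worker`, `loop`, and the field names are hypothetical. Each run gets its own close channel and wait group, so a finished run leaves nothing behind for the next one.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Worker is a hypothetical component that can be started and stopped
// repeatedly.
type Worker struct {
	mu      sync.Mutex
	running bool
	closeCh chan struct{}
	wg      *sync.WaitGroup
}

// Start launches the background goroutine. It reports false if the
// worker is already running.
func (w *Worker) Start(interval time.Duration) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.running {
		return false
	}
	w.running = true
	w.closeCh = make(chan struct{}) // fresh channel for this run
	w.wg = &sync.WaitGroup{}        // fresh wait group for this run
	w.wg.Add(1)
	go loop(w.closeCh, w.wg, interval)
	return true
}

// loop does periodic work until closeCh is closed.
func loop(closeCh <-chan struct{}, wg *sync.WaitGroup, interval time.Duration) {
	defer wg.Done()
	ticker := time.NewTicker(interval)
	defer ticker.Stop() // release the timer so it cannot fire after Stop
	for {
		select {
		case <-closeCh:
			return
		case <-ticker.C:
			// Periodic work would go here.
		}
	}
}

// Stop signals the goroutine and waits for it to exit. It reports false
// if the worker was not running.
func (w *Worker) Stop() bool {
	w.mu.Lock()
	if !w.running {
		w.mu.Unlock()
		return false
	}
	w.running = false
	close(w.closeCh)
	wg := w.wg
	w.mu.Unlock()
	wg.Wait() // wait outside the lock
	return true
}

func main() {
	w := &Worker{}
	for i := 0; i < 2; i++ {
		w.Start(10 * time.Millisecond)
		time.Sleep(25 * time.Millisecond)
		w.Stop()
	}
	fmt.Println("started and stopped twice")
}
```

Checking the running flag in both methods also makes redundant Start and Stop calls harmless.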
2014-05-07 07:43:27 -07:00
0c95e1eabb
feat(peer_server): forbid rejoining with different name
...
Otherwise it will confuse the cluster, especially the heartbeats between nodes.
2014-04-17 15:46:33 -07:00
82dee82bfd
chore: gofmt go files
2014-04-17 08:47:48 -07:00
67600603c5
chore: rename proxy mode to standby mode
...
It makes the name more reasonable.
2014-04-17 08:04:42 -07:00
65b872c8b5
Merge pull request #725 from dougm/server-lifecycle-fixes
...
fix(server): avoid race conditions in Run/Stop
2014-04-15 11:54:35 -07:00
adf4acf947
chore: gofmt go files
2014-04-15 09:42:25 -07:00
d73390a674
fix(server): avoid race conditions in Run/Stop
...
- don't close ready channel until PeerServer is listening.
avoids possible panic in Stop() if PeerServer is nil.
- avoid data race in Run() (err variable was shared between 2 goroutines)
- avoid data race in PeerServer Start/Stop (PeerServer.closeChan)
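The first and third points can be sketched as follows. This is a simplified, hypothetical `Server` type, not the etcd one: readiness is signalled only after the listener exists, and the close channel is only touched under a mutex. The function-local `err` in `Start` reflects the second point, since it is never shared between goroutines.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// Server is a hypothetical, stripped-down server used for illustration.
type Server struct {
	mu        sync.Mutex
	listener  net.Listener  // nil until Start succeeds; guarded by mu
	closeChan chan struct{} // guarded by mu
	ready     chan struct{} // closed once the listener is up
	readyOnce sync.Once
}

func NewServer() *Server {
	return &Server{ready: make(chan struct{})}
}

// Start opens the listener and only then signals readiness.
func (s *Server) Start(addr string) error {
	l, err := net.Listen("tcp", addr) // err is local to this call
	if err != nil {
		return err
	}
	s.mu.Lock()
	s.listener = l
	s.closeChan = make(chan struct{})
	s.mu.Unlock()
	s.readyOnce.Do(func() { close(s.ready) })
	return nil
}

// ReadyNotify returns a channel that is closed once the server listens.
func (s *Server) ReadyNotify() <-chan struct{} { return s.ready }

// Stop is safe to call before Start and more than once.
func (s *Server) Stop() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.listener == nil {
		return // nothing to stop, so no nil dereference
	}
	close(s.closeChan)
	s.listener.Close()
	s.listener = nil
}

func main() {
	s := NewServer()
	s.Stop() // harmless: the server was never started
	if err := s.Start("127.0.0.1:0"); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	<-s.ReadyNotify()
	s.Stop()
	fmt.Println("stopped cleanly")
}
```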
2014-04-15 09:24:54 -07:00
8bcfb2ecaf
Merge pull request #707 from unihorn/62
...
fix(peer_server): recover from outage with discovery
2014-04-14 13:58:43 -07:00
03839ca806
fix(peer_server): recover from outage with discovery
...
This patch also refactors the find-cluster process.
It is based on @xiangli-cmu's commits in issue 627.
2014-04-14 13:56:47 -07:00
0b790abd46
Merge pull request #705 from unihorn/61
...
feat: set NOCOW for log directory when in btrfs
2014-04-14 16:40:38 -04:00
4fd9e627c0
fix(peer join) fix wrong join command redirection
...
1. We use a PUT request to do a V2 join, so we should redirect a PUT request rather than a POST.
2. /admin only accepts V2Join requests, so send V2Join instead of V1Join.
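One way to keep the method across a redirect is an HTTP 307 response, which Go's standard client follows with the original method and body. The sketch below assumes that approach; the commit message does not say which status code etcd uses, and the path, `redirectToLeader`, and `joinViaFollower` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// redirectToLeader returns a handler that points the client at the same
// path on leaderURL. Status 307 asks the client to repeat the request
// with the same method and body.
func redirectToLeader(leaderURL string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, leaderURL+r.URL.Path, http.StatusTemporaryRedirect)
	}
}

// joinViaFollower sends a PUT to a follower and returns the method that
// the leader finally receives.
func joinViaFollower() (string, error) {
	seen := make(chan string, 1)
	leader := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Method
	}))
	defer leader.Close()
	follower := httptest.NewServer(redirectToLeader(leader.URL))
	defer follower.Close()

	// The path and body are illustrative only.
	req, err := http.NewRequest(http.MethodPut, follower.URL+"/admin/machines/node3", strings.NewReader("{}"))
	if err != nil {
		return "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	method, err := joinViaFollower()
	if err != nil {
		fmt.Println("join failed:", err)
		return
	}
	fmt.Println("leader received:", method)
}
```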
2014-04-13 21:33:02 -04:00
56ef6fbcae
make necessary changes
2014-04-11 17:00:14 -07:00
79a89dcb82
Revert "Revert "fix(server): only set NOCOW for log file""
...
This reverts commit 9540575690.
Conflicts:
etcd/etcd.go
2014-04-11 16:33:50 -07:00
9540575690
Revert "fix(server): only set NOCOW for log file"
...
This reverts commit 1eff547af6.
2014-04-09 14:39:16 -07:00
1eff547af6
fix(server): only set NOCOW for log file
2014-04-09 12:35:32 -07:00
a3cbf02597
fix(tests): pass all tests using latest raft
2014-03-24 17:35:45 -07:00
62b89a128a
Merge branch 'master' of https://github.com/coreos/etcd into proxy
...
Conflicts:
config/config.go
server/peer_server.go
server/transporter.go
tests/server_utils.go
2014-03-24 15:30:14 -07:00
174b9ff343
bump(github.com/goraft/raft): 6bf34b9
...
Move from coreos/raft to goraft/raft and update to latest.
2014-03-24 15:09:47 -07:00
7d4fda550d
Machine join/remove v2 API.
2014-03-18 16:25:21 -06:00
c0a59b3a27
Add minimum active size and promote delay.
2014-03-10 14:44:04 -06:00
c91688315a
Minor fixes to proxies.
2014-03-07 07:38:40 -07:00
3fff1a8dcd
Add /machines and /machines/:name endpoints.
2014-03-06 15:11:31 -07:00
fe4dee03ab
Minor fixes.
2014-03-04 09:29:44 -07:00