Compare commits
28 Commits
Author | SHA1 | Date | |
---|---|---|---|
99dcc8c322 | |||
3d2523e7e0 | |||
25e69d9659 | |||
707174b56a | |||
ce92cc3dc5 | |||
5bfbf3a48c | |||
e04a188358 | |||
a51fda3e5e | |||
ca44801650 | |||
2387ef3f21 | |||
d5bfca9465 | |||
7cb126967c | |||
444e017c05 | |||
356675b70f | |||
d7768635fd | |||
37796ed84c | |||
f007cf321d | |||
ca29691543 | |||
4bebb538eb | |||
c27db1ec5e | |||
a5fc1d214d | |||
1df0b941d7 | |||
3a71eb9d72 | |||
001cceb1cd | |||
98ff4af7f2 | |||
db4c5e0eaa | |||
b3c5ed60bd | |||
22c944d8ef |
@ -1,3 +1,12 @@
|
||||
v0.4.2
|
||||
* Improvements to the clustering documents
|
||||
* Set content-type properly on errors (#469)
|
||||
* Standbys re-join if they should be part of the cluster (#810, #815, #818)
|
||||
|
||||
v0.4.1
|
||||
* Re-introduce DELETE on the machines endpoint
|
||||
* Document the machines endpoint
|
||||
|
||||
v0.4.0
|
||||
* Introduced standby mode
|
||||
* Added HEAD requests
|
||||
|
@ -53,3 +53,9 @@ The Discovery API submits the `-peer-addr` of each etcd instance to the configur
|
||||
The discovery API will automatically clean up the address of a stale peer that is no longer part of the cluster. The TTL for this process is a week, which should be long enough to handle any extremely long outage you may encounter. There is no harm in having stale peers in the list until they are cleaned up, since an etcd instance only needs to connect to one valid peer in the cluster to join.
|
||||
|
||||
[discovery-design]: https://github.com/coreos/etcd/blob/master/Documentation/design/cluster-finding.md
|
||||
|
||||
## Lifetime of a Discovery URL
|
||||
|
||||
A discovery URL identifies a single etcd cluster. Do not re-use discovery URLs for new clusters.
|
||||
|
||||
When a machine starts with a new discovery URL the discovery URL will be activated and record the machine's metadata. If you destroy the whole cluster and attempt to bring the cluster back up with the same discovery URL it will fail. This is intentional because all of the registered machines are gone including their logs so there is nothing to recover the killed cluster.
|
||||
|
@ -107,7 +107,7 @@ curl -L http://127.0.0.1:4001/v2/keys/foo -XPUT -d value=bar
|
||||
|
||||
If one machine disconnects from the cluster, it could rejoin the cluster automatically when the communication is recovered.
|
||||
|
||||
If one machine is killed, it could rejoin the cluster when started with old name. If the peer address is changed, etcd will treat the new peer address as the refreshed one, which benefits instance migration, or virtual machine boot with different IP.
|
||||
If one machine is killed, it could rejoin the cluster when started with old name. If the peer address is changed, etcd will treat the new peer address as the refreshed one, which benefits instance migration, or virtual machine boot with different IP. The peer-address-changing functionality is only supported when the majority of the cluster is alive, because this behavior needs the consensus of the etcd cluster.
|
||||
|
||||
**Note:** For now, it is user responsibility to ensure that the machine doesn't join the cluster that has the member with the same name. Or unexpected error will happen. It would be improved sooner or later.
|
||||
|
||||
@ -167,15 +167,3 @@ Etcd can also do internal server-to-server communication using SSL client certs.
|
||||
To do this just change the `-*-file` flags to `-peer-*-file`.
|
||||
|
||||
If you are using SSL for server-to-server communication, you must use it on all instances of etcd.
|
||||
|
||||
|
||||
### What size cluster should I use?
|
||||
|
||||
Every command the client sends to the master is broadcast to all of the followers.
|
||||
The command is not committed until the majority of the cluster peers receive that command.
|
||||
|
||||
Because of this majority voting property, the ideal cluster should be kept small to keep speed up and be made up of an odd number of peers.
|
||||
|
||||
Odd numbers are good because if you have 8 peers the majority will be 5 and if you have 9 peers the majority will still be 5.
|
||||
The result is that an 8 peer cluster can tolerate 3 peer failures and a 9 peer cluster can tolerate 4 machine failures.
|
||||
And in the best case when all 9 peers are responding the cluster will perform at the speed of the fastest 5 machines.
|
||||
|
@ -20,6 +20,7 @@
|
||||
- [transitorykris/etcd-py](https://github.com/transitorykris/etcd-py)
|
||||
- [jplana/python-etcd](https://github.com/jplana/python-etcd) - Supports v2
|
||||
- [russellhaering/txetcd](https://github.com/russellhaering/txetcd) - a Twisted Python library
|
||||
- [cholcombe973/autodock](https://github.com/cholcombe973/autodock) - A docker deployment automation tool
|
||||
|
||||
**Node libraries**
|
||||
|
||||
|
@ -1,30 +1,38 @@
|
||||
# Optimal etcd Cluster Size
|
||||
|
||||
etcd's Raft consensus algorithm is most efficient in small clusters between 3 and 9 peers. Let's briefly explore how etcd works internally to understand why.
|
||||
|
||||
## Writing to etcd
|
||||
|
||||
Writes to an etcd peer are always redirected to the leader of the cluster and distributed to all of the peers immediately. A write is only considered successful when a majority of the peers acknowledge the write.
|
||||
|
||||
For example, in a 5 node cluster, a write operation is only as fast as the 3rd fastest machine. This is the main reason for keeping your etcd cluster below 9 nodes. In practice, you only need to worry about write performance in high latency environments such as a cluster spanning multiple data centers.
|
||||
|
||||
## Leader Election
|
||||
|
||||
The leader election process is similar to writing a key — a majority of the cluster must acknowledge the new leader before cluster operations can continue. The longer each node takes to elect a new leader means you have to wait longer before you can write to the cluster again. In low latency environments this process takes milliseconds.
|
||||
|
||||
## Odd Cluster Size
|
||||
|
||||
The other important cluster optimization is to always have an odd cluster size. Adding an odd node to the cluster doesn't change the size of the majority and therefore doesn't increase the total latency of the majority as described above. But you do gain a higher tolerance for peer failure by adding the extra machine. You can see this in practice when comparing two even and odd sized clusters:
|
||||
|
||||
| Cluster Size | Majority | Failure Tolerance |
|
||||
|--------------|------------|-------------------|
|
||||
| 8 machines | 5 machines | 3 machines |
|
||||
| 9 machines | 5 machines | **4 machines** |
|
||||
|
||||
As you can see, adding another node to bring the cluster up to an odd size is always worth it. During a network partition, an odd cluster size also guarantees that there will almost always be a majority of the cluster that can continue to operate and be the source of truth when the partition ends.
|
||||
etcd's Raft consensus algorithm is most efficient in small clusters between 3 and 9 peers. For clusters larger than 9, etcd will select a subset of instances to participate in the algorithm in order to keep it efficient. The end of this document briefly explores how etcd works internally and why these choices have been made.
|
||||
|
||||
## Cluster Management
|
||||
|
||||
Currently, each CoreOS machine is an etcd peer — if you have 30 CoreOS machines, you have 30 etcd peers and end up with a cluster size that is way too large. If desired, you may manually stop some of these etcd instances to increase cluster performance.
|
||||
You can manage the active cluster size through the [cluster config API](https://github.com/coreos/etcd/blob/master/Documentation/api.md#cluster-config). `activeSize` represents the etcd peers allowed to actively participate in the consensus algorithm.
|
||||
|
||||
Functionality is being developed to expose two different types of followers: active and benched followers. Active followers will influence operations within the cluster. Benched followers will not participate, but will transparently proxy etcd traffic to an active follower. This allows every CoreOS machine to expose etcd on port 4001 for ease of use. Benched followers will have the ability to transition into an active follower if needed.
|
||||
If the total number of etcd instances exceeds this number, additional peers are started as [standbys](https://github.com/coreos/etcd/blob/master/Documentation/design/standbys.md), which can be promoted to active participation if one of the existing active instances has failed or been removed.
|
||||
|
||||
## Internals of etcd
|
||||
|
||||
### Writing to etcd
|
||||
|
||||
Writes to an etcd peer are always redirected to the leader of the cluster and distributed to all of the peers immediately. A write is only considered successful when a majority of the peers acknowledge the write.
|
||||
|
||||
For example, in a cluster with 5 peers, a write operation is only as fast as the 3rd fastest machine. This is the main reason for keeping the number of active peers below 9. In practice, you only need to worry about write performance in high latency environments such as a cluster spanning multiple data centers.
|
||||
|
||||
### Leader Election
|
||||
|
||||
The leader election process is similar to writing a key — a majority of the active peers must acknowledge the new leader before cluster operations can continue. The longer each peer takes to elect a new leader means you have to wait longer before you can write to the cluster again. In low latency environments this process takes milliseconds.
|
||||
|
||||
### Odd Active Cluster Size
|
||||
|
||||
The other important cluster optimization is to always have an odd active cluster size (i.e. `activeSize`). Adding an odd node to the number of peers doesn't change the size of the majority and therefore doesn't increase the total latency of the majority as described above. But, you gain a higher tolerance for peer failure by adding the extra machine. You can see this in practice when comparing two even and odd sized clusters:
|
||||
|
||||
| Active Peers | Majority | Failure Tolerance |
|
||||
|--------------|------------|-------------------|
|
||||
| 1 peers | 1 peers | None |
|
||||
| 3 peers | 2 peers | 1 peer |
|
||||
| 4 peers | 3 peers | 2 peers |
|
||||
| 5 peers | 3 peers | **3 peers** |
|
||||
| 6 peers | 4 peers | 2 peers |
|
||||
| 7 peers | 4 peers | **3 peers** |
|
||||
| 8 peers | 5 peers | 3 peers |
|
||||
| 9 peers | 5 peers | **4 peers** |
|
||||
|
||||
As you can see, adding another peer to bring the number of active peers up to an odd size is always worth it. During a network partition, an odd number of active peers also guarantees that there will almost always be a majority of the cluster that can continue to operate and be the source of truth when the partition ends.
|
@ -1,6 +1,6 @@
|
||||
# etcd
|
||||
|
||||
README version 0.4.0
|
||||
README version 0.4.2
|
||||
|
||||
A highly-available key value store for shared configuration and service discovery.
|
||||
etcd is inspired by [Apache ZooKeeper][zookeeper] and [doozer][doozer], with a focus on being:
|
||||
|
@ -143,5 +143,6 @@ func (e Error) Write(w http.ResponseWriter) {
|
||||
status = http.StatusInternalServerError
|
||||
}
|
||||
}
|
||||
http.Error(w, e.toJsonString(), status)
|
||||
w.WriteHeader(status)
|
||||
fmt.Fprintln(w, e.toJsonString())
|
||||
}
|
||||
|
@ -232,6 +232,7 @@ func (e *Etcd) Run() {
|
||||
DataDir: e.Config.DataDir,
|
||||
}
|
||||
e.StandbyServer = server.NewStandbyServer(ssConfig, client)
|
||||
e.StandbyServer.SetRaftServer(raftServer)
|
||||
|
||||
// Generating config could be slow.
|
||||
// Put it here to make listen happen immediately after peer-server starting.
|
||||
@ -347,6 +348,7 @@ func (e *Etcd) runServer() {
|
||||
raftServer.SetElectionTimeout(electionTimeout)
|
||||
raftServer.SetHeartbeatInterval(heartbeatInterval)
|
||||
e.PeerServer.SetRaftServer(raftServer, e.Config.Snapshot)
|
||||
e.StandbyServer.SetRaftServer(raftServer)
|
||||
|
||||
e.PeerServer.SetJoinIndex(e.StandbyServer.JoinIndex())
|
||||
e.setMode(PeerMode)
|
||||
|
Binary file not shown.
@ -214,6 +214,7 @@ func (s *PeerServer) FindCluster(discoverURL string, peers []string) (toStart bo
|
||||
// TODO(yichengq): Think about the action that should be done
|
||||
// if it cannot connect any of the previous known node.
|
||||
log.Debugf("%s is restarting the cluster %v", name, possiblePeers)
|
||||
s.SetJoinIndex(s.raftServer.CommitIndex())
|
||||
toStart = true
|
||||
return
|
||||
}
|
||||
|
@ -1,3 +1,3 @@
|
||||
package server
|
||||
|
||||
const ReleaseVersion = "0.4.1"
|
||||
const ReleaseVersion = "0.4.2"
|
||||
|
@ -36,8 +36,9 @@ type standbyInfo struct {
|
||||
}
|
||||
|
||||
type StandbyServer struct {
|
||||
Config StandbyServerConfig
|
||||
client *Client
|
||||
Config StandbyServerConfig
|
||||
client *Client
|
||||
raftServer raft.Server
|
||||
|
||||
standbyInfo
|
||||
joinIndex uint64
|
||||
@ -62,6 +63,10 @@ func NewStandbyServer(config StandbyServerConfig, client *Client) *StandbyServer
|
||||
return s
|
||||
}
|
||||
|
||||
func (s *StandbyServer) SetRaftServer(raftServer raft.Server) {
|
||||
s.raftServer = raftServer
|
||||
}
|
||||
|
||||
func (s *StandbyServer) Start() {
|
||||
s.Lock()
|
||||
defer s.Unlock()
|
||||
@ -235,6 +240,13 @@ func (s *StandbyServer) syncCluster(peerURLs []string) error {
|
||||
}
|
||||
|
||||
func (s *StandbyServer) join(peer string) error {
|
||||
for _, url := range s.ClusterURLs() {
|
||||
if s.Config.PeerURL == url {
|
||||
s.joinIndex = s.raftServer.CommitIndex()
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
// Our version must match the leaders version
|
||||
version, err := s.client.GetVersion(peer)
|
||||
if err != nil {
|
||||
|
@ -24,12 +24,15 @@ func TestV2GetKey(t *testing.T) {
|
||||
v.Set("value", "XXX")
|
||||
fullURL := fmt.Sprintf("%s%s", s.URL(), "/v2/keys/foo/bar")
|
||||
resp, _ := tests.Get(fullURL)
|
||||
assert.Equal(t, resp.Header.Get("Content-Type"), "application/json")
|
||||
assert.Equal(t, resp.StatusCode, http.StatusNotFound)
|
||||
|
||||
resp, _ = tests.PutForm(fullURL, v)
|
||||
assert.Equal(t, resp.Header.Get("Content-Type"), "application/json")
|
||||
tests.ReadBody(resp)
|
||||
|
||||
resp, _ = tests.Get(fullURL)
|
||||
assert.Equal(t, resp.Header.Get("Content-Type"), "application/json")
|
||||
assert.Equal(t, resp.StatusCode, http.StatusOK)
|
||||
body := tests.ReadBodyJSON(resp)
|
||||
assert.Equal(t, body["action"], "get", "")
|
||||
|
@ -4,6 +4,7 @@ import (
|
||||
"bytes"
|
||||
"os"
|
||||
"strconv"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
@ -100,6 +101,8 @@ func TestTLSMultiNodeKillAllAndRecovery(t *testing.T) {
|
||||
t.Fatal("cannot create cluster")
|
||||
}
|
||||
|
||||
time.Sleep(time.Second)
|
||||
|
||||
c := etcd.NewClient(nil)
|
||||
|
||||
go Monitor(clusterSize, clusterSize, leaderChan, all, stop)
|
||||
@ -239,3 +242,74 @@ func TestMultiNodeKillAllAndRecoveryWithStandbys(t *testing.T) {
|
||||
assert.NoError(t, err)
|
||||
assert.Equal(t, len(result.Node.Nodes), 7)
|
||||
}
|
||||
|
||||
// Create a five nodes
|
||||
// Kill all the nodes and restart, then remove the leader
|
||||
func TestMultiNodeKillAllAndRecoveryAndRemoveLeader(t *testing.T) {
|
||||
procAttr := new(os.ProcAttr)
|
||||
procAttr.Files = []*os.File{nil, os.Stdout, os.Stderr}
|
||||
|
||||
stop := make(chan bool)
|
||||
leaderChan := make(chan string, 1)
|
||||
all := make(chan bool, 1)
|
||||
|
||||
clusterSize := 5
|
||||
argGroup, etcds, err := CreateCluster(clusterSize, procAttr, false)
|
||||
defer DestroyCluster(etcds)
|
||||
|
||||
if err != nil {
|
||||
t.Fatal("cannot create cluster")
|
||||
}
|
||||
|
||||
c := etcd.NewClient(nil)
|
||||
|
||||
go Monitor(clusterSize, clusterSize, leaderChan, all, stop)
|
||||
<-all
|
||||
<-leaderChan
|
||||
stop <- true
|
||||
|
||||
// It needs some time to sync current commits and write it to disk.
|
||||
// Or some instance may be restarted as a new peer, and we don't support
|
||||
// to connect back the old cluster that doesn't have majority alive
|
||||
// without log now.
|
||||
time.Sleep(time.Second)
|
||||
|
||||
c.SyncCluster()
|
||||
|
||||
// kill all
|
||||
DestroyCluster(etcds)
|
||||
|
||||
time.Sleep(time.Second)
|
||||
|
||||
stop = make(chan bool)
|
||||
leaderChan = make(chan string, 1)
|
||||
all = make(chan bool, 1)
|
||||
|
||||
time.Sleep(time.Second)
|
||||
|
||||
for i := 0; i < clusterSize; i++ {
|
||||
etcds[i], err = os.StartProcess(EtcdBinPath, argGroup[i], procAttr)
|
||||
}
|
||||
|
||||
go Monitor(clusterSize, 1, leaderChan, all, stop)
|
||||
|
||||
<-all
|
||||
leader := <-leaderChan
|
||||
|
||||
_, err = c.Set("foo", "bar", 0)
|
||||
if err != nil {
|
||||
t.Fatalf("Recovery error: %s", err)
|
||||
}
|
||||
|
||||
port, _ := strconv.Atoi(strings.Split(leader, ":")[2])
|
||||
num := port - 7000
|
||||
resp, _ := tests.Delete(leader+"/v2/admin/machines/node"+strconv.Itoa(num), "application/json", nil)
|
||||
if !assert.Equal(t, resp.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
|
||||
// check the old leader is in standby mode now
|
||||
time.Sleep(time.Second)
|
||||
resp, _ = tests.Get(leader + "/name")
|
||||
assert.Equal(t, resp.StatusCode, 404)
|
||||
}
|
||||
|
@ -31,7 +31,7 @@ func TestRemoveNode(t *testing.T) {
|
||||
|
||||
c.SyncCluster()
|
||||
|
||||
resp, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":4, "syncInterval":1}`))
|
||||
resp, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":4, "syncInterval":5}`))
|
||||
if !assert.Equal(t, resp.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
@ -41,11 +41,6 @@ func TestRemoveNode(t *testing.T) {
|
||||
client := &http.Client{}
|
||||
for i := 0; i < 2; i++ {
|
||||
for i := 0; i < 2; i++ {
|
||||
r, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":3}`))
|
||||
if !assert.Equal(t, r.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
|
||||
client.Do(rmReq)
|
||||
|
||||
fmt.Println("send remove to node3 and wait for its exiting")
|
||||
@ -76,12 +71,7 @@ func TestRemoveNode(t *testing.T) {
|
||||
panic(err)
|
||||
}
|
||||
|
||||
r, _ = tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":4}`))
|
||||
if !assert.Equal(t, r.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
|
||||
time.Sleep(time.Second + time.Second)
|
||||
time.Sleep(time.Second + 5*time.Second)
|
||||
|
||||
resp, err = c.Get("_etcd/machines", false, false)
|
||||
|
||||
@ -96,11 +86,6 @@ func TestRemoveNode(t *testing.T) {
|
||||
|
||||
// first kill the node, then remove it, then add it back
|
||||
for i := 0; i < 2; i++ {
|
||||
r, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":3}`))
|
||||
if !assert.Equal(t, r.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
|
||||
etcds[2].Kill()
|
||||
fmt.Println("kill node3 and wait for its exiting")
|
||||
etcds[2].Wait()
|
||||
@ -131,11 +116,6 @@ func TestRemoveNode(t *testing.T) {
|
||||
panic(err)
|
||||
}
|
||||
|
||||
r, _ = tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":4}`))
|
||||
if !assert.Equal(t, r.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
|
||||
time.Sleep(time.Second + time.Second)
|
||||
|
||||
resp, err = c.Get("_etcd/machines", false, false)
|
||||
@ -169,7 +149,8 @@ func TestRemovePausedNode(t *testing.T) {
|
||||
if !assert.Equal(t, r.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
time.Sleep(2 * time.Second)
|
||||
// Wait for standby instances to update its cluster config
|
||||
time.Sleep(6 * time.Second)
|
||||
|
||||
resp, err := c.Get("_etcd/machines", false, false)
|
||||
if err != nil {
|
||||
|
@ -89,7 +89,7 @@ func TestSnapshot(t *testing.T) {
|
||||
|
||||
index, _ = strconv.Atoi(snapshots[0].Name()[2:6])
|
||||
|
||||
if index < 1010 || index > 1025 {
|
||||
if index < 1010 || index > 1029 {
|
||||
t.Fatal("wrong name of snapshot :", snapshots[0].Name())
|
||||
}
|
||||
}
|
||||
|
@ -8,6 +8,7 @@ import (
|
||||
"time"
|
||||
|
||||
"github.com/coreos/etcd/server"
|
||||
"github.com/coreos/etcd/store"
|
||||
"github.com/coreos/etcd/tests"
|
||||
"github.com/coreos/etcd/third_party/github.com/coreos/go-etcd/etcd"
|
||||
"github.com/coreos/etcd/third_party/github.com/stretchr/testify/assert"
|
||||
@ -279,3 +280,61 @@ func TestStandbyDramaticChange(t *testing.T) {
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestStandbyJoinMiss(t *testing.T) {
|
||||
clusterSize := 2
|
||||
_, etcds, err := CreateCluster(clusterSize, &os.ProcAttr{Files: []*os.File{nil, os.Stdout, os.Stderr}}, false)
|
||||
if err != nil {
|
||||
t.Fatal("cannot create cluster")
|
||||
}
|
||||
defer DestroyCluster(etcds)
|
||||
|
||||
c := etcd.NewClient(nil)
|
||||
c.SyncCluster()
|
||||
|
||||
time.Sleep(1 * time.Second)
|
||||
|
||||
// Verify that we have two machines.
|
||||
result, err := c.Get("_etcd/machines", false, true)
|
||||
assert.NoError(t, err)
|
||||
assert.Equal(t, len(result.Node.Nodes), clusterSize)
|
||||
|
||||
resp, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"removeDelay":4, "syncInterval":4}`))
|
||||
if !assert.Equal(t, resp.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
time.Sleep(time.Second)
|
||||
|
||||
resp, _ = tests.Delete("http://localhost:7001/v2/admin/machines/node2", "application/json", nil)
|
||||
if !assert.Equal(t, resp.StatusCode, 200) {
|
||||
t.FailNow()
|
||||
}
|
||||
|
||||
// Wait for a monitor cycle before checking for removal.
|
||||
time.Sleep(server.ActiveMonitorTimeout + (1 * time.Second))
|
||||
|
||||
// Verify that we now have one peer.
|
||||
result, err = c.Get("_etcd/machines", false, true)
|
||||
assert.NoError(t, err)
|
||||
assert.Equal(t, len(result.Node.Nodes), 1)
|
||||
|
||||
// Simulate the join failure
|
||||
_, err = server.NewClient(nil).AddMachine("http://localhost:7001",
|
||||
&server.JoinCommand{
|
||||
MinVersion: store.MinVersion(),
|
||||
MaxVersion: store.MaxVersion(),
|
||||
Name: "node2",
|
||||
RaftURL: "http://127.0.0.1:7002",
|
||||
EtcdURL: "http://127.0.0.1:4002",
|
||||
})
|
||||
assert.NoError(t, err)
|
||||
|
||||
time.Sleep(6 * time.Second)
|
||||
|
||||
go tests.Delete("http://localhost:7001/v2/admin/machines/node2", "application/json", nil)
|
||||
|
||||
time.Sleep(time.Second)
|
||||
result, err = c.Get("_etcd/machines", false, true)
|
||||
assert.NoError(t, err)
|
||||
assert.Equal(t, len(result.Node.Nodes), 1)
|
||||
}
|
||||
|
Reference in New Issue
Block a user