Compare commits

...

28 Commits

Author SHA1 Message Date
99dcc8c322 chore(server): bump back to 0.4.2 2014-06-02 15:25:03 -07:00
3d2523e7e0 Merge pull request #825 from unihorn/98
fix(multi_node_kill_all_and_recovery_test): ensure cluster is up
2014-06-02 15:12:05 -07:00
25e69d9659 fix(multi_node_kill_all_and_recovery_test): ensure cluster is up 2014-06-02 14:43:51 -07:00
707174b56a chore(server): bump to 0.4.2+git 2014-06-02 14:19:52 -07:00
ce92cc3dc5 feat(CHANGELOG): bump to v0.4.2 2014-06-02 14:17:38 -07:00
5bfbf3a48c Merge pull request #824 from unihorn/97
fix(remove_node_test): remove unnecessary cluster configuration
2014-06-02 14:12:08 -07:00
e04a188358 fix(remove_node_test): remove unnecessary cluster configuration
The cluster configuration operation was originally meant to make sure
the instance wouldn't be added back automatically between its removal and
the check for the number of existing peer-mode instances. But this
could cause some node to be removed before the removal command.

Use a longer sync interval instead to avoid this problem.
2014-06-02 13:30:19 -07:00
a51fda3e5e Merge pull request #822 from philips/add-notes-about-discovery
docs(cluster-discovery): add caution to use old discovery endpoint
2014-06-02 12:06:00 -07:00
ca44801650 docs(cluster-discovery): add caution to use old discovery endpoint 2014-06-02 11:34:56 -07:00
2387ef3f21 Merge pull request #819 from unihorn/97
fix(server): joinIndex is not set after recovery from full outage
2014-06-02 11:04:07 -07:00
d5bfca9465 Merge pull request #814 from unihorn/91
fix(server/v2): set correct content-type for etcdError response
2014-06-02 10:38:36 -07:00
7cb126967c fix(simple_snapshot_test): enlarge reasonable index range 2014-05-31 10:42:31 -07:00
444e017c05 fix(remove_node_test): ensure cluster config is activated 2014-05-31 10:32:03 -07:00
356675b70f fix(multi_node_kill_all_and_recovery_test): ensure cluster running 2014-05-31 10:15:03 -07:00
d7768635fd fix(server): set joinIndex when recovered 2014-05-31 10:03:39 -07:00
37796ed84c tests: add TestMultiNodeKillAllAndRecoveryAndRemoveLeader
This one breaks because it doesn't set joinIndex correctly.
2014-05-31 10:01:45 -07:00
f007cf321d Merge pull request #818 from unihorn/96
fix(standby_server): able to join the cluster containing itself
2014-05-30 18:36:58 -07:00
ca29691543 tests(standby_test): comments 2014-05-30 18:36:23 -07:00
4bebb538eb fix(standby_server): able to join the cluster containing itself
The standby server will switch to a peer server if it finds that
it is already contained in the cluster.
2014-05-30 14:03:49 -07:00
c27db1ec5e Merge pull request #816 from unihorn/95
docs(clustering): limit for peer-address changing
2014-05-30 13:45:12 -07:00
a5fc1d214d Merge pull request #817 from cholcombe973/master
Adding autodock into the libraries and tools section
2014-05-30 13:41:32 -07:00
1df0b941d7 Adding autodock into the libraries and tools section 2014-05-30 13:20:28 -07:00
3a71eb9d72 Merge pull request #808 from robszumski/update-optimal-size
fix(docs): add information about standbys
2014-05-30 12:26:07 -07:00
001cceb1cd fix(docs): update doc with standby info 2014-05-30 12:23:22 -07:00
98ff4af7f2 docs(clustering): limit for peer-address changing 2014-05-30 08:50:16 -07:00
db4c5e0eaa fix(server/v2): set correct content-type for etcdError response
"net/http".Error reset the content type, so we get rid of it and
write our own one.
2014-05-29 14:18:50 -07:00
b3c5ed60bd chore(pkg/btrfs): remove accidental swp file. 2014-05-22 09:50:40 -07:00
22c944d8ef chore(server): bump 0.4.0+git 2014-05-20 20:55:57 -07:00
17 changed files with 211 additions and 66 deletions

View File

@@ -1,3 +1,12 @@
v0.4.2
* Improvements to the clustering documents
* Set content-type properly on errors (#469)
* Standbys re-join if they should be part of the cluster (#810, #815, #818)
v0.4.1
* Re-introduce DELETE on the machines endpoint
* Document the machines endpoint
v0.4.0
* Introduced standby mode
* Added HEAD requests

View File

@@ -53,3 +53,9 @@ The Discovery API submits the `-peer-addr` of each etcd instance to the configur
The discovery API will automatically clean up the address of a stale peer that is no longer part of the cluster. The TTL for this process is a week, which should be long enough to handle any extremely long outage you may encounter. There is no harm in having stale peers in the list until they are cleaned up, since an etcd instance only needs to connect to one valid peer in the cluster to join.
[discovery-design]: https://github.com/coreos/etcd/blob/master/Documentation/design/cluster-finding.md
## Lifetime of a Discovery URL
A discovery URL identifies a single etcd cluster. Do not re-use discovery URLs for new clusters.
When a machine starts with a new discovery URL, the discovery URL is activated and records the machine's metadata. If you destroy the whole cluster and attempt to bring it back up with the same discovery URL, it will fail. This is intentional: all of the registered machines are gone, including their logs, so there is nothing left from which to recover the killed cluster.
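Since a discovery URL maps to exactly one cluster, every rebuilt cluster needs a fresh token. Below is a minimal Go sketch of fetching one; the public discovery service's `/new` endpoint and the `-discovery` flag are assumptions based on these docs, not part of this diff.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
)

func main() {
	// Ask the public discovery service for a brand-new token.
	// Reusing an old URL for a rebuilt cluster will fail by design.
	resp, err := http.Get("https://discovery.etcd.io/new")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	token, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}

	// Pass the fresh URL to every instance of the new cluster, e.g.
	//   etcd -name node1 -peer-addr 127.0.0.1:7001 -discovery <url>
	fmt.Printf("start each etcd instance with -discovery %s\n", token)
}
```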

View File

@@ -107,7 +107,7 @@ curl -L http://127.0.0.1:4001/v2/keys/foo -XPUT -d value=bar
If one machine disconnects from the cluster, it can rejoin the cluster automatically once communication is restored.
If one machine is killed, it can rejoin the cluster when started with its old name. If the peer address is changed, etcd will treat the new peer address as the refreshed one, which benefits instance migration or virtual machine boot with a different IP.
If one machine is killed, it can rejoin the cluster when started with its old name. If the peer address is changed, etcd will treat the new peer address as the refreshed one, which benefits instance migration or virtual machine boot with a different IP. The peer-address-changing functionality is only supported when the majority of the cluster is alive, because this behavior needs the consensus of the etcd cluster.
**Note:** For now, it is the user's responsibility to ensure that the machine doesn't join a cluster that already has a member with the same name; otherwise an unexpected error will occur. This will be improved in the future.
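One way to confirm that a restarted machine rejoined under its old name with a refreshed peer address is to list the members over the admin API. A minimal sketch, assuming a local peer listening on 7001 and the `/v2/admin/machines` endpoint used elsewhere in this changeset:

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
)

func main() {
	// The admin machines endpoint lists each member with its name,
	// state and peer URL, so a refreshed peer address shows up here.
	resp, err := http.Get("http://127.0.0.1:7001/v2/admin/machines")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := ioutil.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```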
@@ -167,15 +167,3 @@ Etcd can also do internal server-to-server communication using SSL client certs.
To do this just change the `-*-file` flags to `-peer-*-file`.
If you are using SSL for server-to-server communication, you must use it on all instances of etcd.
### What size cluster should I use?
Every command the client sends to the master is broadcast to all of the followers.
The command is not committed until the majority of the cluster peers receive that command.
Because of this majority voting property, the ideal cluster should be kept small to keep it fast, and should be made up of an odd number of peers.
Odd numbers are good because if you have 8 peers the majority will be 5 and if you have 9 peers the majority will still be 5.
The result is that an 8 peer cluster can tolerate 3 peer failures and a 9 peer cluster can tolerate 4 peer failures.
And in the best case when all 9 peers are responding the cluster will perform at the speed of the fastest 5 machines.

View File

@@ -20,6 +20,7 @@
- [transitorykris/etcd-py](https://github.com/transitorykris/etcd-py)
- [jplana/python-etcd](https://github.com/jplana/python-etcd) - Supports v2
- [russellhaering/txetcd](https://github.com/russellhaering/txetcd) - a Twisted Python library
- [cholcombe973/autodock](https://github.com/cholcombe973/autodock) - A docker deployment automation tool
**Node libraries**

View File

@@ -1,30 +1,38 @@
# Optimal etcd Cluster Size
etcd's Raft consensus algorithm is most efficient in small clusters between 3 and 9 peers. Let's briefly explore how etcd works internally to understand why.
## Writing to etcd
Writes to an etcd peer are always redirected to the leader of the cluster and distributed to all of the peers immediately. A write is only considered successful when a majority of the peers acknowledge the write.
For example, in a 5 node cluster, a write operation is only as fast as the 3rd fastest machine. This is the main reason for keeping your etcd cluster below 9 nodes. In practice, you only need to worry about write performance in high latency environments such as a cluster spanning multiple data centers.
## Leader Election
The leader election process is similar to writing a key — a majority of the cluster must acknowledge the new leader before cluster operations can continue. The longer it takes each node to elect a new leader, the longer you have to wait before you can write to the cluster again. In low latency environments this process takes milliseconds.
## Odd Cluster Size
The other important cluster optimization is to always have an odd cluster size. Adding an odd node to the cluster doesn't change the size of the majority and therefore doesn't increase the total latency of the majority as described above. But you do gain a higher tolerance for peer failure by adding the extra machine. You can see this in practice when comparing two even and odd sized clusters:
| Cluster Size | Majority | Failure Tolerance |
|--------------|------------|-------------------|
| 8 machines | 5 machines | 3 machines |
| 9 machines | 5 machines | **4 machines** |
As you can see, adding another node to bring the cluster up to an odd size is always worth it. During a network partition, an odd cluster size also guarantees that there will almost always be a majority of the cluster that can continue to operate and be the source of truth when the partition ends.
etcd's Raft consensus algorithm is most efficient in small clusters between 3 and 9 peers. For clusters larger than 9, etcd will select a subset of instances to participate in the algorithm in order to keep it efficient. The end of this document briefly explores how etcd works internally and why these choices have been made.
## Cluster Management
Currently, each CoreOS machine is an etcd peer — if you have 30 CoreOS machines, you have 30 etcd peers and end up with a cluster size that is way too large. If desired, you may manually stop some of these etcd instances to increase cluster performance.
You can manage the active cluster size through the [cluster config API](https://github.com/coreos/etcd/blob/master/Documentation/api.md#cluster-config). `activeSize` represents the etcd peers allowed to actively participate in the consensus algorithm.
Functionality is being developed to expose two different types of followers: active and benched followers. Active followers will influence operations within the cluster. Benched followers will not participate, but will transparently proxy etcd traffic to an active follower. This allows every CoreOS machine to expose etcd on port 4001 for ease of use. Benched followers will have the ability to transition into an active follower if needed.
If the total number of etcd instances exceeds this number, additional peers are started as [standbys](https://github.com/coreos/etcd/blob/master/Documentation/design/standbys.md), which can be promoted to active participation if one of the existing active instances has failed or been removed.
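The functional tests later in this compare drive the same knob over HTTP. A minimal Go sketch of raising `activeSize` through the cluster config endpoint follows; the address and values mirror the tests and are purely illustrative.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// Allow five peers to participate in consensus; instances beyond that
	// are started as standbys. syncInterval (seconds) controls how often
	// standbys resync the cluster configuration.
	cfg := bytes.NewBufferString(`{"activeSize":5, "syncInterval":5}`)

	req, err := http.NewRequest("PUT", "http://127.0.0.1:7001/v2/admin/config", cfg)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("cluster config updated:", resp.Status)
}
```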
## Internals of etcd
### Writing to etcd
Writes to an etcd peer are always redirected to the leader of the cluster and distributed to all of the peers immediately. A write is only considered successful when a majority of the peers acknowledge the write.
For example, in a cluster with 5 peers, a write operation is only as fast as the 3rd fastest machine. This is the main reason for keeping the number of active peers below 9. In practice, you only need to worry about write performance in high latency environments such as a cluster spanning multiple data centers.
### Leader Election
The leader election process is similar to writing a key — a majority of the active peers must acknowledge the new leader before cluster operations can continue. The longer it takes each peer to elect a new leader, the longer you have to wait before you can write to the cluster again. In low latency environments this process takes milliseconds.
### Odd Active Cluster Size
The other important cluster optimization is to always have an odd active cluster size (i.e. `activeSize`). Adding an odd node to the number of peers doesn't change the size of the majority and therefore doesn't increase the total latency of the majority as described above. But, you gain a higher tolerance for peer failure by adding the extra machine. You can see this in practice when comparing two even and odd sized clusters:
| Active Peers | Majority | Failure Tolerance |
|--------------|------------|-------------------|
| 1 peer | 1 peer | None |
| 3 peers | 2 peers | 1 peer |
| 4 peers | 3 peers | 1 peer |
| 5 peers | 3 peers | **2 peers** |
| 6 peers | 4 peers | 2 peers |
| 7 peers | 4 peers | **3 peers** |
| 8 peers | 5 peers | 3 peers |
| 9 peers | 5 peers | **4 peers** |
As you can see, adding another peer to bring the number of active peers up to an odd size is always worth it. During a network partition, an odd number of active peers also guarantees that there will almost always be a majority of the cluster that can continue to operate and be the source of truth when the partition ends.
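The table follows from quorum arithmetic: the majority of n active peers is ⌊n/2⌋+1, and failure tolerance is n minus that majority. A short sketch that reproduces the table, as a sanity check:

```go
package main

import "fmt"

func main() {
	fmt.Println("peers  majority  tolerance")
	for n := 1; n <= 9; n++ {
		majority := n/2 + 1       // smallest group that can commit a write
		tolerance := n - majority // peers that can fail while a quorum survives
		fmt.Printf("%5d  %8d  %9d\n", n, majority, tolerance)
	}
}
```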

View File

@@ -1,6 +1,6 @@
# etcd
README version 0.4.0
README version 0.4.2
A highly-available key value store for shared configuration and service discovery.
etcd is inspired by [Apache ZooKeeper][zookeeper] and [doozer][doozer], with a focus on being:

View File

@@ -143,5 +143,6 @@ func (e Error) Write(w http.ResponseWriter) {
status = http.StatusInternalServerError
}
}
http.Error(w, e.toJsonString(), status)
w.WriteHeader(status)
fmt.Fprintln(w, e.toJsonString())
}
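For context on why `http.Error` is dropped here: it unconditionally sets `Content-Type: text/plain; charset=utf-8` before writing the status, which clobbers the `application/json` header set earlier in the handler path (the new test further down checks exactly this). A minimal sketch of the pattern, with a hypothetical handler and error body:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func errorJSON(w http.ResponseWriter, body string, status int) {
	// Set the JSON content type first, then write the status ourselves.
	// http.Error(w, body, status) would reset Content-Type to text/plain.
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	fmt.Fprintln(w, body)
}

func main() {
	http.HandleFunc("/v2/keys/missing", func(w http.ResponseWriter, r *http.Request) {
		errorJSON(w, `{"errorCode":100,"message":"Key not found"}`, http.StatusNotFound)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```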

View File

@@ -232,6 +232,7 @@ func (e *Etcd) Run() {
DataDir: e.Config.DataDir,
}
e.StandbyServer = server.NewStandbyServer(ssConfig, client)
e.StandbyServer.SetRaftServer(raftServer)
// Generating config could be slow.
// Put it here to make listen happen immediately after peer-server starting.
@@ -347,6 +348,7 @@ func (e *Etcd) runServer() {
raftServer.SetElectionTimeout(electionTimeout)
raftServer.SetHeartbeatInterval(heartbeatInterval)
e.PeerServer.SetRaftServer(raftServer, e.Config.Snapshot)
e.StandbyServer.SetRaftServer(raftServer)
e.PeerServer.SetJoinIndex(e.StandbyServer.JoinIndex())
e.setMode(PeerMode)

Binary file not shown.

View File

@@ -214,6 +214,7 @@ func (s *PeerServer) FindCluster(discoverURL string, peers []string) (toStart bo
// TODO(yichengq): Think about the action that should be done
// if it cannot connect any of the previous known node.
log.Debugf("%s is restarting the cluster %v", name, possiblePeers)
s.SetJoinIndex(s.raftServer.CommitIndex())
toStart = true
return
}

View File

@@ -1,3 +1,3 @@
package server
const ReleaseVersion = "0.4.1"
const ReleaseVersion = "0.4.2"

View File

@@ -36,8 +36,9 @@ type standbyInfo struct {
}
type StandbyServer struct {
Config StandbyServerConfig
client *Client
Config StandbyServerConfig
client *Client
raftServer raft.Server
standbyInfo
joinIndex uint64
@@ -62,6 +63,10 @@ func NewStandbyServer(config StandbyServerConfig, client *Client) *StandbyServer
return s
}
func (s *StandbyServer) SetRaftServer(raftServer raft.Server) {
s.raftServer = raftServer
}
func (s *StandbyServer) Start() {
s.Lock()
defer s.Unlock()
@@ -235,6 +240,13 @@ func (s *StandbyServer) syncCluster(peerURLs []string) error {
}
func (s *StandbyServer) join(peer string) error {
for _, url := range s.ClusterURLs() {
if s.Config.PeerURL == url {
s.joinIndex = s.raftServer.CommitIndex()
return nil
}
}
// Our version must match the leaders version
version, err := s.client.GetVersion(peer)
if err != nil {

View File

@@ -24,12 +24,15 @@ func TestV2GetKey(t *testing.T) {
v.Set("value", "XXX")
fullURL := fmt.Sprintf("%s%s", s.URL(), "/v2/keys/foo/bar")
resp, _ := tests.Get(fullURL)
assert.Equal(t, resp.Header.Get("Content-Type"), "application/json")
assert.Equal(t, resp.StatusCode, http.StatusNotFound)
resp, _ = tests.PutForm(fullURL, v)
assert.Equal(t, resp.Header.Get("Content-Type"), "application/json")
tests.ReadBody(resp)
resp, _ = tests.Get(fullURL)
assert.Equal(t, resp.Header.Get("Content-Type"), "application/json")
assert.Equal(t, resp.StatusCode, http.StatusOK)
body := tests.ReadBodyJSON(resp)
assert.Equal(t, body["action"], "get", "")

View File

@@ -4,6 +4,7 @@ import (
"bytes"
"os"
"strconv"
"strings"
"testing"
"time"
@@ -100,6 +101,8 @@ func TestTLSMultiNodeKillAllAndRecovery(t *testing.T) {
t.Fatal("cannot create cluster")
}
time.Sleep(time.Second)
c := etcd.NewClient(nil)
go Monitor(clusterSize, clusterSize, leaderChan, all, stop)
@@ -239,3 +242,74 @@ func TestMultiNodeKillAllAndRecoveryWithStandbys(t *testing.T) {
assert.NoError(t, err)
assert.Equal(t, len(result.Node.Nodes), 7)
}
// Create a five-node cluster
// Kill all the nodes and restart, then remove the leader
func TestMultiNodeKillAllAndRecoveryAndRemoveLeader(t *testing.T) {
procAttr := new(os.ProcAttr)
procAttr.Files = []*os.File{nil, os.Stdout, os.Stderr}
stop := make(chan bool)
leaderChan := make(chan string, 1)
all := make(chan bool, 1)
clusterSize := 5
argGroup, etcds, err := CreateCluster(clusterSize, procAttr, false)
defer DestroyCluster(etcds)
if err != nil {
t.Fatal("cannot create cluster")
}
c := etcd.NewClient(nil)
go Monitor(clusterSize, clusterSize, leaderChan, all, stop)
<-all
<-leaderChan
stop <- true
// It needs some time to sync current commits and write them to disk.
// Otherwise some instance may be restarted as a new peer, and we don't yet
// support connecting back to an old cluster that doesn't have a majority
// alive without its log.
time.Sleep(time.Second)
c.SyncCluster()
// kill all
DestroyCluster(etcds)
time.Sleep(time.Second)
stop = make(chan bool)
leaderChan = make(chan string, 1)
all = make(chan bool, 1)
time.Sleep(time.Second)
for i := 0; i < clusterSize; i++ {
etcds[i], err = os.StartProcess(EtcdBinPath, argGroup[i], procAttr)
}
go Monitor(clusterSize, 1, leaderChan, all, stop)
<-all
leader := <-leaderChan
_, err = c.Set("foo", "bar", 0)
if err != nil {
t.Fatalf("Recovery error: %s", err)
}
port, _ := strconv.Atoi(strings.Split(leader, ":")[2])
num := port - 7000
resp, _ := tests.Delete(leader+"/v2/admin/machines/node"+strconv.Itoa(num), "application/json", nil)
if !assert.Equal(t, resp.StatusCode, 200) {
t.FailNow()
}
// check the old leader is in standby mode now
time.Sleep(time.Second)
resp, _ = tests.Get(leader + "/name")
assert.Equal(t, resp.StatusCode, 404)
}

View File

@@ -31,7 +31,7 @@ func TestRemoveNode(t *testing.T) {
c.SyncCluster()
resp, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":4, "syncInterval":1}`))
resp, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":4, "syncInterval":5}`))
if !assert.Equal(t, resp.StatusCode, 200) {
t.FailNow()
}
@@ -41,11 +41,6 @@ func TestRemoveNode(t *testing.T) {
client := &http.Client{}
for i := 0; i < 2; i++ {
for i := 0; i < 2; i++ {
r, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":3}`))
if !assert.Equal(t, r.StatusCode, 200) {
t.FailNow()
}
client.Do(rmReq)
fmt.Println("send remove to node3 and wait for its exiting")
@@ -76,12 +71,7 @@ func TestRemoveNode(t *testing.T) {
panic(err)
}
r, _ = tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":4}`))
if !assert.Equal(t, r.StatusCode, 200) {
t.FailNow()
}
time.Sleep(time.Second + time.Second)
time.Sleep(time.Second + 5*time.Second)
resp, err = c.Get("_etcd/machines", false, false)
@@ -96,11 +86,6 @@ func TestRemoveNode(t *testing.T) {
// first kill the node, then remove it, then add it back
for i := 0; i < 2; i++ {
r, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":3}`))
if !assert.Equal(t, r.StatusCode, 200) {
t.FailNow()
}
etcds[2].Kill()
fmt.Println("kill node3 and wait for its exiting")
etcds[2].Wait()
@@ -131,11 +116,6 @@ func TestRemoveNode(t *testing.T) {
panic(err)
}
r, _ = tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"activeSize":4}`))
if !assert.Equal(t, r.StatusCode, 200) {
t.FailNow()
}
time.Sleep(time.Second + time.Second)
resp, err = c.Get("_etcd/machines", false, false)
@@ -169,7 +149,8 @@ func TestRemovePausedNode(t *testing.T) {
if !assert.Equal(t, r.StatusCode, 200) {
t.FailNow()
}
time.Sleep(2 * time.Second)
// Wait for standby instances to update its cluster config
time.Sleep(6 * time.Second)
resp, err := c.Get("_etcd/machines", false, false)
if err != nil {

View File

@@ -89,7 +89,7 @@ func TestSnapshot(t *testing.T) {
index, _ = strconv.Atoi(snapshots[0].Name()[2:6])
if index < 1010 || index > 1025 {
if index < 1010 || index > 1029 {
t.Fatal("wrong name of snapshot :", snapshots[0].Name())
}
}

View File

@@ -8,6 +8,7 @@ import (
"time"
"github.com/coreos/etcd/server"
"github.com/coreos/etcd/store"
"github.com/coreos/etcd/tests"
"github.com/coreos/etcd/third_party/github.com/coreos/go-etcd/etcd"
"github.com/coreos/etcd/third_party/github.com/stretchr/testify/assert"
@@ -279,3 +280,61 @@ func TestStandbyDramaticChange(t *testing.T) {
}
}
}
func TestStandbyJoinMiss(t *testing.T) {
clusterSize := 2
_, etcds, err := CreateCluster(clusterSize, &os.ProcAttr{Files: []*os.File{nil, os.Stdout, os.Stderr}}, false)
if err != nil {
t.Fatal("cannot create cluster")
}
defer DestroyCluster(etcds)
c := etcd.NewClient(nil)
c.SyncCluster()
time.Sleep(1 * time.Second)
// Verify that we have two machines.
result, err := c.Get("_etcd/machines", false, true)
assert.NoError(t, err)
assert.Equal(t, len(result.Node.Nodes), clusterSize)
resp, _ := tests.Put("http://localhost:7001/v2/admin/config", "application/json", bytes.NewBufferString(`{"removeDelay":4, "syncInterval":4}`))
if !assert.Equal(t, resp.StatusCode, 200) {
t.FailNow()
}
time.Sleep(time.Second)
resp, _ = tests.Delete("http://localhost:7001/v2/admin/machines/node2", "application/json", nil)
if !assert.Equal(t, resp.StatusCode, 200) {
t.FailNow()
}
// Wait for a monitor cycle before checking for removal.
time.Sleep(server.ActiveMonitorTimeout + (1 * time.Second))
// Verify that we now have one peer.
result, err = c.Get("_etcd/machines", false, true)
assert.NoError(t, err)
assert.Equal(t, len(result.Node.Nodes), 1)
// Simulate the join failure
_, err = server.NewClient(nil).AddMachine("http://localhost:7001",
&server.JoinCommand{
MinVersion: store.MinVersion(),
MaxVersion: store.MaxVersion(),
Name: "node2",
RaftURL: "http://127.0.0.1:7002",
EtcdURL: "http://127.0.0.1:4002",
})
assert.NoError(t, err)
time.Sleep(6 * time.Second)
go tests.Delete("http://localhost:7001/v2/admin/machines/node2", "application/json", nil)
time.Sleep(time.Second)
result, err = c.Get("_etcd/machines", false, true)
assert.NoError(t, err)
assert.Equal(t, len(result.Node.Nodes), 1)
}