This function was fundamentally buggy, as a panic could be trivially
triggered by setting the wrong environment variable (e.g.
ETCD_BIND_ADDR=foo). Instead, let's propagate the error and present it
to the user in a cleaner way.
It is the correct thing to do to ensure that the communication is full
of out-of-date messages.
It results in that integration testing is very easy to throw MsgProp away,
and makes client wait until 5 min timeout. Sync interval and heartbeat are
increased to alleviate the traffic.
There's no real need to expose a Discoverer interface/struct when the
only use of the interface (and indeed the module) is to invoke a single
function. This isn't Java, after all. So instead, simplify to Discovery
exposing just two functions: JoinCluster (i.e. what was formerly called
"discovery"), and GetCluster (hitherto "ProxyDiscovery")
Node set the applied to committed right after it sends out Ready to application. This is not
correct since the application has not actually applied the entries at that point. We add a
Advance interface to Node. Application needs to call Advance to tell raft Node its progress.
Also this change can avoid unnecessary copying when application is still applying entires but
there are more entries to be applied.
1. etcd fatals if there is critical error in the system and operator should
do something for it
2. etcd panics if there happens something unexpected, and it should be
reported to us to debug.
When waiting for a watch result, we expect the good path to complete
quickly here so we don't need to time out so aggressively. (Failure
noted in #1600)
In certain cases (for example, if a cluster peer is accessible but it
has no members listed), the httpClusterClient could have an empty set of
endpoints as a result of the Sync. This means that its Do function could
potentially return a nil response and nil error, with catastrophic
consequences for callers.
To be safe (particularly about this latter behaviour), this change
errors in both Sync and Do if no endpoints are available.
The docs for the other APIs use curl for example usage, which matches
the docs for the etcd APIs.
Other cleanup include fixing usage of peer ports and using 10.0.0.x IPs
throughout.
- Provide more concrete examples and explanation
- Cleanup the formatting to one sentence per line, this makes reviewing
easier
- Point to existing docs on wal and snap instead of trying to duplicate
it here again.
This creates a simple ID type (wrapped around uint64) to provide for
standard serialization/deserialization to a string (i.e. base 16
encoded). This replaces strutil so now that package is removed.
There's no real need for do and doWithTimeout to return Responses when
the only field of interest is the status code.
This also removes the superfluous httpMembersAPIResponse struct.
If any of this initialization fails, something very bad has happened,
and we should not continue as-is (this has previously manifested in
strange bugs)
etcd checks that the data dir is writable by writing and removing an
empty file to the data dir during startup and exits non-zero if that
fails.
fixes#876
This is a simple solution to having the proxy keep up to date with the
state of the cluster. Basically, it uses the cluster configuration
provided at start up (i.e. with `-initial-cluster-state`) to determine
where to reach peer(s) in the cluster, and then it will periodically hit
the `/members` endpoint of those peer(s) (using the same mechanism that
`-cluster-state=existing` does to initialise) to update the set of valid
client URLs to proxy to.
This does not address discovery (#1376), and it would probably be better
to update the set of proxyURLs dynamically whenever we fetch the new
state of the cluster; but it needs a bit more thinking to have this done
in a clean way with the proxy interface.
Example in Procfile works again.
- Update all of the ports to use the new IANA port numbers
- Update the stats section to talk about the id fields
- Remove mention of the modules
- Remove -L from all of the curl commands since it is no longer needed
- Point people at the clustering.md guide for the cluster APIs
It returns error messaage like this now:
'{"errorCode":100,"message":"Key not found","cause":"/1/pants","index":10}'
The commit trims '/1' prefix from cause field if exists.
This is a hack to make it display well. It is correct because all error causes
that contain Path puts Path at the head of the string.
As explained in #1366, the leader will fail to transmit the missed
logs if the leader receives a hearbeat response from a follower
that is not yet matched in the leader. In other words, there are
append responses that do not explicitly reject an append but
implied a gap.
This commit is based on @xiangli-cmu's idea. We should only acknowledge
upto the index of logs in the append message. This way responses to
heartbeats would never interfer with the log synchronization because
their log index is always 0.
Fixes#1366
Continue using hex everywhere. Including here.
TODO: cleanup the printing of the structs which currently have decimal
to/from:
`{Type:MsgAppResp To:9973738105406047488 From:17050684879817348455 T...`
1) Don't panic since we know exactly where this is coming from and don't
need the user to see a full back trace
2) Add docs explaining this situation a bit further
3) Cleanup the error to look like other similiar errors
When adding new member, the etcdserver generates the ID based on the current time
and the given peerurls. We include time to add the uniqueness, since the node with
same peerurls should be able to (add, then remove) several times.
Removes the notion of name being anything more than advisory or
command-line grouping, and adds checks for bootstrapping the command
line. IDs are consistent if the URLs are consistent.
This adds the remaining two stats endpoints: `/v2/stats/self`, for
various statistics on the EtcdServer, and `/v2/stats/leader`, for
statistics on a leader's followers.
By and large most of the stats code is copied across from 0.4.x, updated
where necessary to integrate with the new decoupling of raft from
transport.
This does not satisfactorily resolve the question of name vs ID. In the
old world, names were unique in the cluster and transmitted over the
wire, so they could be used safely in all statistics. In the new world,
a given EtcdServer only knows its own name, and it is instead IDs that
are communicated among the cluster members. Hence in most places here we
simply substitute a string-encoded ID in place of name, and only where
possible do we retain the actual given name of the EtcdServer.
We use nodeID as the default dir previously. It works fine before we do dynamic nodeID
generation (introducing time). After the change the dynamic nodeID will change every
time we restart the etcd process. If the user does not provide the data dir, the default
dir will change every time. It is not the desired behavior.
In this commit, we change the default data dir to node name. If the user changes the node
name and does provide the data dir, etcd still cannot recover from previous state. But it
is much better than using nodeID. And it is actually a doucmentation issue.
Conflicts:
main.go
SSLv3 is no longer considered secure, and is not supported by golang
clients. Set the minimum version of all TLSConfigs that etcd uses to
ensure that only TLS >=1.0 can be used.
Since these are arrays instead of maps, the "keys" here are actually
(useless) goto labels. What really matters is that the ordering is
the same between the constant declarations and the array.
Before this commit, compact always compact log at current appliedindex of raft.
This prevents us from doing non-blocking snapshot since we have to make snapshot
and compact atomically. To prepare for non-blocking snapshot, this commit make
compact supports index and nodes parameters. After completing snapshot, the applier
should call compact with the snapshot index and the nodes at snapshot index to do
a compaction at snapsohot index.
In preperation for adding the ability to join a machine to an existing
cluster force the user to specify whether they expect this to me a new
cluster or an active one.
The error for not specifying the initial-cluster-state is:
```
etcd: initial cluster state unset and no wal found
```
send should not attach current term to msgProp. Send should simply do proxy for msgProp without
changing its term. msgProp has a special term 0, which indicates that it is a local message.
It's slightly unclear why we expose this timeout as being configurable,
and the `-timeout` flag does not exist in 0.4.x, so for now, remove the
flag until we have evidence that it is needed.
This adds a StartIndex field to the Watcher interface, which represents
the Etcd-Index at which the Watcher is created.
Also refactors the HTTP tests to use a table for most handleWatch tests
The usage of removed:
1. tell removed node about its removal explicitly using msgDenied
2. prevent removed node disrupt cluster progress by launching leader election
It is set when apply node removal, or receive msgDenied.
The -addr, -bind-addr, -peer-addr, and peer-bind-addr flags are
superseded by -advertise-client-urls, -listen-client-urls,
-advertise-peer-urls, and -listen-peer-urls, respectively.
If any of the former flags are provided, however, they will still
work. If the new counterparts to the old flags are provided, etcd
will log an error and fail.
This introduces two new concepts: the cluster and the member.
Members are logical etcd instances that have a name, raft ID, and a list
of peer and client addresses.
A cluster is made up of a list of members.
This prevents the bug like this:
1. a node sends join to a cluster and succeeds
2. it starts with empty peers and waits for sync, but it have not
received anything
3. election timeout passes, and it promotes itself to leader
4. it commits some log entry
5. its log conflicts with the cluster's
The work being done in the tests is completely wasted, as we do not
need to test the udnerlying x509 library. Faking out the parser function
allows the tests to run much faster without having to carry massive
fixtures, either.
This adds an EtcdIndex field to store.Event and uses that as the header
instead of the node's modifiedIndex. To facilitate this in a non-racy
way, we set the EtcdIndex while holding the lock.
Configure is used to propose config change. AddNode and RemoveNode is
used to apply cluster change to raft state machine. They are the
basics for dynamic configuration.
golang's concept of "zero" time does not align with the zero value of
a unix timestamp. Initializing time.Time with a unix timestamp of 0
makes time.Time.IsZero fail. Solve this by initializing time.Time only
if we care about the time.
This is to fix possible testing failures caused by sync tests.
Changes:
1. Get rid of time sleep operations, which introduces uncertainty.
2. Use fake Store.
This changes etcdserver.Server to an interface, with the former Server
(now "EtcdServer") becoming the canonical/production implementation.
This will facilitate better testing of the http server et al with mock
implementations of the interface.
It also more clearly defines the boundary for users of the Server.
The ReverseProxy code from the standard library doesn't actually
give us the control that we want. Pull it down and rip out what
we don't need, adding tests in the process.
All available endpoints are attempted when proxying a request. If a
proxied request fails, the upstream will be considered unavailable
for 5s and no more requests will be proxied to it. After the 5s is
up, the endpoint will be put back to rotation.
Make raftIndex section to be expected raftIndex of next entry.
It makes filename more intuitive and straight-forward.
The commit also adds comments for filename format.
golang convention dictates that the individual characters in an
abbreviation should all have the same case. Use ID instead of Id.
The protobuf generator still generates code that does not meet
this convention, but that's a fight for another day.
Add basic input validation of all query parameters supported by
serveKeys. Also restructures etcdhttp a bit to better facilitate
testing.
Test coverage is slightly improved.
Using 0xBEEF is annoying in examples because it makes it makes it look
like the user can use ascii or something. In the Procfile use
0x0,0x1,0x2,etc and use 0xBAD0 in test.
This adds a build script that attempts to be as user friendly as
possible: if they have already set $GOPATH and/or $GOBIN, use those
environment variables. If not, create a gopath for them in this
directory. This should facilitate both `go get` and `git clone` usage.
The `test` script is updated, and the new `cover` script facilitates
easy coverage generation for the repo's constituent packages by setting
the PKG environment variable.
Why do this? `go test ./...` has a ton of annoying output:
```
? github.com/coreos/etcd [no test files]
? github.com/coreos/etcd/crc [no test files]
? github.com/coreos/etcd/elog [no test files]
? github.com/coreos/etcd/error [no test files]
ok github.com/coreos/etcd/etcdserver 0.267s
ok github.com/coreos/etcd/etcdserver/etcdhttp 0.022s
? github.com/coreos/etcd/etcdserver/etcdserverpb [no test files]
ok github.com/coreos/etcd/raft 0.157s
? github.com/coreos/etcd/raft/raftpb [no test files]
ok github.com/coreos/etcd/snap 0.018s
? github.com/coreos/etcd/snap/snappb [no test files]
third_party/code.google.com/p/gogoprotobuf/proto/testdata/test.pb.go🔢
undefined: __emptyarchive__.Extension
ok github.com/coreos/etcd/store 4.247s
ok
github.com/coreos/etcd/third_party/code.google.com/p/go.net/context
2.724s
FAIL
github.com/coreos/etcd/third_party/code.google.com/p/gogoprotobuf/proto
[build failed]
ok
github.com/coreos/etcd/third_party/github.com/stretchr/testify/assert
0.013s
ok github.com/coreos/etcd/wait 0.010s
ok github.com/coreos/etcd/wal 0.024s
? github.com/coreos/etcd/wal/walpb [no test files]
```
And we have no had to manually configure drone.io which I want to avoid:
https://drone.io/github.com/coreos/etcd/admin
The documentation for these options has been incorrect ever since their
addition via commit 084dcb55. This broke upgrading a CoreOS stable
cluster to 410.0.0 because the user was using the example TOML config
which contains:
http_write_timeout = 10
This was silently ignored in the old stable version which predates the
addition of these options. After upgrading etcd began failing with:
Type mismatch for 'config.Config.http_write_timeout': Expected float
but found 'int64’.
Original issue: https://github.com/coreos/update_engine/issues/45
The default is for connections to last forever[1]. This leads to fds
leaking. I set the timeout so high by default so that watches don't have
to keep retrying but perhaps we should set it slower.
Tested on a cluster with lots of clients and it seems to have relieved
the problem.
[1] https://groups.google.com/forum/#!msg/golang-nuts/JFhGwh1q9xU/heh4J8pul3QJ
Original Commit: 084dcb5596
From: philips <brandon@ifup.org>
Many http clients will missbehave unless they get an initial http-
response, even when long-polling. It also saves the user/client from
having to handle headers on the first action of the watch, but rather
handle the response immediately.
Original commit: 2338481bb1
From: Christoffer Vikström
To make configuration change safe without adding configuration protocol:
1. We only allow to add/remove one node at a time.
2. We only allow one uncommitted configuration entry in the log.
These two rules can make sure there is no disjoint quorums in both current cluster and the
future(after applied any number of committed entries or uncommitted entries in log) clusters.
We add a type field in Entry structure for two reasons:
1. Statemachine needs to know if there is a pending configuration change.
2. Configuration entry should be executed by raft package rather application who is using raft.
@ -21,12 +21,13 @@ This is a rough outline of what a contributor's workflow looks like:
- Make sure your commit messages are in the proper format (see below).
- Push your changes to a topic branch in your fork of the repository.
- Submit a pull request to coreos/etcd.
- Your PR must receive a LGTM from two maintainers found in the MAINTAINERS file.
Thanks for your contributions!
### Code style
The coding style suggested by the Golang community is used in etcd. See [style doc](https://code.google.com/p/go-wiki/wiki/Style) for details.
The coding style suggested by the Golang community is used in etcd. See the [style doc](https://code.google.com/p/go-wiki/wiki/CodeReviewComments) for details.
Please follow this style to make etcd easy to review, maintain and develop.
When first started, etcd stores its configuration into a data directory specified by the data-dir configuration parameter.
Configuration is stored in the write ahead log and includes: the local member ID, cluster ID, and initial cluster configuration.
The write ahead log and snapshot files are used during member operation and to recover after a restart.
If a member’s data directory is ever lost or corrupted then the user should remove the etcd member from the cluster via the [members API][members-api].
A user should avoid restarting an etcd member with a data directory from an out-of-date backup.
Using an out-of-date data directory can lead to inconsistency as the member had agreed to store information via raft then re-joins saying it needs that information again.
For maximum safety, if an etcd member suffers any sort of data corruption or loss, it must be removed from the cluster.
Once removed the member can be re-added with an empty data directory.
If you are spinning up multiple clusters for testing it is recommended that you specify a unique initial-cluster-token for the different clusters.
This can protect you from cluster corruption in case of mis-configuration because two members started with different cluster tokens will refuse members from each other.
### Disaster Recovery
etcd is designed to be resilient to machine failures. An etcd cluster can automatically recover from any number of temporary failures (for example, machine reboots), and a cluster of N members can tolerate up to _(N/2)-1_ permanent failures (where a member can no longer access the cluster, due to hardware failure or disk corruption). However, in extreme circumstances, a cluster might permanently lose enough members such that quorum is irrevocably lost. For example, if a three-node cluster suffered two simultaneous and unrecoverable machine failures, it would be normally impossible for the cluster to restore quorum and continue functioning.
To recover from such scenarios, etcd provides functionality to backup and restore the datastore and recreate the cluster without data loss.
#### Backing up the datastore
The first step of the recovery is to backup the data directory on a functioning etcd node. To do this, use the `etcdctl backup` command, passing in the original data directory used by etcd. For example:
```sh
etcdctl backup \
--data-dir /var/lib/etcd \
--backup-dir /tmp/etcd_backup
```
This command will rewrite some of the metadata contained in the backup (specifically, the node ID and cluster ID), which means that the node will lose its former identity. In order to recreate a cluster from the backup, you will need to start a new, single-node cluster. The metadata is rewritten to prevent the new node from inadvertently being joined onto an existing cluster.
#### Restoring a backup
To restore a backup using the procedure created above, start etcd with the `-force-new-cluster` option and pointing to the backup directory. This will initialize a new, single-member cluster with the default advertised peer URLs, but preserve the entire contents of the etcd data store. Continuing from the previous example:
```sh
etcd \
-data-dir=/tmp/etcd_backup \
-force-new-cluster \
...
```
Now etcd should be available on this node and serving the original datastore.
Once you have verified that etcd has started successfully, shut it down and move the data back to the previous location (you may wish to make another copy as well to be safe):
```sh
pkill etcd
rm -fr /var/lib/etcd
mv /tmp/etcd_backup /var/lib/etcd
etcd \
-data-dir=/var/lib/etcd \
...
```
#### Restoring the cluster
Now that the node is running successfully, you can add more nodes to the cluster and restore resiliency. See the [runtime configuration](runtime-configuration.md) guide for more details.
This guide will walk you through configuring a three machine etcd cluster with the following details:
|Name |Address |
|-------|-----------|
|infra0 |10.0.1.10 |
|infra1 |10.0.1.11 |
|infra2 |10.0.1.12 |
## Static
As we know the cluster members, their addresses and the size of the cluster before starting we can use an offline bootstrap configuration. Each machine will get either the following command line or environment variables:
If you are spinning up multiple clusters (or creating and destroying a single cluster) with same configuration for testing purpose, it is highly recommended that you specify a unique `initial-cluster-token` for the different clusters. By doing this, etcd can generate unique cluster IDs and member IDs for the clusters even if they otherwise have the exact same configuration. This can protect you from cross-cluster-interaction, which might corrupt your clusters.
On each machine you would start etcd with these flags:
The command line parameters starting with `-initial-cluster` will be ignored on subsequent runs of etcd. You are free to remove the environment variables or command line flags after the initial bootstrap process. If you need to make changes to the configuration later see our guide on [runtime configuration](runtime-configuration.md).
### Error Cases
In the following case we have not included our new host in the list of enumerated nodes. If this is a new cluster, the node must be added to the list of initial cluster members.
etcd: infra1 not listed in the initial cluster config
exit 1
```
In this case we are attempting to map a node (infra0) on a different address (127.0.0.1:2380) than its enumerated address in the cluster list (10.0.1.10:2380). If this node is to listen on multiple addresses, all addresses must be reflected in the "initial-cluster" configuration directive.
etcd: conflicting cluster ID to the target cluster (c6ab534d07e8fcc4 != bc25ea2a74fb18b0). Exiting.
exit 1
```
## Discovery
In a number of cases you might not know the IPs of your cluster peers ahead of time. This is common when utilizing cloud providers or when your network uses DHCP. In these cases you can use an existing etcd cluster to bootstrap a new one. We call this process "discovery".
### Lifetime of a Discovery URL
A discovery URL identifies a unique etcd cluster. Instead of reusing a discovery URL, you should always create discovery URLs for new clusters.
Moreover, discovery URLs should ONLY be used for the initial bootstrapping of a cluster. To change cluster membership after the cluster is already running, see the [runtime reconfiguration][runtime] guide.
Discovery uses an existing cluster to bootstrap itself. If you are using your own etcd cluster you can create a URL like so:
```
$ curl -X PUT https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/_config/size -d value=3
```
By setting the size key to the URL, you create a discovery URL with expected-cluster-size of 3.
The URL you will use in this case will be `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` and the etcd members will use the `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` directory for registration as they start.
Now we start etcd with those relevant flags for each member:
This will cause each member to register itself with the custom etcd discovery service and begin the cluster once all machines have been registered.
### Public discovery service
If you do not have access to an existing cluster you can use the public discovery service hosted at discovery.etcd.io. You can create a private discovery URL using the "new" endpoint like so:
etcd: warn: ignoring discovery URL: etcd has already been initialized and has a valid log in /var/lib/etcd
```
# 0.4 to 0.5+ Migration Guide
In etcd 0.5 we introduced the ability to listen on more than one address and to advertise multiple addresses. This makes using etcd easier when you have complex networking, such as private and public networks on various cloud providers.
To make understanding this feature easier, we changed the naming of some flags, but we support the old flags to make the migration from the old to new version easier.
|-peer-addr |-initial-advertise-peer-urls |If specified, peer-addr will be used as the only peer URL. Error if both flags specified.|
|-addr |-advertise-client-urls |If specified, addr will be used as the only client URL. Error if both flags specified.|
|-peer-bind-addr |-listen-peer-urls |If specified, peer-bind-addr will be used as the only peer bind URL. Error if both flags specified.|
|-bind-addr |-listen-client-urls |If specified, bind-addr will be used as the only client bind URL. Error if both flags specified.|
|-peers |none |Deprecated. The -initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
|-peers-file |none |Deprecated. The -initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
This document defines the various terms used in etcd documentation, command line and source code.
### Node
Node is an instance of raft state machine.
It has a unique identification, and records other nodes' progress internally when it is the leader.
### Member
Member is an instance of etcd. It hosts a node, and provides service to clients.
### Cluster
Cluster consists of several members.
The node in each member follows raft consensus protocol to replicate logs. Cluster receives proposals from members, commits them and apply to local store.
Return an HTTP 200 OK response code and a representation of all members in the etcd cluster.
### Request
```
GET /v2/members HTTP/1.1
```
### Example
```
curl http://10.0.0.10:2379/v2/members
```
```json
{
"members":[
{
"id":"272e204152",
"name":"infra1",
"peerURLs":[
"http://10.0.0.10:2380"
],
"clientURLs":[
"http://10.0.0.10:2379"
]
},
{
"id":"2225373f43",
"name":"infra2",
"peerURLs":[
"http://10.0.0.11:2380"
],
"clientURLs":[
"http://10.0.0.11:2379"
]
},
]
}
```
## Add a member
Returns an HTTP 201 response code and the representation of added member with a newly generated a memberID when successful. Returns a string describing the failure condition when unsuccessful.
If the POST body is malformed an HTTP 400 will be returned. If the member exists in the cluster or existed in the cluster at some point in the past an HTTP 409 will be returned. If any of the given peerURLs exists in the cluster an HTTP 409 will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
Remove a member from the cluster. The member ID must be a hex-encoded uint64.
Returns empty when successful. Returns a string describing the failure condition when unsuccessful.
If the member does not exist in the cluster an HTTP 500(TODO: fix this) will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
etcd comes with support for incremental runtime reconfiguration, which allows users to update the membership of the cluster at run time.
## Reconfiguration Use Cases
Let us walk through the four use cases for re-configuring a cluster: replacing a member, increasing or decreasing cluster size, and restarting a cluster from a majority failure.
### Replace a Member
The most common use case of cluster reconfiguration is to replace a member because of a permanent failure of the existing member: for example, hardware failure, loss of network address, or data directory corruption.
It is important to replace failed members as soon as the failure is detected.
If etcd falls below a simple majority of members it can no longer accept writes: e.g. in a 3 member cluster the loss of two members will cause writes to fail and the cluster to stop operating.
### Increase Cluster Size
To make your cluster more resilient to machine failure you can increase the size of the cluster.
For example, if the cluster consists of three machines, it can tolerate one failure.
If we increase the cluster size to five, it can tolerate two machine failures.
Increasing the cluster size can also provide better read performance.
When a client accesses etcd, the normal read gets the data from the local copy of each member (members always shares the same view of the cluster at the same index, which is guaranteed by the sequential consistency of etcd).
Since clients can read from any member, increasing the number of members thus increases overall read throughput.
### Decrease Cluster Size
To improve the write performance of a cluster, you might want to trade off resilience by removing members.
etcd replicates the data to the majority of members of the cluster before committing the write.
Decreasing the cluster size means the etcd cluster has to do less work for each write, thus increasing the write performance.
### Restart Cluster from Majority Failure
If the majority of your cluster is lost, then you need to take manual action in order to recover safely.
The basic steps in the recovery process include creating a new cluster using the old data, forcing a single member to act as the leader, and finally using runtime configuration to add members to this new cluster.
TODO: https://github.com/coreos/etcd/issues/1242
## Cluster Reconfiguration Operations
Now that we have the use cases in mind, let us lay out the operations involved in each.
Before making any change, the simple majority (quorum) of etcd members must be available.
This is essentially the same requirement as for any other write to etcd.
All changes to the cluster are done one at a time:
To replace a single member you will make an add then a remove operation
To increase from 3 to 5 members you will make two add operations
To decrease from 5 to 3 you will make two remove operations
All of these examples will use the `etcdctl` command line tool that ships with etcd.
If you want to use the member API directly you can find the documentation [here](https://github.com/coreos/etcd/blob/master/Documentation/0.5/other_apis.md).
Let us say the member ID we want to remove is a8266ecf031671f3.
We then use the `remove` command to perform the removal:
```
$ etcdctl member remove a8266ecf031671f3
Removed member a8266ecf031671f3 from cluster
```
The target member will stop itself at this point and print out the removal in the log:
```
etcd: this member has been permanently removed from the cluster. Exiting.
```
Removal of the leader is safe, but the cluster will be out of progress for a period of election timeout because it needs to elect the new leader.
### Add a Member
Adding a member is a two step process:
* Add the new member to the cluster via the [members API](https://github.com/coreos/etcd/blob/master/Documentation/0.5/other_apis.md#post-v2members) or the `etcdctl member add` command.
* Start the member with the correct configuration.
Using `etcdctl` let's add the new member to the cluster:
The new member will run as a part of the cluster and immediately begin catching up with the rest of the cluster.
If you are adding multiple members the best practice is to configure the new member, then start the process, then configure the next, and so on.
A common case is increasing a cluster from 1 to 3: if you add one member to a 1-node cluster, the cluster cannot make progress before the new member starts because it needs two members as majority to agree on the consensus.
#### Error Cases
In the following case we have not included our new host in the list of enumerated nodes.
If this is a new cluster, the node must be added to the list of initial cluster members.
@ -13,11 +13,11 @@ Please note - at least 3 nodes are required for [cluster availability][optimal-c
## Using discovery.etcd.io
### Create a Token
### Create a Discovery URL
To use the discovery API, you must first create a token for your etcd cluster. Visit [https://discovery.etcd.io/new](https://discovery.etcd.io/new) to create a new token.
To use the discovery API, you must first create a unique discovery URL for your etcd cluster. Visit [https://discovery.etcd.io/new](https://discovery.etcd.io/new) to create a new discovery URL.
You can inspect the list of peers by viewing `https://discovery.etcd.io/<token>`.
You can inspect the list of peers by viewing `https://discovery.etcd.io/<cluster id>`.
### Start etcd With the Discovery Flag
@ -26,10 +26,10 @@ Specify the `-discovery` flag when you start each etcd instance. The list of exi
The discovery API communicates with a separate etcd cluster to store and retrieve the list of peers. CoreOS provides [https://discovery.etcd.io](https://discovery.etcd.io) as a free service, but you can easily run your own etcd cluster for this purpose. Here's an example using an etcd cluster located at `10.10.10.10:4001`:
If you're interested in how to discovery API works behind the scenes, read about the [Discovery Protocol](https://github.com/coreos/etcd/blob/master/Documentation/discovery-protocol.md).
@ -167,3 +167,9 @@ Etcd can also do internal server-to-server communication using SSL client certs.
To do this just change the `-*-file` flags to `-peer-*-file`.
If you are using SSL for server-to-server communication, you must use it on all instances of etcd.
### Bootstrapping a new cluster by name
An etcd server is uniquely defined by the peer addresses it listens to. Suppose, however, that you wish to start over, while maintaining the data from the previous cluster -- that is, to pretend that this machine has never joined a cluster before.
You can use `--initial-cluster-name` to generate a new unique ID for each node, as a shared token that every node understands. Nodes also take this into account for bootstrapping the new cluster ID, so it also provides a way for a machine to listen on the same interfaces, disconnect from one cluster, and join a different cluster.
Etcd supports SSL/TLS as well as authentication through client certificates, both for clients to server as well as peer (server to server / cluster) communication.
Etcd supports SSL/TLS and client cert authentication for clients to server, as well as server to server communication.
To get up and running you first need to have a CA certificate and a signed key pair for your node. It is recommended to create and sign a new key pair for every node in a cluster.
For convenience the [etcd-ca](https://github.com/coreos/etcd-ca) tool provides an easy interface to certificate generation, alternatively this site provides a good reference on how to generate self-signed key pairs:
First, you need to have a CA cert `clientCA.crt` and signed key pair `client.crt`, `client.key`.
This site has a good reference for how to generate self-signed key pairs:
http://www.g-loaded.eu/2005/11/10/be-your-own-ca/
Or you could use [etcd-ca](https://github.com/coreos/etcd-ca) to generate certs and keys.
For testing you can use the certificates in the `fixtures/ca` directory.
## Basic setup
Let's configure etcd to use this keypair:
Etcd takes several certificate related configuration options, either through command-line flags or environment variables:
**Client-to-server communication:**
`--cert-file=<path>`: Certificate used for SSL/TLS connections **to** etcd. When this option is set, you can reach etcd through HTTPS - for example at `https://127.0.0.1:4001`
`--key-file=<path>`: Key for the certificate. Must be unencrypted.
`--ca-file=<path>`: When this is set etcd will check all incoming HTTPS requests for a client certificate signed by the supplied CA, requests that don't supply a valid client certificate will fail.
The peer options work the same way as the client-to-server options:
`--peer-cert-file=<path>`: Certificate used for SSL/TLS connections between peers. This will be used both for listening on the peer address as well as sending requests to other peers.
`--peer-key-file=<path>`: Key for the certificate. Must be unencrypted.
`--peer-ca-file=<path>`: When set, etcd will check all incoming peer requests from the cluster for valid client certificates signed by the supplied CA.
If either a client-to-server or peer certificate is supplied the key must also be set. All of these configuration options are also available through the environment variables, `ETCD_CA_FILE`, `ETCD_PEER_CA_FILE` and so on.
## Example 1: Client-to-server transport security with HTTPS
For this you need your CA certificate (`ca.crt`) and signed key pair (`server.crt`, `server.key`) ready. If you just want to test the functionality, there are example certificates provided in the [etcd git repository](https://github.com/coreos/etcd/tree/master/fixtures/ca) (namely `server.crt` and `server.key.insecure`).
Assuming you have these files ready, let's configure etcd to use them to provide simple HTTPS transport security.
*`-f` - forces a new machine configuration, even if an existing configuration is found. (WARNING: data loss!)
*`-cert-file` and `-key-file` specify the location of the cert and key files to be used for for transport layer security between the client and server.
You can now test the configuration using HTTPS:
This should start up fine and you can now test the configuration by speaking HTTPS to etcd:
You should be able to see the handshake succeed. Because we use self-signed certificates with our own certificate authorities you need to provide the CA to curl using the `--cacert` option. Another possibility would be to add your CA certificate to the trusted certificates on your system (usually in `/etc/ssl/certs`).
**OSX 10.9+ Users**: curl 7.30.0 on OSX 10.9+ doesn't understand certificates passed in on the command line.
Instead you must import the dummy ca.crt directly into the keychain or add the `-k` flag to curl to ignore errors.
@ -36,42 +52,28 @@ If you want to test without the `-k` flag run `open ./fixtures/ca/ca.crt` and fo
Please remove this certificate after you are done testing!
If you know of a workaround let us know.
```
...
SSLv3, TLS handshake, Finished (20):
...
```
## Example 2: Client-to-server authentication with HTTPS client certificates
And also the response from the etcd server:
For now we've given the etcd client the ability to verify the server identity and provide transport security. We can however also use client certificates to prevent unauthorized access to etcd.
```json
{
"action":"set",
"key":"/foo",
"modifiedIndex":3,
"value":"bar"
}
```
The clients will provide their certificates to the server and the server will check whether the cert is signed by the supplied CA and decide whether to serve the request.
## Authentication with HTTPS Client Certificates
We can also do authentication using CA certs.
The clients will provide their cert to the server and the server will check whether the cert is signed by the CA and decide whether to serve the request.
You need the same files mentioned in the first example for this, as well as a key pair for the client (`client.crt`, `client.key`) signed by the same certificate authority.
@ -108,7 +110,28 @@ And also the response from the server:
}
```
### Why SSLv3 alert handshake failure when using SSL client auth?
## Example 3: Transport security & client certificates in a cluster
Etcd supports the same model as above for **peer communication**, that means the communication between etcd nodes in a cluster.
Assuming we have our `ca.crt` and two nodes with their own keypairs (`node1.crt`&`node1.key`, `node2.crt`&`node2.key`) signed by this CA, we launch etcd as follows:
```sh
DISCOVERY_URL=... # from https://discovery.etcd.io/new
The etcd nodes will form a cluster and all communication between nodes in the cluster will be encrypted and authenticated using the client certificates. You will see in the output of etcd that the addresses it connects to use HTTPS.
## Frequently Asked Questions
### I'm seeing a SSLv3 alert handshake failure when using SSL client authentication?
The `crypto/tls` package of `golang` checks the key usage of the certificate public key before using it.
To use the certificate public key to do client auth, we need to add `clientAuth` to `Extended Key Usage` when creating the certificate public key.
@ -129,3 +152,8 @@ When creating the cert be sure to reference it in the `-extensions` flag:
### With peer certificate authentication I receive "certificate is valid for 127.0.0.1, not $MY_IP"
Make sure that you sign your certificates with a Subject Name your node's public IP address. The `etcd-ca` tool for example provides an `--ip=` option for its `new-cert` command.
If you need your certificate to be signed for your node's FQDN in its Subject Name then you could use Subject Alternative Names (short IP SNAs) to add your IP address. This is not [currently supported](https://github.com/coreos/etcd-ca/issues/29) by `etcd-ca` but can be done [with openssl](http://wiki.cacert.org/FAQ/subjectAltName).
The default settings in etcd should work well for installations on a local network where the average network latency is low.
However, when using etcd across multiple data centers or over networks with high latency you may need to tweak the heartbeat interval and election timeout settings.
The network isn't the only source of latency. Each request and response may be impacted by slow disks on both the leader and follower. Each of these timeouts represents the total time from request to successful response from the other machine.
### Time Parameters
The underlying distributed consensus protocol relies on two separate time parameters to ensure that nodes can handoff leadership if one stalls or goes offline.
@ -41,7 +43,7 @@ Or you can set the values within the configuration file:
cli.go is simple, fast, and fun package for building command line apps in Go. The goal is to enable developers to write fast and distributable command line applications in an expressive way.
You can view the API docs here:
http://godoc.org/github.com/codegangsta/cli
## Overview
Command line apps are usually so tiny that there is absolutely no reason why your code should *not* be self-documenting. Things like generating help text and parsing command flags/options should not hinder productivity when writing a command line app.
**This is where cli.go comes into play.** cli.go makes command line programming fun, organized, and expressive!
## Installation
Make sure you have a working Go environment (go 1.1 is *required*). [See the install instructions](http://golang.org/doc/install.html).
To install `cli.go`, simply run:
```
$ go get github.com/codegangsta/cli
```
Make sure your `PATH` includes to the `$GOPATH/bin` directory so your commands can be easily used:
```
export PATH=$PATH:$GOPATH/bin
```
## Getting Started
One of the philosophies behind cli.go is that an API should be playful and full of discovery. So a cli.go app can be as little as one line of code in `main()`.
``` go
package main
import (
"os"
"github.com/codegangsta/cli"
)
func main() {
cli.NewApp().Run(os.Args)
}
```
This app will run and show help text, but is not very useful. Let's give an action to execute and some help documentation:
``` go
package main
import (
"os"
"github.com/codegangsta/cli"
)
func main() {
app := cli.NewApp()
app.Name = "boom"
app.Usage = "make an explosive entrance"
app.Action = func(c *cli.Context) {
println("boom! I say!")
}
app.Run(os.Args)
}
```
Running this already gives you a ton of functionality, plus support for things like subcommands and flags, which are covered below.
## Example
Being a programmer can be a lonely job. Thankfully by the power of automation that is not the case! Let's create a greeter app to fend off our demons of loneliness!
``` go
/* greet.go */
package main
import (
"os"
"github.com/codegangsta/cli"
)
func main() {
app := cli.NewApp()
app.Name = "greet"
app.Usage = "fight the loneliness!"
app.Action = func(c *cli.Context) {
println("Hello friend!")
}
app.Run(os.Args)
}
```
Install our command to the `$GOPATH/bin` directory:
help, h Shows a list of commands or help for one command
GLOBAL OPTIONS
--version Shows version information
```
### Arguments
You can lookup arguments by calling the `Args` function on `cli.Context`.
``` go
...
app.Action = func(c *cli.Context) {
println("Hello", c.Args()[0])
}
...
```
### Flags
Setting and querying flags is simple.
``` go
...
app.Flags = []cli.Flag {
cli.StringFlag{
Name: "lang",
Value: "english",
Usage: "language for the greeting",
},
}
app.Action = func(c *cli.Context) {
name := "someone"
if len(c.Args()) > 0 {
name = c.Args()[0]
}
if c.String("lang") == "spanish" {
println("Hola", name)
} else {
println("Hello", name)
}
}
...
```
#### Alternate Names
You can set alternate (or short) names for flags by providing a comma-delimited list for the `Name`. e.g.
``` go
app.Flags = []cli.Flag {
cli.StringFlag{
Name: "lang, l",
Value: "english",
Usage: "language for the greeting",
},
}
```
#### Values from the Environment
You can also have the default value set from the environment via `EnvVar`. e.g.
``` go
app.Flags = []cli.Flag {
cli.StringFlag{
Name: "lang, l",
Value: "english",
Usage: "language for the greeting",
EnvVar: "APP_LANG",
},
}
```
That flag can then be set with `--lang spanish` or `-l spanish`. Note that giving two different forms of the same flag in the same command invocation is an error.
### Subcommands
Subcommands can be defined for a more git-like command line app.
Feel free to put up a pull request to fix a bug or maybe add a feature. I will give it a code review and make sure that it does not break backwards compatibility. If I or any other collaborators agree that it is in line with the vision of the project, we will work with you to get the code into a mergeable state and merge it into the master branch.
If you are have contributed something significant to the project, I will most likely add you as a collaborator. As a collaborator you are given the ability to merge others pull requests. It is very important that new code does not break existing code, so be careful about what code you do choose to merge. If you have any questions feel free to link @codegangsta to the issue in question and we can review it together.
If you feel like you have contributed to the project but have not yet been added as a collaborator, I probably forgot to add you. Hit @codegangsta up over email and we will get it figured out.
## About
cli.go is written by none other than the [Code Gangsta](http://codegangsta.io)
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.