48 Commits

Author SHA1 Message Date
d4e86108e3 Merge pull request #17000 from siyuanfoundation/livez-bp-3.5-step1
[3.5] Backport healthcheck code cleanup
2023-11-27 19:59:40 +01:00
46e394242f server: Split metrics and health code
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 09:31:00 -08:00
8ab1c0f25b server: Cover V3 health with tests
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 09:31:00 -08:00
5a564d56d7 Backport server: Have tracingExporter own resources it initialises.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-11-26 10:22:10 +13:00
db16069588 backport #14125 to release-3.5: Update to grpc-1.47 (and fix the connection-string format)
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-10-12 09:46:49 -07:00
073c530989 server: Fix defer function closure escape
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 16:01:29 +02:00
c0421c7330 server: Add --listen-client-http-urls flag to allow running grpc server separate from http server
Differences in load configuration for the watch delay tests show how large
the impact is. Even with the random write scheduler, grpc under the http
server can only handle 500 KB with a 2 second delay. On the other hand, a
separate grpc server easily hits 10, 100 or even 1000 MB within 100 milliseconds.

The priority write scheduler that was used in most previous releases
is far worse than the random one.

Tests are configured to use only 5 MB to avoid flakes and to avoid taking
too long to fill etcd.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 15:53:11 +02:00
2d5f48a7ef server: Pick one address that all grpc gateways connect to
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 15:11:59 +02:00
a9e0a04c9a server: Extract resolveUrl helper function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 15:11:59 +02:00
245067b15d server: Separate client listener grouping from serving
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 15:11:59 +02:00
63576a25f5 refactor: Use proper variable names for urls
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 15:11:58 +02:00
358bcf3fb6 Backport tls 1.3 support.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-03-15 14:10:14 +13:00
f6c4c84da3 etcdserver: added more debug log for the purgeFile goroutine
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-10-12 19:28:32 +08:00
5660bf0e7f server: Make corruption check optional and period configurable
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
4a75e3d52d server: Refactor compaction checker
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
35cbdf3961 server: Extract corruption detection to dedicated struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
9ea5b1ba22 Refactor the keepAliveListener and keepAliveConn
Only `net.TCPConn` supports `SetKeepAlive` and `SetKeepAlivePeriod`
by default, so if you want to wrap multiple layers of `net.Listener`,
the `keepaliveListener` should be the one closest to the
original `net.Listener` implementation, namely `TCPListener`.

Also refer to https://github.com/etcd-io/etcd/pull/14356

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-20 15:03:15 +08:00
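The layering constraint in the commit above can be sketched as follows. This is a minimal illustration, not the etcd implementation: the `keepAliveListener` type, its `period` field, and the `demo` helper are hypothetical names. The wrapper embeds `*net.TCPListener` directly, so it can call `AcceptTCP` and set keepalive options before any outer layer (TLS, connection limiting) hides the concrete connection type.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// keepAliveListener is a hypothetical wrapper that sits directly on top of
// *net.TCPListener, so every accepted connection is still a *net.TCPConn.
type keepAliveListener struct {
	*net.TCPListener
	period time.Duration
}

// Accept enables TCP keepalive before handing the connection to outer layers.
func (l keepAliveListener) Accept() (net.Conn, error) {
	c, err := l.AcceptTCP()
	if err != nil {
		return nil, err
	}
	if err := c.SetKeepAlive(true); err != nil {
		c.Close()
		return nil, err
	}
	if err := c.SetKeepAlivePeriod(l.period); err != nil {
		c.Close()
		return nil, err
	}
	return c, nil
}

// demo listens on a loopback port, dials it once, and accepts the
// connection through the wrapper.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()

	tcpLn, ok := ln.(*net.TCPListener)
	if !ok {
		return fmt.Errorf("unexpected listener type %T", ln)
	}
	var wrapped net.Listener = keepAliveListener{TCPListener: tcpLn, period: 30 * time.Second}

	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return err
	}
	defer client.Close()

	conn, err := wrapped.Accept()
	if err != nil {
		return err
	}
	defer conn.Close()

	if _, ok := conn.(*net.TCPConn); !ok {
		return fmt.Errorf("unexpected conn type %T", conn)
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("accepted a keepalive-enabled TCP connection")
}
```

Any further wrappers would take `wrapped` as their inner `net.Listener`; placing the keepalive layer outside them would leave it without access to `SetKeepAlive`.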
8fdca41cd8 Change default sampling rate from 100% to 0%
Refer to https://github.com/etcd-io/etcd/pull/14318

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:19:30 +08:00
5a86ae2c33 move setupTracing into a separate file config_tracing.go
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:17:27 +08:00
4c013c91e9 Change default sampling rate from 100% to 0%
This changes the default parent-based trace sampling rate from
100% to 0%. Due to the high QPS etcd can handle, having 100% trace
sampling leads to very high resource usage. Defaulting to 0% means
that only already-sampled traces will be sampled in etcd.

Fixes #14310

Signed-off-by: Mike Dame <mikedame@google.com>
2022-08-05 15:00:40 +00:00
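The effect of that default can be modelled with a small sketch. It is a simplified stand-in for parent-based sampling, not the OpenTelemetry SDK: the `parentBased` type, its `rootFraction` field, and the `draw` parameter (a number in [0, 1) that a real sampler would derive itself) are assumptions made for illustration.

```go
package main

import "fmt"

// parentBased is a simplified, hypothetical model of a parent-based sampler.
type parentBased struct {
	rootFraction float64 // fraction of parentless (root) spans to sample, 0.0 to 1.0
}

// shouldSample follows the parent's decision when a parent exists and
// otherwise samples a rootFraction share of spans. draw must be in [0, 1).
func (p parentBased) shouldSample(hasParent, parentSampled bool, draw float64) bool {
	if hasParent {
		return parentSampled
	}
	return draw < p.rootFraction
}

func main() {
	oldDefault := parentBased{rootFraction: 1.0} // 100%
	newDefault := parentBased{rootFraction: 0.0} // 0%

	fmt.Println(oldDefault.shouldSample(false, false, 0.42)) // root span, old default
	fmt.Println(newDefault.shouldSample(false, false, 0.42)) // root span, new default
	fmt.Println(newDefault.shouldSample(true, true, 0.42))   // parent already sampled
}
```

With a root fraction of 0, a span is recorded only when the incoming request already carries a sampled parent, which matches the behaviour the commit message describes.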
437f3778d0 Add flag --max-concurrent-streams to set the maximum number of concurrent streams each client can open at a time
Also refer to https://github.com/etcd-io/etcd/pull/14169#discussion_r917154243

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-13 14:18:15 +08:00
d00e89db2e server: Require either cluster version v3.6 or --experimental-enable-lease-checkpoint-persist to persist lease remainingTTL
To avoid inconsistent behavior during cluster upgrade we are feature
gating persistence behind cluster version. This should ensure that
all cluster members are upgraded to v3.6 before changing behavior.

To allow backporting this fix to v3.5 we are also introducing flag
--experimental-enable-lease-checkpoint-persist that will allow for
smooth upgrade in v3.5 clusters with this feature enabled.
2021-12-02 16:54:10 +01:00
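The gating rule described above reduces to a single predicate. The sketch below is illustrative only: the `version` type and the `shouldPersistTTL` function are hypothetical names, not etcd's own, and the boolean stands in for the `--experimental-enable-lease-checkpoint-persist` flag.

```go
package main

import "fmt"

// version is a hypothetical major.minor cluster version.
type version struct{ major, minor int }

// atLeast reports whether v is greater than or equal to other.
func (v version) atLeast(other version) bool {
	return v.major > other.major || (v.major == other.major && v.minor >= other.minor)
}

// shouldPersistTTL returns true when the cluster version is at least 3.6
// or when the experimental flag has been set explicitly.
func shouldPersistTTL(cluster version, flagEnabled bool) bool {
	return flagEnabled || cluster.atLeast(version{3, 6})
}

func main() {
	fmt.Println(shouldPersistTTL(version{3, 5}, false)) // 3.5 cluster, flag off
	fmt.Println(shouldPersistTTL(version{3, 5}, true))  // 3.5 cluster, flag on
	fmt.Println(shouldPersistTTL(version{3, 6}, false)) // 3.6 cluster, flag off
}
```

Keying the check on the cluster version rather than the local binary version means a member does not change behaviour until the whole cluster has been upgraded.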
58d2b12a50 client: Add grpc authority header integration tests 2021-09-30 12:15:32 +02:00
85341e08f2 Merge pull request #12968 from serathius/logger-simplify
server: Simplify passing logger setup by passing only logger
2021-05-15 15:58:00 +02:00
41ed74824e server: Simplify passing logger setup by passing only logger 2021-05-14 13:14:48 +02:00
ead81df948 Disallow -v2-deprecation>'not-yet' combined with --enable-v2 2021-05-12 18:09:34 +02:00
e0a8484c8f Merge pull request #12941 from serathius/defrag
etcdserver: Implement running defrag if freeable space will exceed provided threshold (on boot)
2021-05-12 09:26:56 +02:00
efc8505739 etcdserver: Implement running defrag if freeable space will exceed provided threshold 2021-05-11 14:00:29 +02:00
269f22c837 Deprecate V2 API: --enable-v2 and v2v3
Flags `--experimental-enable-v2v3` and `--enable-v2` will raise a warning in 3.5;
in 3.6 they are scheduled for decommissioning, so that v2store can stop being written in 3.7.

Deprecation plan in: https://github.com/etcd-io/etcd/issues/12913
2021-05-10 16:19:52 +02:00
1a718a958e Add initial Tracing with OpenTelemetry 2021-05-10 10:44:40 +02:00
344c9f3930 Merge pull request #12896 from wilsonwang371/profiling-txn2
server: make applier use ReadTx() in Txn() instead of ConcurrentReadTx()
2021-05-06 01:59:14 -07:00
98083ea914 server: add experimental flag for using shared buffer in transaction write 2021-05-04 11:59:08 -07:00
f53b70facb Embed: In case KVStoreHash verification fails, close the backend.
In case of failed verification, the server used to keep the backend open
(so the file stayed locked at the OS level).
2021-04-29 11:51:25 +02:00
c4b13a5c83 Integrate verification framework
Verification framework is integrated with:
  - integration tests (by default)
  - `ETCD_VERIFY=all etcdctl snapshot restore` command
  - etcd shutdown when running with `ETCD_VERIFY=all` env.
2021-04-28 07:56:16 +02:00
b47c5fcc12 Address review comments a.d. logging. 2021-04-15 17:54:37 +02:00
e776efbb2a Merge pull request #12828 from ptabor/20210404-embed-etcd
embed: etcd.Close() is closing Errc() channel as well.
2021-04-08 01:20:07 +02:00
5da9cac193 embed: etcd.Close() is closing Errc() channel as well.
Inspired by https://github.com/etcd-io/etcd/pull/9612 by purpleidea@.
2021-04-08 01:19:13 +02:00
3bb7acc8cf Migrate dependencies pkg/foo -> client/pkg/foo 2021-04-07 00:38:47 +02:00
18382aa234 Fix 2 sources of leaked memory: embed server HTTP & v3_snapshot.lessor. 2021-03-16 22:20:00 +01:00
fd7fed1511 Move config (ServerConfig) out of etcdserver package.
Motivation:
  - ServerConfig is part of the 'embed' public API, while etcdserver is more 'internal'
  - EtcdServer is already too big, and the config is a widely used leaf dependency
if we were to split etcdserver (e.g. into pre- and post-apply parts).
2021-03-11 20:56:22 +01:00
94a371acd7 Merge pull request #12750 from ptabor/20210306-mlock
--experimental-memory-mlock support
2021-03-09 09:13:40 -08:00
5b49fb41c8 fixup: add ListenerOptions
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-03-08 11:27:03 -05:00
a46a358577 --experimental-memory-mlock support
The flag protects etcd memory from being swapped out to disk.
This can happen in memory-constrained systems where the mmapped bbolt
area is a natural candidate for swapping out.

This flag should provide better tail latency at the cost of higher RSS
RAM usage. If the experiment is successful, the logic should be moved
into the bbolt layer, where we can protect specific bbolt instances
(e.g. avoid protecting both during defragmentation).
2021-03-07 12:32:57 +01:00
49078c683b *: add support for socket options
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-02-19 13:31:23 -05:00
3f6e0ec94b fix: pass argument url in defer to avoid loopclosure
Because of the well-known range loop closure issue, the value of u may
have changed by the time the anonymous function mentioned in the defer
is run. To address this, the simplest fix is to pass the url used in the
loop as an argument to the function run in defer.
2020-11-19 15:29:26 -06:00
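The fix described in that commit can be illustrated in isolation. This is a minimal sketch, not the patched etcd code: `cleanupOrder` and the URLs are made up for the example. Arguments to a deferred call are evaluated when the `defer` statement executes, so each deferred function receives the URL of its own iteration. Since Go 1.22 each iteration has its own copy of the loop variable, so the original bug does not reproduce on recent toolchains; the argument-passing form is correct on every version.

```go
package main

import "fmt"

// cleanupOrder registers one deferred cleanup per URL and returns the URLs
// in the order the cleanups ran.
func cleanupOrder(urls []string) []string {
	var cleaned []string
	func() {
		for _, u := range urls {
			// u is passed as an argument, so its value is captured at the
			// time the defer statement runs, not when the function executes.
			defer func(u string) {
				cleaned = append(cleaned, u)
			}(u)
		}
	}() // deferred calls run here, in last-in-first-out order
	return cleaned
}

func main() {
	fmt.Println(cleanupOrder([]string{"http://a:2379", "http://b:2379"}))
	// prints "[http://b:2379 http://a:2379]"
}
```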
c1c681adc3 server: Added config parameter experimental-warning-apply-duration 2020-11-17 17:33:19 -05:00
aaf423e962 server: Update imports.
find -name '*.go' | xargs sed -i --follow-symlinks 's|etcd/v3/|etcd/server/v3/|g'
2020-10-26 13:02:32 +01:00
4a5e9d1261 server: Move server files to 'server' directory.
git mv mvcc wal auth etcdserver etcdmain proxy embed/ lease/ server
git mv go.mod go.sum server
2020-10-26 12:57:19 +01:00