Compare commits

...

35 Commits

Author SHA1 Message Date
90a831e5b8 build(deps): bump google.golang.org/grpc from 1.59.0 to 1.60.0
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.59.0 to 1.60.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.59.0...v1.60.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-18 17:19:59 +00:00
f7be2dfa17 Merge pull request #16888 from greenmoon55/greenmoon55-patch-1
tests: add comments for clientv3test.TestWatchResumeInitRev
2023-12-16 20:35:43 +00:00
4e986363a3 Merge pull request #16822 from ahrtr/revoke_20231024
Ignore old leader's leases revoking request
2023-12-15 18:44:03 +00:00
f7ff898fd6 Resovle some review comments
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-15 17:53:36 +00:00
67f17166bf Safeguard lease operations by double checking the leadership
1. ignore old leader's leases revoking request
2. double check current member's leadership before perform lease renew request
3. etcdserve: ensure current member's leadership before performing lease checkpoint request

Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-15 17:53:36 +00:00
f7e488dc92 Add e2e test cases to reproduce the lease revoke issue
Refer to https://github.com/etcd-io/etcd/issues/15247

Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-15 17:53:36 +00:00
2f03bbc5dd Merge pull request #17125 from ahrtr/TestMemberReplace_20231215
Update test case TestMemberReplace to always connect to stable endpoints
2023-12-15 16:17:04 +00:00
9590a02f94 update test case TestMemberReplace to always connect to stable endpoints
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-15 12:38:27 +00:00
51bd8bd4e8 Merge pull request #17107 from redwrasse/redwrasse/return-early-expectfun
testutils: return early instead of first breaking in LogObserver.Expe…
2023-12-15 09:53:40 +00:00
5c476cc9e9 Merge pull request #17115 from ivanvc/update-3.4-changelog-with-ssrf-fix
changelog: update 3.4 changelog to include ssrf fix
2023-12-15 09:51:06 +00:00
2cf112f3b9 Merge pull request #17117 from silves-xiang/main
etcdclient: Fix memory leak caused by for + time.After
2023-12-14 15:17:58 +00:00
a70fa9b471 Merge pull request #17113 from ahrtr/etcd_log_20231213
Added some log messages for better diagnosis
2023-12-14 12:50:49 +00:00
ed01ee1e5e etcdclient: Fix memory leak caused by for + time.After
Signed-off-by: silves-xiang <xiang20010326@sina.com>
2023-12-14 10:05:51 +08:00
616c5a47de changelog: update 3.4 changelog to include ssrf fix
Signed-off-by: Ivan Valdes <ivan@vald.es>
2023-12-13 15:16:17 -08:00
36b2523669 added some log messages for better diagnosis
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-13 18:43:22 +00:00
dfdffe48f9 Merge pull request #17102 from ahrtr/actuated_badge_20231212
Add actuated badge
2023-12-13 14:32:10 +00:00
68565c5ed7 Merge pull request #17108 from redwrasse/redwrasse/contrib-readmes
contrib: add missing lock and mixin readme descriptions
2023-12-13 10:05:41 +01:00
c89ee6f120 contrib: add missing lock and mixin readme descriptions
Signed-off-by: redwrasse <mail@redwrasse.io>
2023-12-12 23:48:07 -08:00
8a7596304a testutils: return early instead of first breaking in LogObserver.ExpectFunc
Signed-off-by: redwrasse <mail@redwrasse.io>
2023-12-12 22:24:00 -08:00
d298130eb0 Merge pull request #17104 from jmhbnz/update-depdenencies
[2023-12-13] Bump dependencies identified by dependabot
2023-12-12 20:26:15 +00:00
ef0a2903ce depdendency: bump github.com/prometheus/client_model from 0.4.1-0.20230718164431-9a2bf3000d16 to 0.5.0.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-12-13 08:40:24 +13:00
dc76bf4af2 Add actuated badge
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-12 15:23:54 +00:00
e3324dd128 Merge pull request #17079 from ZhouJianMS/member-replace
Add member replace e2e test
2023-12-12 10:13:11 +00:00
c230928240 Merge pull request #17089 from fykaa/main
Adjusted RAM Requirements for arm64 Workflows
2023-12-12 10:08:25 +00:00
1bfed3a0b6 Merge pull request #17091 from etcd-io/dependabot/github_actions/github/codeql-action-2.22.9
build(deps): bump github/codeql-action from 2.22.8 to 2.22.9
2023-12-12 09:42:17 +00:00
0e8b9b2ef2 Adjusted RAM Requirements for arm64 Workflows
Signed-off-by: fykaa <faeka6@gmail.com>
2023-12-12 14:25:42 +05:30
54822c47e9 member replace e2e test
Signed-off-by: ZhouJianMS <zhoujian@microsoft.com>
2023-12-12 15:22:22 +08:00
f0d85826c9 Merge pull request #17090 from etcd-io/dependabot/github_actions/actions/setup-go-5.0.0
build(deps): bump actions/setup-go from 4.1.0 to 5.0.0
2023-12-11 19:45:08 +00:00
9ffd2d51f3 Merge pull request #17088 from ahrtr/gofail_20231211
Install gofail in module-aware mode and ignore go.mod file
2023-12-11 19:10:50 +00:00
fe61388dcf tuned memory allocation for arm64 workflows of e2e and tests-template yaml file
Signed-off-by: fykaa <faeka6@gmail.com>
2023-12-11 23:20:32 +05:30
365a3cc7d1 build(deps): bump actions/setup-go from 4.1.0 to 5.0.0
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 4.1.0 to 5.0.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](93397bea11...0c52d547c9)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-11 17:06:02 +00:00
3ab54f720f install gofail in module-aware mode and ignore go.mod file
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-11 12:37:05 +00:00
4c853774e6 Rename the test and update comments
Signed-off-by: Jin Dong <greenmoon55@gmail.com>
2023-11-28 02:32:50 +00:00
f2d718e641 Merge branch 'etcd-io:main' into greenmoon55-patch-1 2023-11-27 21:14:59 -05:00
00ce0116c5 tests: add comments for clientv3test.TestWatchResumeInitRev
Signed-off-by: Jin Dong <greenmoon55@gmail.com>
2023-11-08 00:04:48 -06:00
36 changed files with 407 additions and 58 deletions

View File

@ -23,7 +23,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:

View File

@ -9,7 +9,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- run: |

View File

@ -14,7 +14,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:

View File

@ -6,7 +6,7 @@ jobs:
test:
# this is to prevent the job to run at forked projects
if: github.repository == 'etcd-io/etcd'
runs-on: actuated-arm64-8cpu-32gb
runs-on: actuated-arm64-8cpu-8gb
strategy:
fail-fast: false
matrix:
@ -18,7 +18,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:

View File

@ -15,7 +15,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:

View File

@ -13,7 +13,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- run: |

View File

@ -9,7 +9,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- run: date

View File

@ -15,7 +15,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:

View File

@ -9,7 +9,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: release

View File

@ -27,7 +27,7 @@ jobs:
count: 80
testTimeout: 200m
artifactName: main-arm64
runs-on: "['actuated-arm64-8cpu-32gb']"
runs-on: "['actuated-arm64-8cpu-8gb']"
release-35:
uses: ./.github/workflows/robustness-template.yaml
with:
@ -43,7 +43,7 @@ jobs:
count: 100
testTimeout: 200m
artifactName: release-35-arm64
runs-on: "['actuated-arm64-8cpu-32gb']"
runs-on: "['actuated-arm64-8cpu-8gb']"
release-34:
uses: ./.github/workflows/robustness-template.yaml
with:

View File

@ -32,7 +32,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: test-robustness

View File

@ -18,4 +18,4 @@ jobs:
count: 12
testTimeout: 30m
artifactName: main-arm64
runs-on: "['actuated-arm64-8cpu-32gb']"
runs-on: "['actuated-arm64-8cpu-8gb']"

View File

@ -9,7 +9,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- id: golangci_lint_version

View File

@ -31,7 +31,7 @@ jobs:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@93397bea11091df50f3d7e59dc26a7711a8bcfbe # v4.1.0
- uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:

View File

@ -12,4 +12,4 @@ jobs:
uses: ./.github/workflows/tests-template.yaml
with:
arch: arm64
runs-on: actuated-arm64-8cpu-32gb
runs-on: actuated-arm64-8cpu-8gb

View File

@ -4,6 +4,9 @@ Previous change logs can be found at [CHANGELOG-3.3](https://github.com/etcd-io/
## v3.4.29 (tbd)
### etcd server
- [Disable following HTTP redirects in peer communication](https://github.com/etcd-io/etcd/pull/17112)
### Dependencies
- Compile binaries using go [1.20.12](https://github.com/etcd-io/etcd/pull/17076).

View File

@ -9,6 +9,7 @@
[![Releases](https://img.shields.io/github/release/etcd-io/etcd/all.svg?style=flat-square)](https://github.com/etcd-io/etcd/releases)
[![LICENSE](https://img.shields.io/github/license/etcd-io/etcd.svg?style=flat-square)](https://github.com/etcd-io/etcd/blob/main/LICENSE)
[![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/etcd-io/etcd/badge)](https://api.securityscorecards.dev/projects/github.com/etcd-io/etcd)
<a href="https://actuated.dev/"><img alt="Arm CI sponsored by Actuated" src="https://docs.actuated.dev/images/actuated-badge.png" width="120px"></img></a>
**Note**: The `main` branch may be in an *unstable or even broken state* during development. For stable versions, see [releases][github-release].

View File

@ -549,9 +549,12 @@ func (l *lessor) recvKeepAlive(resp *pb.LeaseKeepAliveResponse) {
// deadlineLoop reaps any keep alive channels that have not received a response
// within the lease TTL
func (l *lessor) deadlineLoop() {
timer := time.NewTimer(time.Second)
defer timer.Stop()
for {
timer.Reset(time.Second)
select {
case <-time.After(time.Second):
case <-timer.C:
case <-l.donec:
return
}

View File

@ -2,6 +2,8 @@
Scripts and files which may be useful but aren't part of the core etcd project.
* [systemd](systemd) - an example unit file for deploying etcd on systemd-based distributions
* [lock](lock) - example addressing the expired lease problem of distributed locking with etcd
* [mixin](mixin) - customisable set of Grafana dashboard and Prometheus alerts for etcd
* [raftexample](raftexample) - an example distributed key-value store using raft
* [systemd](systemd) - an example unit file for deploying etcd on systemd-based distributions
* [systemd/etcd3-multinode](systemd/etcd3-multinode) - multi-node cluster setup with systemd

2
go.mod
View File

@ -34,7 +34,7 @@ require (
go.etcd.io/raft/v3 v3.0.0-20231012085229-7c3ed830bbb0
go.uber.org/zap v1.26.0
golang.org/x/time v0.5.0
google.golang.org/grpc v1.59.0
google.golang.org/grpc v1.60.0
google.golang.org/protobuf v1.31.0
)

8
go.sum
View File

@ -236,8 +236,8 @@ golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8T
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM=
google.golang.org/appengine v1.4.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4=
google.golang.org/appengine v1.6.7 h1:FZR1q0exgwxzPzp/aF+VccGrSfxfPpkBqjIIEq3ru6c=
google.golang.org/appengine v1.6.7/go.mod h1:8WjMMxjGQR8xUklV/ARdw2HLXBOI7O7uCIDZVag1xfc=
google.golang.org/appengine v1.6.8 h1:IhEN5q69dyKagZPYMSdIjS2HqprW324FRQZJcGqPAsM=
google.golang.org/appengine v1.6.8/go.mod h1:1jJ3jBArFh5pcgW8gCtRJnepW8FzD1V44FJffLiz/Ds=
google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc=
google.golang.org/genproto v0.0.0-20190819201941-24fa4b261c55/go.mod h1:DMBHOl98Agz4BDEuKkezgsaosCRResVns1a3J2ZsMNc=
google.golang.org/genproto v0.0.0-20200423170343-7949de9c1215/go.mod h1:55QSHmfGQM9UVYDPBsyGGes0y52j32PQ3BqQfXhyH3c=
@ -252,8 +252,8 @@ google.golang.org/grpc v1.23.0/go.mod h1:Y5yQAOtifL1yxbo5wqy6BxZv8vAUGQwXBOALyac
google.golang.org/grpc v1.25.1/go.mod h1:c3i+UQWmh7LiEpx4sFZnkU36qjEYZ0imhYfXVyQciAY=
google.golang.org/grpc v1.27.0/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk=
google.golang.org/grpc v1.29.1/go.mod h1:itym6AZVZYACWQqET3MqgPpjcuV5QH3BxFS3IjizoKk=
google.golang.org/grpc v1.59.0 h1:Z5Iec2pjwb+LEOqzpB2MR12/eKFhDPhuqW91O+4bwUk=
google.golang.org/grpc v1.59.0/go.mod h1:aUPDwccQo6OTjy7Hct4AfBPD1GptF4fyUjIkQ9YtF98=
google.golang.org/grpc v1.60.0 h1:6FQAR0kM31P6MRdeluor2w2gPaS4SVNrD/DNTxrQ15k=
google.golang.org/grpc v1.60.0/go.mod h1:OlCHIeLYqSSsLi6i49B5QGdzaMZK9+M7LXN2FKz4eGM=
google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw=
google.golang.org/protobuf v1.26.0/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc=
google.golang.org/protobuf v1.31.0 h1:g0LDEJHgrBl9N9r17Ru3sqWhkIx2NB67okBHPwC7hs8=

View File

@ -90,18 +90,22 @@ func bootstrap(cfg config.ServerConfig) (b *bootstrappedServer, err error) {
bwal = bootstrapWALFromSnapshot(cfg, backend.snapshot)
}
cfg.Logger.Info("bootstrapping cluster")
cluster, err := bootstrapCluster(cfg, bwal, prt)
if err != nil {
backend.Close()
return nil, err
}
cfg.Logger.Info("bootstrapping storage")
s := bootstrapStorage(cfg, st, backend, bwal, cluster)
if err = cluster.Finalize(cfg, s); err != nil {
backend.Close()
return nil, err
}
cfg.Logger.Info("bootstrapping raft")
raft := bootstrapRaft(cfg, cluster, s.wal)
return &bootstrappedServer{
prt: prt,

View File

@ -298,8 +298,10 @@ type EtcdServer struct {
func NewServer(cfg config.ServerConfig) (srv *EtcdServer, err error) {
b, err := bootstrap(cfg)
if err != nil {
cfg.Logger.Error("bootstrap failed", zap.Error(err))
return nil, err
}
cfg.Logger.Info("bootstrap successfully")
defer func() {
if err != nil {
@ -392,8 +394,15 @@ func NewServer(cfg config.ServerConfig) (srv *EtcdServer, err error) {
if srv.Cfg.EnableLeaseCheckpoint {
// setting checkpointer enables lease checkpoint feature.
srv.lessor.SetCheckpointer(func(ctx context.Context, cp *pb.LeaseCheckpointRequest) {
srv.lessor.SetCheckpointer(func(ctx context.Context, cp *pb.LeaseCheckpointRequest) error {
if !srv.ensureLeadership() {
srv.lg.Warn("Ignore the checkpoint request because current member isn't a leader",
zap.Uint64("local-member-id", uint64(srv.MemberId())))
return lease.ErrNotPrimary
}
srv.raftRequestOnce(ctx, pb.InternalRaftRequest{LeaseCheckpoint: cp})
return nil
})
}
@ -842,7 +851,19 @@ func (s *EtcdServer) run() {
func (s *EtcdServer) revokeExpiredLeases(leases []*lease.Lease) {
s.GoAttach(func() {
// We shouldn't revoke any leases if current member isn't a leader,
// because the operation should only be performed by the leader. When
// the leader gets blocked on the raft loop, such as writing WAL entries,
// it can't process any events or messages from raft. It may think it
// is still the leader even the leader has already changed.
// Refer to https://github.com/etcd-io/etcd/issues/15247
lg := s.Logger()
if !s.ensureLeadership() {
lg.Warn("Ignore the lease revoking request because current member isn't a leader",
zap.Uint64("local-member-id", uint64(s.MemberId())))
return
}
// Increases throughput of expired leases deletion process through parallelization
c := make(chan struct{}, maxPendingRevokes)
for _, curLease := range leases {
@ -875,6 +896,29 @@ func (s *EtcdServer) revokeExpiredLeases(leases []*lease.Lease) {
})
}
// ensureLeadership checks whether current member is still the leader.
func (s *EtcdServer) ensureLeadership() bool {
lg := s.Logger()
ctx, cancel := context.WithTimeout(s.ctx, s.Cfg.ReqTimeout())
defer cancel()
if err := s.linearizableReadNotify(ctx); err != nil {
lg.Warn("Failed to check current member's leadership",
zap.Error(err))
return false
}
newLeaderId := s.raftStatus().Lead
if newLeaderId != uint64(s.MemberId()) {
lg.Warn("Current member isn't a leader",
zap.Uint64("local-member-id", uint64(s.MemberId())),
zap.Uint64("new-lead", newLeaderId))
return false
}
return true
}
// Cleanup removes allocated objects by EtcdServer.NewServer in
// situation that EtcdServer::Start was not called (that takes care of cleanup).
func (s *EtcdServer) Cleanup() {
@ -1975,7 +2019,9 @@ func removeNeedlessRangeReqs(txn *pb.TxnRequest) {
// applyConfChange applies a ConfChange to the server. It is only
// invoked with a ConfChange that has already passed through Raft
func (s *EtcdServer) applyConfChange(cc raftpb.ConfChange, confState *raftpb.ConfState, shouldApplyV3 membership.ShouldApplyV3) (bool, error) {
lg := s.Logger()
if err := s.cluster.ValidateConfigurationChange(cc); err != nil {
lg.Error("Validation on configuration change failed", zap.Bool("shouldApplyV3", bool(shouldApplyV3)), zap.Error(err))
cc.NodeID = raft.None
s.r.ApplyConfChange(cc)
@ -1988,7 +2034,6 @@ func (s *EtcdServer) applyConfChange(cc raftpb.ConfChange, confState *raftpb.Con
return false, err
}
lg := s.Logger()
*confState = *s.r.ApplyConfChange(cc)
s.beHooks.SetConfState(confState)
switch cc.Type {

View File

@ -278,6 +278,16 @@ func (s *EtcdServer) LeaseRevoke(ctx context.Context, r *pb.LeaseRevokeRequest)
func (s *EtcdServer) LeaseRenew(ctx context.Context, id lease.LeaseID) (int64, error) {
if s.isLeader() {
// If s.isLeader() returns true, but we fail to ensure the current
// member's leadership, there are a couple of possibilities:
// 1. current member gets stuck on writing WAL entries;
// 2. current member is in network isolation status;
// 3. current member isn't a leader anymore (possibly due to #1 above).
// In such case, we just return error to client, so that the client can
// switch to another member to continue the lease keep-alive operation.
if !s.ensureLeadership() {
return -1, lease.ErrNotPrimary
}
if err := s.waitAppliedIndex(); err != nil {
return 0, err
}

View File

@ -75,7 +75,7 @@ type RangeDeleter func() TxnDelete
// Checkpointer permits checkpointing of lease remaining TTLs to the consensus log. Defined here to
// avoid circular dependency with mvcc.
type Checkpointer func(ctx context.Context, lc *pb.LeaseCheckpointRequest)
type Checkpointer func(ctx context.Context, lc *pb.LeaseCheckpointRequest) error
type LeaseID int64
@ -422,7 +422,9 @@ func (le *lessor) Renew(id LeaseID) (int64, error) {
// By applying a RAFT entry only when the remainingTTL is already set, we limit the number
// of RAFT entries written per lease to a max of 2 per checkpoint interval.
if clearRemainingTTL {
le.cp(context.Background(), &pb.LeaseCheckpointRequest{Checkpoints: []*pb.LeaseCheckpoint{{ID: int64(l.ID), Remaining_TTL: 0}}})
if err := le.cp(context.Background(), &pb.LeaseCheckpointRequest{Checkpoints: []*pb.LeaseCheckpoint{{ID: int64(l.ID), Remaining_TTL: 0}}}); err != nil {
return -1, err
}
}
le.mu.Lock()
@ -656,7 +658,9 @@ func (le *lessor) checkpointScheduledLeases() {
le.mu.Unlock()
if len(cps) != 0 {
le.cp(context.Background(), &pb.LeaseCheckpointRequest{Checkpoints: cps})
if err := le.cp(context.Background(), &pb.LeaseCheckpointRequest{Checkpoints: cps}); err != nil {
return
}
}
if len(cps) < maxLeaseCheckpointBatchSize {
return

View File

@ -269,10 +269,11 @@ func TestLessorRenewWithCheckpointer(t *testing.T) {
defer os.RemoveAll(dir)
le := newLessor(lg, be, clusterLatest(), LessorConfig{MinLeaseTTL: minLeaseTTL})
fakerCheckerpointer := func(ctx context.Context, cp *pb.LeaseCheckpointRequest) {
fakerCheckerpointer := func(ctx context.Context, cp *pb.LeaseCheckpointRequest) error {
for _, cp := range cp.GetCheckpoints() {
le.Checkpoint(LeaseID(cp.GetID()), cp.GetRemaining_TTL())
}
return nil
}
defer le.Stop()
// Set checkpointer
@ -556,7 +557,7 @@ func TestLessorCheckpointScheduling(t *testing.T) {
defer le.Stop()
le.minLeaseTTL = 1
checkpointedC := make(chan struct{})
le.SetCheckpointer(func(ctx context.Context, lc *pb.LeaseCheckpointRequest) {
le.SetCheckpointer(func(ctx context.Context, lc *pb.LeaseCheckpointRequest) error {
close(checkpointedC)
if len(lc.Checkpoints) != 1 {
t.Errorf("expected 1 checkpoint but got %d", len(lc.Checkpoints))
@ -565,6 +566,7 @@ func TestLessorCheckpointScheduling(t *testing.T) {
if c.Remaining_TTL != 1 {
t.Errorf("expected checkpoint to be called with Remaining_TTL=%d but got %d", 1, c.Remaining_TTL)
}
return nil
})
_, err := le.Grant(1, 2)
if err != nil {

View File

@ -218,15 +218,10 @@ func TestPeriodicCheckDetectsCorruption(t *testing.T) {
assert.NoError(t, err, "error on put")
}
members, err := cc.MemberList(ctx, false)
memberID, found, err := getMemberIdByName(ctx, cc, epc.Procs[0].Config().Name)
assert.NoError(t, err, "error on member list")
var memberID uint64
for _, m := range members.Members {
if m.Name == epc.Procs[0].Config().Name {
memberID = m.ID
}
}
assert.NotZero(t, memberID, "member not found")
assert.Equal(t, found, true, "member not found")
epc.Procs[0].Stop()
err = testutil.CorruptBBolt(datadir.ToBackendFileName(epc.Procs[0].Config().DataDirPath))
assert.NoError(t, err)
@ -263,14 +258,9 @@ func TestCompactHashCheckDetectCorruption(t *testing.T) {
err = cc.Put(ctx, testutil.PickKey(int64(i)), fmt.Sprint(i), config.PutOptions{})
assert.NoError(t, err, "error on put")
}
members, err := cc.MemberList(ctx, false)
memberID, found, err := getMemberIdByName(ctx, cc, epc.Procs[0].Config().Name)
assert.NoError(t, err, "error on member list")
var memberID uint64
for _, m := range members.Members {
if m.Name == epc.Procs[0].Config().Name {
memberID = m.ID
}
}
assert.Equal(t, found, true, "member not found")
epc.Procs[0].Stop()
err = testutil.CorruptBBolt(datadir.ToBackendFileName(epc.Procs[0].Config().DataDirPath))

View File

@ -0,0 +1,99 @@
// Copyright 2023 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//go:build !cluster_proxy
package e2e
import (
"context"
"math/rand"
"os"
"strings"
"testing"
"time"
"github.com/stretchr/testify/require"
"go.etcd.io/etcd/server/v3/etcdserver"
"go.etcd.io/etcd/tests/v3/framework/e2e"
"go.etcd.io/etcd/tests/v3/framework/testutils"
)
func TestMemberReplace(t *testing.T) {
e2e.BeforeTest(t)
ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
defer cancel()
epc, err := e2e.NewEtcdProcessCluster(ctx, t)
require.NoError(t, err)
defer epc.Close()
memberIdx := rand.Int() % len(epc.Procs)
member := epc.Procs[memberIdx]
memberName := member.Config().Name
var endpoints []string
for i := 1; i < len(epc.Procs); i++ {
endpoints = append(endpoints, epc.Procs[(memberIdx+i)%len(epc.Procs)].EndpointsGRPC()...)
}
cc, err := e2e.NewEtcdctl(epc.Cfg.Client, endpoints)
require.NoError(t, err)
memberID, found, err := getMemberIdByName(ctx, cc, memberName)
require.NoError(t, err)
require.Equal(t, found, true, "Member not found")
// Need to wait health interval for cluster to accept member changes
time.Sleep(etcdserver.HealthInterval)
t.Logf("Removing member %s", memberName)
_, err = cc.MemberRemove(ctx, memberID)
require.NoError(t, err)
_, found, err = getMemberIdByName(ctx, cc, memberName)
require.NoError(t, err)
require.Equal(t, found, false, "Expected member to be removed")
for member.IsRunning() {
err = member.Wait(ctx)
if err != nil && !strings.Contains(err.Error(), "unexpected exit code") {
t.Fatalf("member didn't exit as expected: %v", err)
}
}
t.Logf("Removing member %s data", memberName)
err = os.RemoveAll(member.Config().DataDirPath)
require.NoError(t, err)
t.Logf("Adding member %s back", memberName)
removedMemberPeerUrl := member.Config().PeerURL.String()
_, err = cc.MemberAdd(ctx, memberName, []string{removedMemberPeerUrl})
require.NoError(t, err)
err = patchArgs(member.Config().Args, "initial-cluster-state", "existing")
require.NoError(t, err)
// Sleep 100ms to bypass the known issue https://github.com/etcd-io/etcd/issues/16687.
time.Sleep(100 * time.Millisecond)
t.Logf("Starting member %s", memberName)
err = member.Start(ctx)
require.NoError(t, err)
testutils.ExecuteUntil(ctx, t, func() {
for {
_, found, err := getMemberIdByName(ctx, cc, memberName)
if err != nil || !found {
time.Sleep(10 * time.Millisecond)
continue
}
break
}
})
}

View File

@ -122,3 +122,26 @@ func runCommandAndReadJsonOutput(args []string) (map[string]any, error) {
return resp, nil
}
func getMemberIdByName(ctx context.Context, c *e2e.EtcdctlV3, name string) (id uint64, found bool, err error) {
resp, err := c.MemberList(ctx, false)
if err != nil {
return 0, false, err
}
for _, member := range resp.Members {
if name == member.Name {
return member.ID, true, nil
}
}
return 0, false, nil
}
func patchArgs(args []string, flag, newValue string) error {
for i, arg := range args {
if strings.Contains(arg, flag) {
args[i] = fmt.Sprintf("--%s=%s", flag, newValue)
return nil
}
}
return fmt.Errorf("--%s flag not found", flag)
}

View File

@ -0,0 +1,157 @@
// Copyright 2023 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//go:build !cluster_proxy
package e2e
import (
"context"
"testing"
"time"
"github.com/stretchr/testify/require"
clientv3 "go.etcd.io/etcd/client/v3"
"go.etcd.io/etcd/tests/v3/framework/e2e"
"go.etcd.io/etcd/tests/v3/framework/testutils"
)
// TestLeaseRevoke_IgnoreOldLeader verifies that leases shouldn't be revoked
// by old leader.
// See the case 1 in https://github.com/etcd-io/etcd/issues/15247#issuecomment-1777862093.
func TestLeaseRevoke_IgnoreOldLeader(t *testing.T) {
testLeaseRevokeIssue(t, true)
}
// TestLeaseRevoke_ClientSwitchToOtherMember verifies that leases shouldn't
// be revoked by new leader.
// See the case 2 in https://github.com/etcd-io/etcd/issues/15247#issuecomment-1777862093.
func TestLeaseRevoke_ClientSwitchToOtherMember(t *testing.T) {
testLeaseRevokeIssue(t, false)
}
func testLeaseRevokeIssue(t *testing.T, connectToOneFollower bool) {
e2e.BeforeTest(t)
ctx := context.Background()
t.Log("Starting a new etcd cluster")
epc, err := e2e.NewEtcdProcessCluster(ctx, t,
e2e.WithClusterSize(3),
e2e.WithGoFailEnabled(true),
e2e.WithGoFailClientTimeout(40*time.Second),
)
require.NoError(t, err)
defer func() {
if errC := epc.Close(); errC != nil {
t.Fatalf("error closing etcd processes (%v)", errC)
}
}()
leaderIdx := epc.WaitLeader(t)
t.Logf("Leader index: %d", leaderIdx)
epsForNormalOperations := epc.Procs[(leaderIdx+2)%3].EndpointsGRPC()
t.Logf("Creating a client for normal operations: %v", epsForNormalOperations)
client, err := clientv3.New(clientv3.Config{Endpoints: epsForNormalOperations, DialTimeout: 3 * time.Second})
require.NoError(t, err)
defer client.Close()
var epsForLeaseKeepAlive []string
if connectToOneFollower {
epsForLeaseKeepAlive = epc.Procs[(leaderIdx+1)%3].EndpointsGRPC()
} else {
epsForLeaseKeepAlive = epc.EndpointsGRPC()
}
t.Logf("Creating a client for the leaseKeepAlive operation: %v", epsForLeaseKeepAlive)
clientForKeepAlive, err := clientv3.New(clientv3.Config{Endpoints: epsForLeaseKeepAlive, DialTimeout: 3 * time.Second})
require.NoError(t, err)
defer clientForKeepAlive.Close()
resp, err := client.Status(ctx, epsForNormalOperations[0])
require.NoError(t, err)
oldLeaderId := resp.Leader
t.Log("Creating a new lease")
leaseRsp, err := client.Grant(ctx, 20)
require.NoError(t, err)
t.Log("Starting a goroutine to keep alive the lease")
doneC := make(chan struct{})
stopC := make(chan struct{})
startC := make(chan struct{}, 1)
go func() {
defer close(doneC)
respC, kerr := clientForKeepAlive.KeepAlive(ctx, leaseRsp.ID)
require.NoError(t, kerr)
// ensure we have received the first response from the server
<-respC
startC <- struct{}{}
for {
select {
case <-stopC:
return
case <-respC:
}
}
}()
t.Log("Wait for the keepAlive goroutine to get started")
<-startC
t.Log("Trigger the failpoint to simulate stalled writing")
err = epc.Procs[leaderIdx].Failpoints().SetupHTTP(ctx, "raftBeforeSave", `sleep("30s")`)
require.NoError(t, err)
cctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
t.Logf("Waiting for a new leader to be elected, old leader index: %d, old leader ID: %d", leaderIdx, oldLeaderId)
testutils.ExecuteUntil(cctx, t, func() {
for {
resp, err = client.Status(ctx, epsForNormalOperations[0])
if err == nil && resp.Leader != oldLeaderId {
t.Logf("A new leader has already been elected, new leader index: %d", resp.Leader)
return
}
time.Sleep(100 * time.Millisecond)
}
})
cancel()
t.Log("Writing a key/value pair")
_, err = client.Put(ctx, "foo", "bar")
require.NoError(t, err)
t.Log("Sleeping 30 seconds")
time.Sleep(30 * time.Second)
t.Log("Remove the failpoint 'raftBeforeSave'")
err = epc.Procs[leaderIdx].Failpoints().DeactivateHTTP(ctx, "raftBeforeSave")
require.NoError(t, err)
// By default, etcd tries to revoke leases every 7 seconds.
t.Log("Sleeping 10 seconds")
time.Sleep(10 * time.Second)
t.Log("Confirming the lease isn't revoked")
leases, err := client.Leases(ctx)
require.NoError(t, err)
require.Equal(t, 1, len(leases.Leases))
t.Log("Waiting for the keepAlive goroutine to exit")
close(stopC)
<-doneC
}

View File

@ -79,13 +79,10 @@ func (logOb *LogObserver) ExpectFunc(ctx context.Context, filter func(string) bo
}
if len(res) >= count {
break
return res, nil
}
}
if len(res) >= count {
return res, nil
}
time.Sleep(10 * time.Millisecond)
}
}

View File

@ -350,7 +350,9 @@ func putAndWatch(t *testing.T, wctx *watchctx, key, val string) {
}
}
func TestWatchResumeInitRev(t *testing.T) {
// TestWatchResumeAfterDisconnect tests watch resume after member disconnects then connects.
// It ensures that correct events are returned corresponding to the start revision.
func TestWatchResumeAfterDisconnect(t *testing.T) {
integration2.BeforeTest(t)
clus := integration2.NewCluster(t, &integration2.ClusterConfig{Size: 1, UseBridge: true})
defer clus.Terminate(t)
@ -367,7 +369,10 @@ func TestWatchResumeInitRev(t *testing.T) {
t.Fatal(err)
}
// watch from revision 1
wch := clus.Client(0).Watch(context.Background(), "a", clientv3.WithRev(1), clientv3.WithCreatedNotify())
// response for the create watch request, no events are in this response
// the current revision of etcd should be 4
if resp, ok := <-wch; !ok || resp.Header.Revision != 4 {
t.Fatalf("got (%v, %v), expected create notification rev=4", resp, ok)
}
@ -389,12 +394,16 @@ func TestWatchResumeInitRev(t *testing.T) {
if !ok {
t.Fatal("unexpected watch close")
}
if len(resp.Events) == 0 {
t.Fatal("expected event on watch")
// Events should be put(a, 3) and put(a, 4)
if len(resp.Events) != 2 {
t.Fatal("expected two events on watch")
}
if string(resp.Events[0].Kv.Value) != "3" {
t.Fatalf("expected value=3, got event %+v", resp.Events[0])
}
if string(resp.Events[1].Kv.Value) != "4" {
t.Fatalf("expected value=4, got event %+v", resp.Events[1])
}
case <-time.After(5 * time.Second):
t.Fatal("watch timed out")
}

View File

@ -747,7 +747,7 @@ func TestV3LeaseFailover(t *testing.T) {
// send keep alive to old leader until the old leader starts
// to drop lease request.
var expectedExp time.Time
expectedExp := time.Now().Add(5 * time.Second)
for {
if err = lac.Send(lreq); err != nil {
break

View File

@ -52,7 +52,7 @@ gofail-disable: install-gofail
.PHONY: install-gofail
install-gofail:
cd tools/mod; go install go.etcd.io/gofail@${GOFAIL_VERSION}
go install go.etcd.io/gofail@${GOFAIL_VERSION}
# Build previous releases for robustness tests

View File

@ -158,7 +158,7 @@ require (
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/polyfloyd/go-errorlint v1.4.5 // indirect
github.com/prometheus/client_golang v1.17.0 // indirect
github.com/prometheus/client_model v0.4.1-0.20230718164431-9a2bf3000d16 // indirect
github.com/prometheus/client_model v0.5.0 // indirect
github.com/prometheus/common v0.44.0 // indirect
github.com/prometheus/procfs v0.11.1 // indirect
github.com/quasilyte/go-ruleguard v0.4.0 // indirect

View File

@ -532,8 +532,8 @@ github.com/prometheus/client_model v0.0.0-20180712105110-5c3871d89910/go.mod h1:
github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
github.com/prometheus/client_model v0.2.0/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
github.com/prometheus/client_model v0.4.1-0.20230718164431-9a2bf3000d16 h1:v7DLqVdK4VrYkVD5diGdl4sxJurKJEMnODWRJlxV9oM=
github.com/prometheus/client_model v0.4.1-0.20230718164431-9a2bf3000d16/go.mod h1:oMQmHW1/JoDwqLtg57MGgP/Fb1CJEYF2imWWhWtMkYU=
github.com/prometheus/client_model v0.5.0 h1:VQw1hfvPvk3Uv6Qf29VrPF32JB6rtbgI6cYPYQjL0Qw=
github.com/prometheus/client_model v0.5.0/go.mod h1:dTiFglRmd66nLR9Pv9f0mZi7B7fk5Pm3gvsjB5tr+kI=
github.com/prometheus/common v0.4.1/go.mod h1:TNfzLD0ON7rHzMJeJkieUDPYmFC7Snx/y86RQel1bk4=
github.com/prometheus/common v0.10.0/go.mod h1:Tlit/dnDKsSWFlCLTWaA1cyBgKHSMdTB80sz/V91rCo=
github.com/prometheus/common v0.26.0/go.mod h1:M7rCNAaPfAosfx8veZJCuw84e35h3Cfd9VFqTh1DIvc=