324 Commits

Author SHA1 Message Date
2de17bd396 deflake: TestDowngradeCancellationAfterDowngrading1InClusterOf3
Fixes: 65159a2b96 (*: Update cases related to Downgrade)

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-02-22 08:22:17 -05:00
65159a2b96 *: Update cases related to Downgrade
1. Update DowngradeUpgradeMembersByID

During a downgrade, the desired cluster version should be the target
version.
During an upgrade, the desired cluster version should be determined by
the minimum binary version among the members.

2. Remove AssertProcessLogs from DowngradeEnable

The log message "The server is ready to downgrade" appears only when the storage
version monitor detects a mismatch between the cluster and storage versions.

If traffic is insufficient to trigger a commit or if an auto-commit occurs right
after reading the storage version, the monitor may fail to update it, leading
to errors like:

```bash
"msg":"failed to update storage version","cluster-version":"3.6.0",
"error":"cannot detect storage schema version: missing confstate information"
```

Given this, we should remove the AssertProcessLogs statement.

Similar to #19313

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-02-21 13:45:41 -05:00
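
The version rule from point 1 of commit 65159a2b96 can be sketched in Go as follows. This is only an illustration under stated assumptions: the `version` type and the `expectedClusterVersion` helper are hypothetical names invented for this sketch, not the code changed by the commit.

```go
package main

import "fmt"

// version is a simplified major.minor pair (hypothetical type for this sketch).
type version struct {
	major, minor int
}

// less reports whether v sorts before o.
func (v version) less(o version) bool {
	if v.major != o.major {
		return v.major < o.major
	}
	return v.minor < o.minor
}

// expectedClusterVersion returns the version the cluster is expected to
// converge to. During a downgrade it is the downgrade target. Otherwise it is
// the lowest binary version among the members. With no members listed it
// falls back to target, which is an arbitrary choice made for this sketch.
func expectedClusterVersion(downgrading bool, target version, binaries []version) version {
	if downgrading || len(binaries) == 0 {
		return target
	}
	lowest := binaries[0]
	for _, b := range binaries[1:] {
		if b.less(lowest) {
			lowest = b
		}
	}
	return lowest
}

func main() {
	// Mid-upgrade: one member still runs the older binary.
	binaries := []version{{3, 6}, {3, 5}, {3, 6}}
	fmt.Println(expectedClusterVersion(false, version{}, binaries))

	// Downgrade in progress: the target decides, whatever the binaries are.
	fmt.Println(expectedClusterVersion(true, version{3, 5}, binaries))
}
```

The point of the split is that a test waiting on the cluster version needs a different expectation in each direction: a fixed target when downgrading, a value derived from the slowest member when upgrading.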
1709422e21 Remove passing of anonymous visualize function
Signed-off-by: shashwat-jain <shashwat.jain@salesforce.com>
2025-02-12 11:26:10 +05:30
34546e9aa0 Add more debug info into waitTillSnapshot
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2025-02-07 12:14:49 +00:00
8866fce73f test: update robustness doc and new case to reproduce 19179
NOTE: It is still hard to reproduce. It may take anywhere from 5 minutes
to a few hours.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-02-06 10:24:52 -05:00
3b3243bbb9 increase timeout for MemberDowngradeUpgrade test
Signed-off-by: Gang Li <ganglica@google.com>
2025-02-05 23:58:46 +00:00
2bddcd4801 Merge pull request #19317 from siyuanfoundation/downgrade-robust-1
Remove some HealthInterval to reduce the time to run DowngradeUpgradeMembers
2025-02-03 10:22:08 +01:00
136dfbe5b5 *: introduce (*Op) Limit() interface for robustness
Since #19137, the kubernetes traffic profile has been unable to send List
requests with a page size, because the limit in the option is not accessible.
This change fixes that by introducing the Limit() interface.

Fixes: #19292

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-02-02 22:16:22 -05:00
6b0a9cd763 Remove some HealthInterval to reduce the time to run DowngradeUpgradeMembers.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2025-02-02 14:54:29 -08:00
e757a45d1d Merge pull request #19269 from fuweid/test-robustness-list
tests/robustness: continue should ignore last key
2025-01-27 12:53:25 +01:00
f98fa31fa0 Merge pull request #19255 from AwesomePatrol/add-version-to-robustness-model
Add Version field to the robustness model
2025-01-27 12:07:12 +01:00
564e362408 tests/robustness: continue should ignore last key
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-01-24 18:55:06 -05:00
e289ba3078 Merge pull request #19265 from redwrasse/redwrasse/rm-random-seed-robustness
Remove explicit random seed in robustness tests
2025-01-24 10:52:53 +01:00
43c6316420 robustness: remove explicit random seed in robustness tests.
Signed-off-by: redwrasse <mail@redwrasse.io>
2025-01-23 16:02:02 -08:00
b16b8dc6f3 migrate flag experimental-watch-progress-notify-interval to use watch-progress-notify-interval (#19248)
Signed-off-by: Gang Li <gangligit@gmail.com>
2025-01-23 21:08:30 +00:00
3dbe62fb02 Add Version field to the robustness model
It helps perform Txn before Compact in the same way
as it is done in Kubernetes.

Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
2025-01-23 14:11:23 +01:00
845a330e46 Implement Kubernetes like compaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2025-01-21 12:03:04 +01:00
5ccbeec769 Merge pull request #19218 from serathius/fix-selecting-experimental-flag
Fix passing compaction-batch-limit to etcd v3.4 and v3.5
2025-01-20 18:45:38 +01:00
8c989a1e37 Fix passing compaction-batch-limit to etcd v3.4 and v3.5
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2025-01-20 17:14:21 +01:00
10d7cea552 chore: enable early-return and superfluous-else from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

Co-authored-by: Iván Valdés Castillo <iv@nvald.es>
2025-01-18 09:44:58 +01:00
9da01a8275 Merge pull request #19196 from gangli113/main
migrate flag experimental-compaction-batch-limit to use compaction-batch-limit
2025-01-17 07:43:22 +00:00
76aadeff38 Merge pull request #19169 from jmao-dd/jmao/robusttest-18089
Add Robustness test to reproduce issue 18089
2025-01-16 16:08:07 +01:00
7590a7ebae tests: Reproduce #18089 in robustness tests
1) Use SleepBeforeSendWatchResponse failpoint to simulate slow watch
2) Decrease compact period from 200ms to 100ms to increase the probability of compacting on Delete
3) Introduce a new traffic pattern of 50/50 Put and Delete

With these three changes the `make test-robustness-issue18089` command can reproduce issue 18089.

Signed-off-by: Jiayin Mao <jiayin.mao@datadoghq.com>
2025-01-16 12:49:05 +00:00
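
The 50/50 Put and Delete pattern from commit 7590a7ebae can be pictured as a weighted choice between request types. The sketch below is a minimal illustration, not the robustness test code: the `request` type and the `totalWeight` and `pick` helpers are hypothetical names, and the convention that weights sum to 100 is taken from the comment fix in commit 5702d87501 elsewhere in this log.

```go
package main

import (
	"fmt"
	"math/rand"
)

// request is one weighted entry in a traffic profile (hypothetical type).
type request struct {
	name   string
	weight int
}

// totalWeight sums the weights of all entries.
func totalWeight(reqs []request) int {
	sum := 0
	for _, r := range reqs {
		sum += r.weight
	}
	return sum
}

// pick maps a roll in [0, totalWeight(reqs)) to an entry by walking the
// cumulative weights. It returns "" when the roll is out of range.
func pick(reqs []request, roll int) string {
	if roll < 0 {
		return ""
	}
	for _, r := range reqs {
		if roll < r.weight {
			return r.name
		}
		roll -= r.weight
	}
	return ""
}

func main() {
	// Equal weights give each request type half of the traffic on average.
	profile := []request{{"Put", 50}, {"Delete", 50}}
	total := totalWeight(profile)

	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pick(profile, rand.Intn(total))]++
	}
	fmt.Println(total, counts)
}
```

Keeping the roll as an explicit argument to `pick` makes the selection deterministic to test, while `main` supplies the randomness.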
ce4b4e533d Merge pull request #19125 from siyuanfoundation/downgrade-robust-2
add MemberDowngradeUpgrade failpoint
2025-01-15 10:30:00 +01:00
8f51613574 add MemberDowngradeUpgrade failpoint
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2025-01-15 01:06:39 +00:00
33d65fc90b TestConfigFileDeprecatedOptions
Signed-off-by: Gang Li <gangligit@gmail.com>
2025-01-14 16:26:50 -08:00
d9d60be322 tests/robustness/traffic: should use rev=0 for create
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-01-14 14:08:30 -05:00
abf65dde01 Merge pull request #19176 from jmao-dd/jmao/increase-limiter-burst
tests: use high burst value in limiter.
2025-01-13 10:20:55 +01:00
a03bdaabd2 Merge pull request #19175 from jmao-dd/jmao/fix-typo
tests: fix wrong number in comment.
2025-01-13 10:05:18 +01:00
47bd258fc8 tests: use high burst value in limiter.
Use the highest MaximalQPS of all traffic profiles as the burst; otherwise, actual traffic may be accidentally limited.
For example, if the burst is set to 200, traffic is unlikely to reach a rate higher than 200.

Signed-off-by: Jiayin Mao <jiayin.mao@datadoghq.com>
2025-01-11 21:15:32 +00:00
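
The burst choice described in commit 47bd258fc8 can be sketched as taking a maximum over the traffic profiles. In a token-bucket limiter the burst is the bucket capacity, so a burst below the fastest profile's rate caps that profile. The `profile` type, the `burstFor` helper, and the QPS values are hypothetical and chosen only for illustration; this is not the limiter code from the commit.

```go
package main

import (
	"fmt"
	"math"
)

// profile is a traffic profile with its target rate (hypothetical type).
type profile struct {
	name       string
	maximalQPS float64
}

// burstFor returns the highest maximalQPS across all profiles, rounded up to
// a whole number of requests. It returns 0 when there are no profiles.
func burstFor(profiles []profile) int {
	highest := 0.0
	for _, p := range profiles {
		if p.maximalQPS > highest {
			highest = p.maximalQPS
		}
	}
	return int(math.Ceil(highest))
}

func main() {
	// Illustrative values only.
	profiles := []profile{
		{"low", 200},
		{"high", 1000},
	}
	// A fixed burst of 200 would hold back the "high" profile. Sizing the
	// burst from the fastest profile avoids that.
	fmt.Println(burstFor(profiles))
}
```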
5702d87501 tests: fix wrong number in comment.
The expected sum of weights should be 100, not 1000.

Signed-off-by: Jiayin Mao <jiayin.mao@datadoghq.com>
2025-01-11 21:07:44 +00:00
08e4d6d9c2 robustness: only run MemberDowngrade test for high SnapshotCatchUpEntries.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2025-01-09 16:12:02 -08:00
3d562c357f Merge pull request #19137 from AwesomePatrol/use-new-interface-in-robustness-tests
Use new interface in robustness tests
2025-01-09 11:27:20 +01:00
00e5b654a1 Merge pull request #19095 from ahrtr/wal_20241221
Still return continuous WAL entries when running into ErrSliceOutOfRange
2025-01-08 14:14:19 +00:00
8335e70304 Migrate all OldTxn calls to Txn
Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
2025-01-08 11:48:26 +01:00
7c7d3ce8ae Migrate all OldGet calls to Get
Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
2025-01-08 11:48:26 +01:00
524afd20d1 Set GetOnFailure=true to better simulate k8s traffic
Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
2025-01-08 11:48:26 +01:00
3247bc42bb Make traffic robustness test use kubernetes.Interface
Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
2025-01-08 11:48:26 +01:00
747ef5f5fe Add MemberDowngrade failpoint
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-12-23 13:08:13 -08:00
152de1fa7e Still return continuous WAL entries when running into ErrSliceOutOfRange
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2024-12-21 15:33:14 +00:00
647f1621d6 fix: enable gosimple linter
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-12-03 07:32:22 +01:00
f2472d4b80 Handle non-linearized MemberList in v3.4 for robustness tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-12-02 14:32:38 +01:00
75cd1369a5 fix: enable gofumpt instead of gofmt linter in tests
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-11-28 21:08:42 +01:00
0fcf6e80b4 Merge pull request #18928 from serathius/robustness-qps-150
Reduce QPS requirement to 100
2024-11-24 23:16:46 -07:00
fe45307dd4 Merge pull request #18905 from serathius/robustness-duplicated-puts-3
Robustness duplicated puts 3
2024-11-24 11:20:29 +01:00
b85c6ba7b4 Reduce QPS requirement to 100
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-11-20 20:49:56 +01:00
14f4df4019 Merge pull request #18930 from serathius/robustness-jitter
Add jitter to failpoint injection to cover periodically executed compaction
2024-11-20 20:47:03 +01:00
d961c81813 Add jitter to failpoint injection to cover periodically executed compaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-11-20 19:58:01 +01:00
3d33c09c46 Multiply return time by 100 in tests to detect off-by-one differences
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-11-17 21:55:30 +01:00
668834b7df Allow duplicated put requests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-11-17 21:51:45 +01:00