Commit Graph

47 Commits

Author SHA1 Message Date
2de17bd396 deflake: TestDowngradeCancellationAfterDowngrading1InClusterOf3
Fixes: 65159a2b96 (*: Update cases related to Downgrade)

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-02-22 08:22:17 -05:00
65159a2b96 *: Update cases related to Downgrade
1. Update DowngradeUpgradeMembersByID

During a downgrade, the desired cluster version should be the
target version.
During an upgrade, the desired cluster version should be
determined by the minimum binary version among the members.

2. Remove AssertProcessLogs from DowngradeEnable

The log message "The server is ready to downgrade" appears only when the storage
version monitor detects a mismatch between the cluster and storage versions.

If traffic is insufficient to trigger a commit or if an auto-commit occurs right
after reading the storage version, the monitor may fail to update it, leading
to errors like:

```text
"msg":"failed to update storage version","cluster-version":"3.6.0",
"error":"cannot detect storage schema version: missing confstate information"
```

Given this, we should remove the AssertProcessLogs statement.

Similar to #19313

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-02-21 13:45:41 -05:00
34546e9aa0 Add more debug info into waitTillSnapshot
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2025-02-07 12:14:49 +00:00
3b3243bbb9 increase timeout for MemberDowngradeUpgrade test
Signed-off-by: Gang Li <ganglica@google.com>
2025-02-05 23:58:46 +00:00
6b0a9cd763 Remove some HealthInterval to reduce the time to run DowngradeUpgradeMembers.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2025-02-02 14:54:29 -08:00
8c989a1e37 Fix passing compaction-batch-limit to etcd v3.4 and v3.5
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2025-01-20 17:14:21 +01:00
9da01a8275 Merge pull request #19196 from gangli113/main
migrate flag experimental-compaction-batch-limit to use compaction-batch-limit
2025-01-17 07:43:22 +00:00
8f51613574 add MemberDowngradeUpgrade failpoint
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2025-01-15 01:06:39 +00:00
33d65fc90b TestConfigFileDeprecatedOptions
Signed-off-by: Gang Li <gangligit@gmail.com>
2025-01-14 16:26:50 -08:00
08e4d6d9c2 robustness: only run MemberDowngrade test for high SnapshotCatchUpEntries.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2025-01-09 16:12:02 -08:00
7c7d3ce8ae Migrate all OldGet calls to Get
Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
2025-01-08 11:48:26 +01:00
3247bc42bb Make traffic robustness test use kubernetes.Interface
Signed-off-by: Aleksander Mistewicz <amistewicz@google.com>
2025-01-08 11:48:26 +01:00
747ef5f5fe Add MemberDowngrade failpoint
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-12-23 13:08:13 -08:00
f2472d4b80 Handle non-linearized MemberList in v3.4 for robustness tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-12-02 14:32:38 +01:00
75cd1369a5 fix: enable gofumpt instead of gofmt linter in tests
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-11-28 21:08:42 +01:00
fdfb5c0851 tests: add robustness test for issue 17780
Closes #17780

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-11-12 19:08:02 +00:00
e5a63483ff fix: enable errorlint linter
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-10-31 18:54:59 +01:00
4017ebaed6 fix: use require instead of t.Fatal(err) in tests/robustness package
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-10-26 08:15:37 +02:00
105782f95b Remove brackets from failpoint name
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-10-23 20:54:38 +02:00
2e04ee77b6 Avoid sending Compact request when LazyFS is enabled
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-18 08:36:24 +02:00
5e42ed9b22 Reproduce issue #17529
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-15 19:40:23 +02:00
2c56e8edc1 Merge pull request #18107 from serathius/e2e-error-log
Improve e2e error reporting
2024-06-07 13:58:59 +02:00
5959110f4a Implement Compaction support in robustness test
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-07 10:33:57 +02:00
3c5684967f Improve e2e error reporting
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
Co-authored-by: James Blair <mail@jamesblair.net>
Co-authored-by: chao <chaochn@amazon.com>
2024-06-07 10:24:52 +02:00
b8eeaacbcb Ignore connection reset error when triggering a failpoint
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-05 17:58:46 +02:00
1d367fbae6 Merge pull request #17923 from siyuanfoundation/robust
Add randomness in robustness cluster process version to test mixed version scenarios.
2024-05-22 14:07:22 +02:00
0f94c2ca4f robustness: add mix version scenario with fixed leader.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-05-21 17:42:12 +00:00
b54d7552a7 robustness: add mix version option in exploratoryScenarios.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-05-21 16:57:53 +00:00
3fb36d9ae2 Allow gofail trigger to fail as long as the member stops running
This is required for compaction-based failpoints: traffic may send a
compaction request that crashes etcd before the failpoint executes
the trigger.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-21 18:46:35 +02:00
d8bb19327b Prevent picking a failpoint that waits until snapshot when lower snapshot catch-up entries are not supported, but still allow reproducing issue #15271
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-13 12:08:42 +02:00
b8ffc5e8c0 Merge pull request #17967 from serathius/robustness-update-readme
Update the robustness README and fix the #14370 reproduction case
2024-05-09 10:05:27 +02:00
be9758e2bc Update the robustness README and fix the #14370 reproduction case
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-08 11:31:28 +02:00
c4e3b61a1c Record operation from failpoint injection
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-01 19:20:22 +02:00
f285330d46 Don't require minimal for failpoint injection period
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-20 10:34:51 +02:00
3a23994fbf Make no failpoint error more readable
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-07 15:13:59 +02:00
0976398964 tests/robustness: address golangci var-naming issues
Signed-off-by: Ivan Valdes <ivan@vald.es>
2024-03-25 16:27:05 -07:00
3471ef133d Add an e2e test and robustness failpoint around recovering from snapshot backend
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-04 15:25:24 +01:00
5175652a8e Abort if failpoint injection failed
If one of the nodes is unhealthy, the test would never finish because watchers
would never reach the max revision.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-03 17:26:51 +01:00
55516234d3 exclude sleep failpoint from 1 node scenario
Signed-off-by: ZhouJianMS <zhoujian@microsoft.com>
2023-11-13 16:19:44 +08:00
d208985aec error handling for gofailpoint
Signed-off-by: ZhouJianMS <zhoujian@microsoft.com>
2023-11-03 19:25:17 +08:00
827dc18682 Add IO stall failpoint in raft loop
Signed-off-by: ZhouJianMS <zhoujian@microsoft.com>
2023-11-03 16:42:33 +08:00
45fb4565e3 Merge pull request #16786 from serathius/robustness-drop-packet
Implement random packet dropping
2023-10-19 08:44:23 +02:00
aa28a69ce0 Implement random packet dropping
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-18 10:14:43 +02:00
aea1cd0077 feat: enable unparam lint
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-10-17 21:24:13 +08:00
7e8bb15ccb Add member replace failpoint to robustness tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-17 11:17:49 +02:00
0d83a72cf5 Split failpoints file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-17 09:51:43 +02:00
d6e376b6c6 Move failpoints to separate package
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-16 20:57:31 +02:00