
Sui Mainnet First Outage Summary
TechFlow Selected

When the issue occurred, the Sui engineering team quickly diagnosed the problem and released a fix, which was then deployed by validators to minimize network downtime.

Incident Overview
Between approximately 1:15 AM and 3:45 AM Pacific Time on November 21, 2024 (5:15 PM to 7:45 PM UTC+8), the Sui mainnet experienced a complete network outage. All validator nodes entered a crash loop, resulting in a total halt of transaction processing.
Root Cause
An assert! in the throttling control code fired whenever a transaction's estimated execution cost was zero, crashing the validator node. The crash occurred only when all of the following conditions were met:
1. Throttling control was set to TotalGasBudgetWithCap mode:
   - This mode was briefly enabled in protocol version 63, then reverted, and later re-enabled in protocol version 68 with the cumulative scheduler.
2. The network received a transaction meeting both of the following criteria:
   - Contains mutable shared object inputs
   - Contains zero MoveCall instructions
Upon receiving such a transaction, all validator nodes crashed immediately.
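The trigger conditions above can be illustrated with a small Rust model. This is a sketch under stated assumptions, not the Sui source: the `Tx` struct, the function names, and especially the cost formula (gas budget capped by a per-MoveCall factor) are hypothetical and were chosen only to show how a transaction with zero MoveCall instructions could end up with a zero estimate and trip an assertion.

```rust
// Illustrative model only. The types and the cost formula are assumptions,
// not the actual Sui implementation.

/// Hypothetical, simplified view of a transaction.
struct Tx {
    gas_budget: u64,
    move_calls: u64,
    mutable_shared_inputs: u64,
}

/// Assumed cost model: the gas budget, capped by a per-MoveCall factor.
/// With zero MoveCall commands the cap, and therefore the estimate, is zero.
fn estimated_cost(tx: &Tx, cap_per_move_call: u64) -> u64 {
    tx.gas_budget
        .min(tx.move_calls.saturating_mul(cap_per_move_call))
}

/// True when a transaction matches both criteria listed above.
fn matches_crash_conditions(tx: &Tx) -> bool {
    tx.mutable_shared_inputs > 0 && tx.move_calls == 0
}

/// Pre-fix style bookkeeping: the assertion panics on a zero estimate.
fn bump_cost_with_assert(current: u64, cost: u64) -> u64 {
    assert!(cost > 0, "estimated execution cost must be non-zero");
    current + cost
}
```

In this model, a transaction with `move_calls: 0` and at least one mutable shared input yields an estimate of zero, and passing that estimate to `bump_cost_with_assert` panics. A validator that re-reads the same transaction after restarting would hit the same panic again, which is consistent with the crash loop described in the overview.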
What Is Throttling Control?
Sui's object-based architecture enables massive parallelization of user transactions—a capability not feasible on most other networks. However, when multiple transactions simultaneously write to the same shared object, those transactions must be executed sequentially, and there is an upper limit on the throughput for transactions involving that specific object.
The throttling control system prevents network overload by limiting the rate at which transactions can write to the same shared object, avoiding excessively long checkpoint execution times.
We recently upgraded the throttling control system to improve shared object utilization by more accurately estimating transaction complexity. However, a bug in the new TotalGasBudgetWithCap mode led to this incident.
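As a rough illustration of the idea, per-object throttling can be modeled as a budget of estimated cost that each shared object may absorb per checkpoint. The sketch below is hypothetical: `SharedObjectThrottle`, `try_schedule`, and the string object IDs are invented for this example and do not reflect Sui's actual scheduler.

```rust
use std::collections::HashMap;

/// Hypothetical per-checkpoint tracker of accumulated cost per shared object.
struct SharedObjectThrottle {
    per_object_limit: u64,
    accumulated: HashMap<String, u64>,
}

impl SharedObjectThrottle {
    fn new(per_object_limit: u64) -> Self {
        Self {
            per_object_limit,
            accumulated: HashMap::new(),
        }
    }

    /// Returns true and records the cost if every written shared object stays
    /// within its limit; returns false (defer the transaction) otherwise.
    fn try_schedule(&mut self, shared_objects: &[&str], cost: u64) -> bool {
        let fits = shared_objects.iter().all(|id| {
            let used = self.accumulated.get(*id).copied().unwrap_or(0);
            used.saturating_add(cost) <= self.per_object_limit
        });
        if !fits {
            return false;
        }
        for id in shared_objects {
            let used = self.accumulated.entry(id.to_string()).or_insert(0);
            *used = used.saturating_add(cost);
        }
        true
    }
}
```

With a limit of 10, two transactions of cost 6 writing to the same object cannot both be scheduled in the same checkpoint, while a transaction touching a different object is unaffected. The more accurate the cost estimate, the more of each object's budget can be used safely, which is the motivation the article gives for the upgraded mode.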
How Was It Fixed?
Once identified, the fix was straightforward (see PR #20365). The patch has been deployed to both mainnet (v1.37.4) and testnet (v1.38.1).
PR #20365: Modified bump_object_execution_cost to use saturating addition and allow zero-cost transactions.
Mainnet v1.37.4: https://github.com/MystenLabs/sui/releases
Thanks to the prompt response from the validator community, the Sui network resumed normal operations within just 15 minutes of the fix being released.
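The change described for PR #20365 (saturating addition, zero-cost transactions allowed) can be sketched as follows. Only the function name comes from the article. The signature, the `HashMap` cost table, and the string object IDs are assumptions made for illustration, and the real function in the Sui codebase looks different.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the per-object cost table; the real
/// bump_object_execution_cost in Sui has a different signature.
fn bump_object_execution_cost(
    costs: &mut HashMap<String, u64>,
    object_id: &str,
    cost: u64,
) -> u64 {
    // No assertion on `cost`: zero is accepted and leaves the total unchanged.
    let total = costs.entry(object_id.to_string()).or_insert(0);
    // Saturating addition clamps at u64::MAX instead of overflowing.
    *total = total.saturating_add(cost);
    *total
}
```

Saturating arithmetic trades exactness at the extreme for safety: a total that would overflow is pinned at the maximum value, so the object is treated as fully loaded rather than causing a panic or wrapping around to a small number.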
Key Takeaways
- Incident detection and response systems worked well: Automated alerts and community reports triggered nearly simultaneously, enabling us to quickly mobilize teams for diagnosis and resolution.
- Validator community performed exceptionally: The Sui network recovered almost instantly after the fix was deployed.
Preventive Measures
- Improve testing systems: Add more adversarial transaction types similar to the one that triggered this crash, to uncover potential vulnerabilities earlier.
- Optimize build processes: Accelerate the generation of debug and release binaries to further reduce incident response time. Part of the downtime during this incident was due to waiting for release builds.
Thank you to the community and validators for your support in ensuring Sui’s rapid recovery!














