Fastly says single customer triggered bug behind mass internet outage

Flaw was introduced in May and lay dormant until a customer updated their settings, firm says

The blackout left some of the world’s biggest websites offline for a period on Tuesday. Photograph: Pavel Kapish/Alamy

Alex Hern, Technology editor (@alexhern)
Wed 9 Jun 2021 15.02 BST

An internet blackout that knocked out some of the world’s biggest websites on Tuesday was ultimately caused by a single customer updating their settings, the infrastructure provider Fastly has revealed.

A bug in Fastly’s code introduced in mid-May had lain dormant until Tuesday morning, according to Nick Rockwell, the company’s head of engineering and infrastructure. When the unnamed customer updated their settings, it triggered the flaw, which ultimately took down 85% of the company’s network.

“On May 12, we began a software deployment that introduced a bug that could be triggered by a specific customer configuration under specific circumstances,” Rockwell said. “Early June 8, a customer pushed a valid configuration change that included the specific circumstances that triggered the bug, which caused 85% of our network to return errors.

“We detected the disruption within one minute, then identified and isolated the cause, and disabled the configuration. Within 49 minutes, 95% of our network was operating as normal.”

Rockwell added: “Even though there were specific conditions that triggered this outage, we should have anticipated it. We provide mission-critical services, and we treat any action that can cause service issues with the utmost sensitivity and priority. We apologize to our customers and those who rely on them for the outage and sincerely thank the community for its support.”

The content delivery network (CDN) operated by Fastly is one of the largest on the internet, along with similar networks operated by Akamai, Cloudflare and Amazon’s CloudFront. All operate on the same principle: that the internet is faster and more stable if users can connect to servers physically close to them, optimised for handling lots of traffic.

In typical times, doing so not only cuts loading times but also allows the CDN operators, with expertise in running internet infrastructure, to take on the burden of handling security threats, unexpected traffic spikes, and high bandwidth bills. But the outage highlighted the risks associated with a concentration of critical internet infrastructure in the hands of just a few companies.

Counterintuitively, the outage and recovery led to a rise in Fastly’s stock price, which was up 12% over the course of Tuesday. The increase may have been because the company had demonstrated an effective incident response plan, or simply because the outage had made investors more aware of the scale of Fastly’s business and the size of its customer base.