Facebook’s daylong malfunction is a reminder of the Internet’s fragility
  • Thursday, May 23, 2019

  • NEW YORK TIMES / 2018

    Visitors stop to take photos of the sign at the Facebook campus in Menlo Park, Calif.

SAN FRANCISCO >> Facebook said today that it had repaired a technical error that led to long lapses in service at its various properties, including Instagram, WhatsApp and Messenger.

The interruption lasted nearly 24 hours on some of the services and was the longest in Facebook’s recent history. It was an eye-opening reminder that even the most powerful internet companies, employing the best computer scientists and cutting-edge technology, can still be crippled by human error.

“All of the big web companies have multiple lines of defense, but sometimes a coding mistake made by one engineer can make its way onto many thousands of computers and cause major errors,” said Alex Stamos, a former chief security officer at Facebook and a lecturer at Stanford University. “In other words, rebooting something as complex as Facebook is very, very hard.”

A “server configuration change” made Wednesday had a cascading effect through the company’s network, a Facebook spokesman said. That created a repeating loop of problems that kept growing and could not be immediately fixed, according to one current and one former Facebook employee, who spoke on the condition of anonymity because they were not allowed to talk to reporters.

That small mistake had big consequences. Instagram users couldn’t view other profiles, WhatsApp users couldn’t send messages, and news feeds across Facebook’s main app went blank.

DownDetector, which likens itself to a weather report for the internet, said it had received 7.5 million problem reports about Facebook’s apps. In comparison, widespread problems on YouTube in October prompted just 2.7 million reports. DownDetector measures service interruptions in part by counting reports from users who are experiencing problems.

“Never before have we seen such a large-scale outage,” said Tom Sanders, a co-founder of DownDetector.

Early today, Facebook was able to pull most of its systems back online. The company is still trying to figure out how that error reverberated throughout its network. Facebook officials emphasized that the problem had not been caused by hacking or a cyberassault like a so-called denial-of-service attack, which hits servers with a wave of traffic that causes them to stop working.

Facebook, like other internet giants, prides itself on never going offline. That predictability has helped it become one of the most influential — and criticized — companies in the world. An estimated 2 billion-plus people use one or several of its services daily.
