Despite having access to vast resources and commanding the attention of not only the corporate community but also governments, Facebook was down for about 6 hours, along with the billions of users on its three primary vanity platforms. Aside from the occasional comical photo opportunity for a worried government official, few organizations were in a position to observe Facebook's failure at a scale that warranted concern.
For those who are spared the engineering details but still bear managerial responsibilities, the event warrants a pause for reflection.
Cloudflare was one of these entities. It operates the public DNS resolver at the IP address 1.1.1.1, a service that converts domain names (Internet addresses generally written with alphanumeric characters) into IP addresses, and its engineers have eloquently explained what they observed that day.
To illustrate what Cloudflare observed, let us use a real-world metaphor. If there were an Internet of Public Transportation, each city would maintain a record of its own transit routes, as well as all of the places, or addresses, that these routes can reach.
Each city would retain a list of addresses (not people's identities) that it knew how to reach and would advertise it to neighbouring cities. Let us now envision a few white pages around the globe that would keep track of people's names and their addresses.
Consider a society in which people are in the habit of visiting other people's houses hundreds of times every day. Each individual would give the name of the person they want to visit to a white pages service, which would either know where that person lived or know of another service that did. This must happen fast, as moving from one address to another generally takes only a few tens of milliseconds.
Facebook and its companies are not a single person's home, but rather a large country with more than 2 billion people.
Why would any business that relies heavily on digital technologies disregard the importance of safeguarding their infrastructure?
Facebook abruptly stopped providing lists of addresses it knows about itself and about its neighbours. The names of those who live on Facebook eventually faded out of the white pages.
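For readers who prefer code to metaphor, here is a minimal sketch of the same story in Python. It is a toy model only: the class names (RoutingTable, WhitePages), the address label "facebook-land" and the name "alice.example" are all made up for illustration, and real routing and name lookup are far more involved. It shows a white pages service that keeps answering from memory for a short while after the routes disappear, and then loses the name.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingTable:
    """Toy stand-in for the list of places the world knows how to reach."""
    routes: set = field(default_factory=set)

    def advertise(self, prefix: str) -> None:
        self.routes.add(prefix)

    def withdraw(self, prefix: str) -> None:
        self.routes.discard(prefix)

    def reachable(self, prefix: str) -> bool:
        return prefix in self.routes


@dataclass
class WhitePages:
    """Toy white pages: caches answers, asks the owner's registry on a miss."""
    routing: RoutingTable
    registries: dict                 # name -> (registry location, address)
    ttl: int = 2                     # ticks a cached answer stays valid
    cache: dict = field(default_factory=dict)  # name -> (address, expiry tick)

    def lookup(self, name: str, now: int):
        cached = self.cache.get(name)
        if cached and cached[1] > now:
            return cached[0]         # still fresh, answer from memory
        location, address = self.registries[name]
        if not self.routing.reachable(location):
            self.cache.pop(name, None)
            return None              # registry unreachable, name has faded out
        self.cache[name] = (address, now + self.ttl)
        return address


def demo():
    """Advertise, look up, withdraw, then watch the cached answer expire."""
    routing = RoutingTable()
    routing.advertise("facebook-land")
    pages = WhitePages(routing, {"alice.example": ("facebook-land", "12 Main St")})
    before = pages.lookup("alice.example", now=0)
    routing.withdraw("facebook-land")
    shortly_after = pages.lookup("alice.example", now=1)
    later = pages.lookup("alice.example", now=2)
    return before, shortly_after, later
```

Calling demo() returns the address twice and then nothing: the second lookup is still served from the cache, and the third fails because the cached entry has expired and the registry can no longer be reached.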
Visitors still knew the names of the persons they wished to see and continued to inquire about them. In fact, the visitors' anxiety increased as a result of their inability to locate these folks, prompting them to inquire about them more frequently.
This congested the world's white pages and slowed response times even for people who were not trying to reach Facebook. Smaller platforms like Twitter also felt the effects.
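The arithmetic behind that congestion can be sketched in a few lines of Python. The function name resolver_load and every number used with it are illustrative assumptions, not measurements from the outage. The sketch assumes each failed lookup is simply retried a fixed number of times and that the white pages service can answer only a fixed number of queries per interval.

```python
def resolver_load(fb_clients, other_clients, retries, capacity, fb_reachable):
    """Return (total queries, fraction of queries the resolver can serve).

    All parameters are hypothetical inputs for illustration:
    fb_clients    -- visitors asking for Facebook names in one interval
    other_clients -- visitors asking for any other name
    retries       -- extra attempts each visitor makes after a failed lookup
    capacity      -- queries the resolver can answer in one interval
    fb_reachable  -- whether Facebook lookups currently succeed
    """
    attempts = 1 if fb_reachable else 1 + retries
    total = fb_clients * attempts + other_clients
    served_fraction = min(1.0, capacity / total)
    return total, served_fraction


def demo():
    """Compare a normal interval with one where Facebook lookups fail."""
    normal = resolver_load(1000, 1000, retries=4, capacity=3000, fb_reachable=True)
    outage = resolver_load(1000, 1000, retries=4, capacity=3000, fb_reachable=False)
    return normal, outage
```

With these made-up numbers, the normal interval produces 2000 queries and all of them are served. During the outage the same visitors produce 6000 queries, so only half can be answered, and the visitors who never wanted Facebook in the first place share in the delay.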
I make no claim to understand what occurred at Facebook; all I know is what a white pages provider has reported.
As a managed service provider, I ask only this: if the mighty, and rather evil, Facebook needed 5.5 hours to recover from such a nightmare despite access to top talent, technology, and resources that rival entire countries' treasuries, why would any technology-driven business not allocate the necessary resources to safeguard its infrastructure?
Setting up and running infrastructure is not a trivial task, and a disruption costs you more than the revenue lost during the interruption itself.
While Facebook's WhatsApp was down, traffic to alternatives like Signal and Telegram shot up. Risk your infrastructure, and you risk sending your clients elsewhere to satisfy their needs.
Making sure that your infrastructure is not an easy target for hackers means your prized data assets are less likely to end up for sale on the dark web, or to leave you locked out of them pending a transfer of bitcoin to some nefarious underground ransomware outfit.
Yet many resort to flying blind and, often, straight into disaster.
In Arabic, we have a saying: leave the bread baking to the baker, even if the baker eats half of it. The simple fact is that infrastructure specialists can advise on multiple cloud service providers, the tools to use and the right types of building blocks.
Drawing on hard-earned experience, they can not only save you a lot on your monthly bill but also build it right the first time and increase the frequency of your updates, giving you the agility to test different mixes of products, messages and deliveries and freeing your resources to focus on better serving your customers.
Facebook's nightmare need not happen to your business. They had the resources to survive this; your business may not. You need reliable, ongoing DevOps support to safeguard your digital growth going forward.
Contact us for a free consultation on how you can boost your digital business.