The BlackBerry outage – How Critical Was the Failure?

Was it just a few years ago that BlackBerry got the kind of endorsement that marketers dream of? Then-presidential candidate Barack Obama revealed that even he had fallen for the allure of the “Crackberry.” Estimates are that the company made another $25-$50 million on that word alone. There had to have been some legendary parties when that news came out.

Since then the BlackBerry’s fortunes haven’t been much better than Obama’s. It was already struggling to maintain relevance before this week. Then the Great BlackBerry Outage hit. Word started coming in from Europe, the Middle East and Africa that users’ messages and emails weren’t going out. Research In Motion (RIM), maker of the BlackBerry, acknowledged the problem and pinpointed its source, but was not able to stop users in Canada and the US from getting hit next as well.

What got hit hardest, though, was the reputation of the BlackBerry itself. Now that the tablet and smartphone world is bursting with alternatives, the question arises: is it time to move on from the BlackBerry, or is this just a series of nasty bumps that loyal users will be rewarded for seeing through?

Putting the outage into context

BlackBerry was not having a great year to begin with. What was once their market to lose is now the market they’re starting to lose. While 2010 sales were strong, the iPad series alone seized a sizable chunk of the BlackBerry’s user share. These devices brought with them a parade of competitors that further squeezed the former king of portable devices into a defensive corner.

To try to keep up with the iJoneses, BlackBerry is attempting to reposition itself with its own PlayBook tablet. Supplanting the BlackBerry OS in the new device will be QNX, a Unix-like operating system that BlackBerry says will take over its handheld line as well, as of the BlackBerry 8. RIM is supporting all of these changes with a developer conference next week. A feeling of “betting the farm” is setting in with all of this, and with some of the accompanying headlines, there is a growing concern that this may not be a winning bet.

The technical side

Let’s get away from economic commentary and examine the actual problem. Not much information has been released yet. What we do know is that a core switch failed within RIM’s infrastructure. That happens all the time; what usually doesn’t happen is the redundant backup systems failing to kick in as expected. RIM routes all of its customer traffic through BlackBerry servers. This centralization provides a high level of security, one of the BlackBerry line’s main selling points. You don’t exactly want the president’s emails peeked at. It also creates a single point of failure. Even with backups built into that single point, it proved to be too much of a technical bottleneck.
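For readers who want to see the idea of failover in miniature, here is a rough Python sketch. It is not RIM’s design, which has not been made public; the `Switch` class, the `route` function and the switch names are all hypothetical. It only shows what “the backup kicks in” means, and what happens when it doesn’t.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Switch:
    """Hypothetical stand-in for a core switch; not RIM's actual hardware."""
    name: str
    healthy: bool = True


def route(switches: List[Switch]) -> str:
    """Return the name of the first healthy switch, in priority order.

    Raises RuntimeError when every switch is down, which is the
    single-point-of-failure case: traffic has nowhere else to go.
    """
    for switch in switches:
        if switch.healthy:
            return switch.name
    raise RuntimeError("no healthy switch available")


primary = Switch("primary")
backup = Switch("backup")
path = [primary, backup]

# Normal day: traffic goes through the primary.
normal_choice = route(path)

# The primary fails and the backup takes over as designed.
primary.healthy = False
failover_choice = route(path)

# The backup also fails to take over: nothing is left to carry traffic.
backup.healthy = False
try:
    route(path)
    outage = False
except RuntimeError:
    outage = True
```

The point of the sketch is that redundancy only helps if the second entry in the list really is healthy at the moment it is needed. With all traffic funneled through one path, there is no third option.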

Once the breakdown occurred, a spectacular backlog of data started to pile up. It’s believed that this is why the problem eventually spread to the Americas, though no solid information has confirmed this. As of this writing the problem appears to be subsiding, but this is also not confirmed. Research In Motion co-CEO Mike Lazaridis posted a 2-minute video apologizing for the outage. He explained in it that he wished he had more information, but that he would continue to communicate as the problem got resolved. It was an honest, honorable response to the situation that offered no excuses, but that simultaneously confirmed that the problem was serious.
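To get a feel for why a backlog can outlast the fault that caused it, here is a back-of-the-envelope sketch in Python. Every number in it is made up for illustration; RIM has not published its traffic figures, and the function names are hypothetical. The assumption is simple: messages keep arriving at a steady rate during the outage, and afterwards the system can only work through the pile with whatever capacity is left over after handling new traffic.

```python
def backlog_after_outage(arrival_rate: int, outage_seconds: int) -> int:
    """Messages queued up while nothing is being delivered."""
    return arrival_rate * outage_seconds


def drain_seconds(backlog: int, arrival_rate: int, capacity: int) -> float:
    """Time to clear the backlog once service resumes.

    Only the spare capacity (capacity minus new arrivals) eats into
    the backlog; with no spare capacity it never clears.
    """
    spare = capacity - arrival_rate
    if spare <= 0:
        raise ValueError("no spare capacity: the backlog never drains")
    return backlog / spare


# Illustrative, invented figures: 1,000 messages per second arriving,
# a one-hour outage, and capacity for 1,250 messages per second.
ARRIVAL_RATE = 1_000
CAPACITY = 1_250
OUTAGE_SECONDS = 3_600

backlog = backlog_after_outage(ARRIVAL_RATE, OUTAGE_SECONDS)
recovery_hours = drain_seconds(backlog, ARRIVAL_RATE, CAPACITY) / 3_600
print(f"backlog: {backlog:,} messages, recovery: {recovery_hours:.1f} hours")
```

Under these invented numbers, a one-hour outage leaves 3.6 million queued messages and takes four hours to clear, because only 250 messages per second of spare capacity is available to work on the pile. That is the general shape of how a short failure can turn into a long disruption.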

A reminder about vulnerability

There is one quick lesson we can get from this incident.  Without making any judgments on how understandable an error this was, we can perhaps take this as a reminder that even today, with every attempt in the world to provide security and redundancy, outages can and will still happen.

Companies today like to present an air of invulnerability.   We’ve warned you in these columns, though, that nature remains more powerful than us.  Every system has its weaknesses.  This isn’t to completely exonerate RIM.  It may well be when the dust settles that some truly boneheaded errors will be revealed.  Clearly there were some mistakes, somewhere.  Regardless of this, the greater truth is that some errors will always occur.  Take this lesson from the outage: it is a fantasy to think that any company can ever engineer its way past all of life’s chaos.

Once we’ve accepted that, we can approach the question of what to do about it in this case. That is not as easy a question to answer.

Wait for the dice to settle?

In the world of fantasy games there is a notion called the “critical failure”.  It occurs when, during a battle, a combatant has a total defensive breakdown, and suffers a blow that is so damaging that it alone could wipe them out.

Has the BlackBerry suffered one in this outage? It could hardly have come at a worse time for them. Right now more than ever, they are attempting to prove that they still have a place in the mobile world, and it is hard to imagine worse evidence for that case. All that said, looking at the effort RIM is putting into maintaining its product line, we cannot say for certain that it is down for the count. It’s too soon to say what the overall effect of this incident will be.

What we can say, though, is that if you are not already a BlackBerry customer, it’s wise to exercise a bit of caution before jumping in. The entire BlackBerry line is in a significant state of flux right now. There is talk about management changes, technical changes, and market changes. It is impossible right now to make long-term predictions about the BlackBerry line with any real clarity.

If you still love them enough to stay with them, or are willing to take chances with an uncertain product line, no one would blame you. It is still a quality set of hardware. No one would blame you either, though, for stepping back and looking before leaping. The BlackBerry line right now could go either way.

This outage was a clear failure. We’ll probably need to watch throughout the rest of 2011 to know for certain whether or not this was a critical failure.