Xbox Live issues: lessons to be learned?

We don’t cover Xbox Live enough here on LiveSide, but we’ve been following the news of problems accessing Xbox Live over the holidays.  In a nutshell, Xbox Live apparently couldn’t keep up with unprecedented traffic, and connections have been spotty.  From an email sent by Marc Whitten, General Manager of Xbox Live:

During this past holiday season you helped us break a number of Xbox LIVE records.  This included our largest sign-up of new members to Xbox LIVE in our 5 year history and just yesterday you broke the record for the single biggest day of concurrent members ever on the service.

As a result of this massive increase in usage we know that some of you experienced intermittent Xbox LIVE issues over the holiday break.  While the service was not completely offline at any given time, we are disappointed in our performance.  I would like to take this moment to thank each and every one of you for your patience and understanding as our team has worked around the clock to return the service to a stable state.

While we won’t comment on what some consider performance bad enough to sue over, we think there are some lessons to be learned from the episode:

  • Service outages are probably inevitable
    • The infrastructure for providing services is not yet in place, although it is being built rapidly
    • Sudden popularity, while it is the holy grail of internet services, is also potentially a killer
    • Even a small disruption in a large and popular service will not only be noticed but emphasized, as blogs and news services look for stories
  • Open lines of communication are essential to mitigating problems
    • Larry Hryb (Major Nelson), called by Reuters the "face of Xbox", acknowledged the problems early and often, both on his blog and on Twitter
    • Marc Whitten’s announcement both officially acknowledged and clarified the situation
    • Both Whitten’s and Hryb’s posts gave users a chance to vent, if nothing else
  • Just fixing the problem might not be enough
    • Reputation for excellence is essential in providing popular live services; without it, actual performance might not matter
    • Acknowledging that "we are disappointed in our performance" and offering compensation are efforts to maintain a good reputation

Of course, the best solution to problems such as these is to make sure they don’t happen in the first place.  However, disruptions in service are bound to happen, even with the best of safeguards in place.  Service providers, both outside the company and within Microsoft, should take a hard look at the work Larry Hryb has been doing to maintain open lines of communication.  From the Reuters article:

Hryb is their Walter Cronkite — someone who gamers can turn to for the straight story on all things Xbox.

"His blog gets hit up pretty substantially. He’s kind of delivering the information that gamers are usually left in the dark about, so users really enjoy that," said Erik Brudvig, Xbox editor of gaming Web site

A long-time gamer and former programmer with radio broadcaster Clear Channel Communications, Hryb’s media output combines elements of TV news reports, video game fan sites, corporate press releases and customer support.

He estimates he gets 500 e-mails every day, and last year he posted 1,550 times on Web messaging service Twitter to distribute news and information as fast as it comes in. But that does not mean he is short on facts or data. His year-end podcast, for instance, ran three hours.

"The news cycle is not monthly, it’s not weekly, it’s daily and frankly it’s hourly sometimes. Blogging came from setting straight some misinformation that was out there," Hryb said.

Larry Hryb has taken it upon himself to become "someone who gamers can turn to".  As Microsoft expands its software plus services platform, will there be a Major Nelson to turn to when (not if) there are problems with SkyDrive, or Office Live Workspaces, or other services as yet on the horizon?