13 Nov 2013

Microsoft hit by second Office 365 email outage in five days

Posted by iwgcr

On November 13th, some Microsoft Office 365 customers in North America reported via Twitter and email that they were experiencing email problems, just as they had five days earlier.

A Microsoft spokesperson provided the following update around 2 pm ET on November 13th:

“On Tuesday, Nov. 13, some customers served from our North America data centers are experiencing intermittent access to e-mail services. Customers are being updated regularly via our normal communication channels. We sincerely apologize to our customers for any inconvenience.”

The problem was resolved the same day, and an update was posted on the Office 365 blog:

“From 9:08AM to 2:10PM PST today, November 13th, some customers in North and South America were unable to access email services. The service incident resulted from a combination of issues related to maintenance, network element failures, and increased load on the service. This morning, the Office 365 team was performing planned non-impacting network maintenance by shifting some load out of the datacenters under maintenance. In combination with this standard process, we experienced a ‘gray’ failure of some active network elements; the elements failed, but did not alert us to their failure. Additionally, we have an increasing load of customers on-boarding to the service. These three issues in combination caused customer access to email services to be degraded for an extended period of time. By 10:42am PST, remediation work was underway to balance users to healthy sites, broaden the service access points and remediate the failed network devices. At 2:10PM PST all services were fully restored. Significant capacity increase has already been well underway, but we are also adding automated handling on these gray failures to speed recovery time. Across the organization, we are executing a full review of our processes to proactively identify further actions needed to avoid these situations.”




Date: 2013-11-13
Service: Microsoft Office 365
Duration: 5 hours 2 minutes
Critical Data Lost: no