10 Jan 2014
Posted by iwgcr. No Comments
The file storage and sharing site went offline on Friday, January 10, 2014, and continued to suffer problems even after coming back online over the weekend; one issue, for example, prevented Dropbox users from sharing folders. According to Dropbox, the remaining issues were corrected and the core service was restored as of Sunday at 4:40 PM PT.
“On Friday at 5:30 PM PT, we had a planned maintenance scheduled to upgrade the OS on some of our machines,” Dropbox’s head of infrastructure, Akhil Gupta, said in a blog post on Sunday. “During this process, the upgrade script checks to make sure there is no active data on the machine before installing the new OS. A subtle bug in the script caused the command to reinstall a small number of active machines. Unfortunately, some master-slave pairs were impacted, which resulted in the site going down.”
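The kind of pre-install check Gupta describes can be sketched roughly as follows. This is only an illustration, not Dropbox's actual script: the `Machine` record, its fields, and the function names are all hypothetical. The one design point it tries to show is failing closed, so that a machine whose state is unknown is skipped rather than reinstalled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Machine:
    # All fields are hypothetical; a real inventory would look different.
    hostname: str
    active_data_bytes: Optional[int]  # None means the state could not be read
    role: str                         # e.g. "master", "slave", or "idle"


def safe_to_reinstall(machine: Machine) -> bool:
    """Return True only when the machine is positively known to be idle."""
    if machine.active_data_bytes is None:
        return False  # fail closed: unknown state is treated as active
    return machine.active_data_bytes == 0 and machine.role == "idle"


def plan_reinstall(machines: List[Machine]) -> Tuple[List[str], List[str]]:
    """Split hostnames into those to reinstall and those to skip."""
    to_reinstall, skipped = [], []
    for m in machines:
        if safe_to_reinstall(m):
            to_reinstall.append(m.hostname)
        else:
            skipped.append(m.hostname)
    return to_reinstall, skipped


fleet = [
    Machine("db-1", 0, "idle"),
    Machine("db-2", 1024, "master"),
    Machine("db-3", None, "idle"),
]
to_reinstall, skipped = plan_reinstall(fleet)
print("reinstall:", to_reinstall)
print("skip:", skipped)
```

With the sample fleet above, only the machine that is confirmed idle and empty is selected; the active master and the machine with unreadable state are both left alone.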
Date | Service | Duration | Critical Data Lost
2014-01-10 | Dropbox | 47 hours | no
Resources:
http://techcrunch.com/2014/01/12/dropbox-outage-internal-maintenance/
http://news.cnet.com/8301-1023_3-57617116-93/dropbox-outage-sparked-by-buggy-server-upgrade/
6 Jan 2014
Posted by iwgcr. No Comments
Telstra has confirmed that the management console for its corporate cloud platform went offline for some of its customers for two days starting January 6 2014, in the second demonstration in less than a year that the company’s cloud computing environment may not yet be as stable as the company would like customers to believe.
The telco’s Cloud Services Management Console had been intermittently experiencing major outages since the morning of 6 January. The issue did not cause a loss of service for the virtual machines and other cloud infrastructure which Telstra runs for customers in its datacenters, but did block access to management utilities and details about those machines, as well as the ability to purchase new or redistribute existing services.
The issue was serious enough that Telstra’s senior executive team, including the company’s chief executive David Thodey, was being updated regularly on it. Telstra is believed to have called in external assistance from at least one of its vendor suppliers.
Telstra issued the following statement on the issue: “There were no connectivity issues to virtual servers or internet connections, but some Cloud customers may have had difficulty accessing the Cloud Services management console to view or modify services. The matter was identified on 6 Jan and rectified yesterday (8 Jan) evening. Overnight monitoring and subsequent checks have confirmed functionality has been fully restored. The incident has not affected our 99.9 per cent platform uptime commitment as customers could still access cloud environments throughout this time.”
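To put the 99.9 per cent figure in perspective, the arithmetic can be sketched as below. The function names and the 30-day measurement period are assumptions made for illustration; Telstra's statement does not say how its commitment is measured, and it treats the console outage as outside that commitment.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_pct: float, period_days: int = 30) -> float:
    """Downtime budget, in minutes, implied by an uptime percentage."""
    return (100.0 - uptime_pct) / 100.0 * period_days * MINUTES_PER_DAY


def availability_pct(downtime_hours: float, period_days: int = 30) -> float:
    """Availability over the period if downtime_hours counted as downtime."""
    return 100.0 * (1.0 - downtime_hours / (period_days * 24))


budget = allowed_downtime_minutes(99.9)   # assumed 30-day period
hypothetical = availability_pct(48)       # 48 hours, as in the summary table

print(f"Downtime allowed at 99.9% over 30 days: {budget:.1f} minutes")
print(f"Availability if a 48-hour outage counted: {hypothetical:.2f}%")
```

Under these assumptions a 99.9 per cent commitment allows roughly 43 minutes of downtime in a 30-day period, so whether a two-day console outage counts against the commitment depends entirely on what the commitment is defined to cover.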
Date | Service | Duration | Critical Data Lost
2014-01-06 | Telstra cloud services | 48 hours | no
Resources:
http://delimiter.com.au/2014/01/13/telstra-cloud-console-goes-offline-two-days/
6 Jan 2014
Posted by iwgcr. No Comments
Starting around 12:30 a.m. on Jan. 6, several AWS customers found their services disrupted, at first with laggy performance and glitches, before the services went offline.
By 2:45 a.m., more than two hours later, the services started coming back online, although it still took some time for Amazon to get all of the cloud services and applications back online.
But according to Amazon, and as confirmed by its Service Health Dashboard, the public cloud leader did not suffer an outage. Instead, users were unable to connect to various AWS-hosted cloud services because an East Coast Internet service provider experienced a cut fiber line on January 2.
Date | Service | Duration | Critical Data Lost
2014-01-06 | AWS | 2 hours 15 minutes | no
Resources:
http://talkincloud.com/public-cloud/010714/east-coast-blizzard-takes-down-amazon-customers
31 Dec 2013
Posted by iwgcr. No Comments
Endurance International Group brands Bluehost, HostMonster, HostGator, and JustHost have ended 2013 with a massive outage affecting customers worldwide.
The outage struck Bluehost, HostGator, HostMonster and JustHost around 1:37 p.m. EST and the impact was felt around the world. Thousands of customers have been without access to their emails or websites. Customers have taken to Twitter to express their outrage with Bluehost and the other hosting companies that are experiencing an outage.
Bluehost released a statement earlier in the day stating, “Our Provo UT data center is currently experiencing some outbound network traffic connectivity issues. Admins are on site and looking into the issue.”
Services were finally restored for the majority of customers at 4:50 p.m. ET, according to EIG. However, no official statement was issued on what caused the outage.
Date | Service | Duration | Critical Data Lost
2013-12-31 | HostGator, Bluehost, HostMonster, JustHost | 3 hours 13 minutes | no
Resources:
http://dailyglobe.com/34425/bluehost-hostgator-hit-new-years-eve-outage/
http://www.thewhir.com/web-hosting-news/bluehost-hostgator-among-eig-brands-hit-massive-new-years-eve-outage
http://americanlivewire.com/2013-12-31-2013-12-31-bluehost-and-hostgator-crash/
26 Dec 2013
Posted by iwgcr. No Comments
The website of major Australian retailer Myer was brought down for over a week during the post-Christmas sales due to a “communication breakdown”.
“An IBM team of local and global experts worked around the clock with Myer to resolve the issue with its online store. The technical issue was caused by a communication breakdown between internet servers and a software application,” according to an IBM spokesperson.
“IBM and Myer will work together to conduct a thorough review to ensure this issue does not reoccur. IBM is committed to supporting Myer to continue to provide high-quality service to its customers.”
After Myer advertised “Australia’s biggest stocktake sale” in the lead-up to Boxing Day, visitors to its website on December 26 discovered that the site had been taken down, with the company saying on its Facebook page that it was “experiencing some technical difficulties” on its site.
It wasn’t until January 2, a full week later, that the website was back up and running. The retail giant said it would be monitoring the site and offering free shipping as compensation for failing to fulfil its advertised availability during the Christmas period.
However, many customers have continued to be unable to shop online, with Myer stating as late as Sunday that people were “still experiencing difficulties with the site”.
Myer CEO Bernie Brookes placed the monetary loss on the sales at “less than 1 percent of our business, which translates to AU$31 million”.
Date | Service | Duration | Critical Data Lost
2013-12-26 | IBM | 1 week | no
Resources:
http://www.zdnet.com/myer-website-crash-due-to-communication-breakdown-ibm-7000024799/
12 Dec 2013
Posted by iwgcr. No Comments
9 Dec 2013
Posted by iwgcr. No Comments
On the 9th of December 2013, many Yahoo Mail users started complaining about issues when trying to access the service. Soon after that, the company confirmed that one of its Mail servers was acting up, and that the issue had proved to be much harder to fix than anticipated. Yahoo assured users that it had teams working on it round the clock to bring everything back up. The Mail issues seem to have been taken care of, and with all that done and dusted, Yahoo CEO Marissa Mayer took to the company’s official blog to apologize for the prolonged outage.
She also detailed exactly what had happened. Mayer writes that there was a specific hardware outage in one of Yahoo’s storage systems that serves 1 percent of its users, but the “problem was a particularly rare one.” Since different users were affected in different ways, the resolution for affected accounts was nuanced. However, during the week, Yahoo’s team worked “around the clock” to restore access as well as all messages to Mail inboxes. IMAP access was also improved for users who use clients such as Apple Mail or Outlook to access Yahoo Mail. Since then, Mayer says, access has been restored to “almost everyone” and the backlog of messages has been delivered.
Date | Service | Duration | Critical Data Lost
2013-12-09 | Yahoo mail service | 3 days | no
Resources:
http://www.ubergizmo.com/2013/12/yahoo-ceo-marissa-mayer-apologizes-for-mail-outage/
http://news.cnet.com/8301-10811_3-57615298/hardware-problem-causes-partial-outage-for-yahoo-mail/
25 Nov 2013
Posted by iwgcr. No Comments
Between 1:51 PM and 2:37 PM PST, Amazon experienced increased error rates and latencies for the EC2 APIs in the US-EAST-1 Region.
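The duration recorded for this incident follows directly from the two timestamps. A minimal sketch of that calculation, assuming both times fall on the same day and in the same time zone (the function name `outage_minutes` is hypothetical):

```python
from datetime import datetime


def outage_minutes(start: str, end: str, fmt: str = "%I:%M %p") -> int:
    """Minutes between two same-day clock times such as '1:51 PM'."""
    started = datetime.strptime(start, fmt)
    ended = datetime.strptime(end, fmt)
    if ended < started:
        raise ValueError("end time is earlier than start time")
    return int((ended - started).total_seconds() // 60)


print(outage_minutes("1:51 PM", "2:37 PM"))  # prints 46
```

An outage that crosses midnight would need full dates rather than clock times, which this sketch deliberately rejects.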
Date | Service | Duration | Critical Data Lost
2013-11-25 | Amazon Cloud | 46 minutes | no
Reference:
http://status.aws.amazon.com/
21 Nov 2013
Posted by iwgcr. No Comments
Between 8:10 PM and 10:25 PM PST some instances experienced connectivity issues in a single Availability Zone in the US-WEST-2 Region. The issue has been resolved and the service is operating normally.
Date | Service | Duration | Critical Data Lost
2013-11-21 | Amazon Elastic Compute Cloud (Oregon) | 2 hours 15 minutes | no
References:
http://status.aws.amazon.com/
21 Nov 2013
Posted by iwgcr. No Comments
A handful of Microsoft’s online services, including Xbox.com, appeared to be experiencing a global outage on the afternoon of Thursday, November 21.
The outage, which Twitter users began reporting around 3 p.m., also affected Outlook.com and Office365.com, other online services hosted by Microsoft’s Azure cloud computing service. Error messages generated during attempts to visit the sites indicated it might be a DNS issue:
“Service Unavailable – DNS failure
The server is temporarily unable to service your request. Please try again later.
Reference #11.83b302cc.1385076207.1118e9”
The degree to which sites were affected appears to be varied. At one point Thursday afternoon, every region around the world was experiencing “a full service interruption” to its storage services, according to Microsoft’s Windows Azure Services Dashboard. But a later update on the dashboard indicated that the services were back to normal operation in each region.
The outage to Xbox.com comes just hours before the midnight launch of the long-awaited Xbox One, the $499 successor to the Xbox 360 gaming console.
The issue was resolved at 4 p.m.
Date | Service | Duration | Critical Data Lost
2013-11-21 | Windows Azure | 1 hour | no
Resource:
http://news.cnet.com/8301-10805_3-57613414-75/xbox.com-other-microsoft-sites-hit-by-service-disruption/