
Cloud Outages

What is cloud computing? An excerpt from Wikipedia: "Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet)."

It was marketed as providing 24x7 availability and 99.9999% uptime. That will never be true, and here's why:
  • Cloud hardware is managed by system engineers.
  • Software maintenance, updates, and security patches are also managed by system engineers. Depending on where you work, the system engineer may be a programmer, a software architect, or hold another information technology title.
  • Documentation and procedures. Have you ever read an IT procedure? They are detailed and very long, and it takes time to read through one.
  • All of it is managed by human beings.
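
To put the marketed figure in perspective, here is a minimal Python sketch of how much downtime a given uptime percentage allows per year. The function name `allowed_downtime_seconds` and the 365-day year are my own assumptions for illustration, not anything taken from a provider's SLA.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def allowed_downtime_seconds(uptime_percent: float,
                             period_seconds: int = SECONDS_PER_YEAR) -> float:
    """Return the downtime (in seconds) that an uptime percentage leaves room for."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_seconds * (1 - uptime_percent / 100)


# Compare the usual "nines" side by side.
for pct in (99.9, 99.99, 99.999, 99.9999):
    secs = allowed_downtime_seconds(pct)
    print(f"{pct}% uptime allows about {secs:.1f} seconds of downtime per year")
```

By this arithmetic, 99.9999% leaves roughly 31.5 seconds of downtime per year, and 99.999% leaves a little over 5 minutes.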
Here are some recent outages at Microsoft Office365, Amazon Web Services, and GoDaddy. I have only read a statement from Scott Wagner, CEO of GoDaddy; I have seen nothing from Amazon Web Services or Microsoft Office365 (outage of Nov. 13, 2012).

From the GoDaddy website, a statement about the GoDaddy outage of September 10, 2012:
"Go Daddy Site Outage Investigation Completed

Yesterday, GoDaddy.com and many of our customers experienced intermittent service outages starting shortly after 10 a.m. PDT. Service was fully restored by 4 p.m. PDT.

The service outage was not caused by external influences. It was not a "hack" and it was not a denial of service attack (DDoS). We have determined the service outage was due to a series of internal network events that corrupted router data tables. Once the issues were identified, we took corrective actions to restore services for our customers and GoDaddy.com. We have implemented measures to prevent this from occurring again. 

At no time was any customer data at risk or were any of our systems compromised.

Throughout our history, we have provided 99.999% uptime in our DNS infrastructure. This is the level our customers expect from us and the level we expect of ourselves. We have let our customers down and we know it.

We take our business and our customers' businesses very seriously. We apologize to our customers for these events and thank them for their patience."

 - Scott Wagner, Go Daddy CEO

TechCrunch also wrote an article about the GoDaddy outage.

Amazon Web Services outage of October 22, 2012. One of our sister projects, Friended.com, uses Amazon Web Services (AWS) Elastic Compute Cloud (EC2). Pingdom online monitoring recorded 4 hours and 19 minutes of downtime. The monitoring tool was not pretty to look at that day.
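
As a rough sketch, the Pingdom figure above can be turned into an availability percentage. The helper name `availability_percent` and the measurement windows (the 31 days of October and a 365-day year) are assumptions for illustration, and the calculation assumes this was the only downtime in each window.

```python
def availability_percent(downtime_minutes: float, window_minutes: float) -> float:
    """Return the percentage of the window during which the service was up."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    return 100 * (1 - downtime_minutes / window_minutes)


downtime = 4 * 60 + 19      # 4 hours 19 minutes, from the Pingdom record
october = 31 * 24 * 60      # assumed window: the month of the outage
year = 365 * 24 * 60        # assumed window: a non-leap year

print(f"Over October: {availability_percent(downtime, october):.2f}%")
print(f"Over a year:  {availability_percent(downtime, year):.3f}%")
```

Under those assumptions, a single outage of that length already puts the whole year below 99.99%, let alone 99.9999%.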

Read "Amazon Web Services outage once again shows reality behind 'the cloud'" on Arstechnica.com. An excerpt: "Amazon's Elastic Block Store ("EBS") service, an underpinning component of Amazon's extremely popular Elastic Compute Cloud ("EC2"), experienced a substantial service interruption this afternoon. Amazon EC2 has become such a ubiquitous feature in the cloud computing landscape that it's difficult to throw a rock without hitting a large company with a public Web offering that uses it. So today's service interruption bit deeply: among the sites knocked partially or totally offline were reddit, Imgur, and developer favorite Heroku."

The article also points out a much more serious EC2 outage in April 2011.

References:
http://ya.net/friended-com-outage/
Arstechnica.com

And today, Nov. 13, 2012, Microsoft Office365 was down for more than 5 hours; some organizations reported losing a day of work.

TechCrunch.com didn't post any news about the Office365 outage today, though I can see that they posted "Microsoft brings Internet Explorer 10 preview to Windows 7 PCs". Maybe they are not using Office365. And maybe I should start my own blog dedicated to "Cyber Down". Why not! :)

According to the Microsoft Service Health updates, the outage ran from 12:08 PM to 5:44 PM. Here are the details from the Service Health website.

Nov 13, 2012 5:44 PM Service restored 
Closure Summary: On 11/13/2012 at approximately 5:00 PM UTC, an issue in one of Microsoft's datacenters caused users to encounter errors when accessing their email. The problem affected users in North America and Latin America. Microsoft engineers identified the root cause of the issue and initiated a failover to restore service. A complete post-incident report will be available on the Service Health Dashboard within five business days.

Nov 13, 2012 4:53 PM Restoring service The last few steps to restore the service are still in progress. We will resolve this incident as soon as we have verified availability.

Nov 13, 2012 3:56 PM Restoring service Service restoration in the last two sites is still in progress.

Nov 13, 2012 2:59 PM Restoring service Service should be restored for most customers right now in the impacted forests. We continue to work on restoring service for all customers as quickly as possible.

Nov 13, 2012 2:37 PM Restoring service We are still working to restore service health. We will provide additional information when it becomes available.

Nov 13, 2012 1:42 PM Restoring service We are resolving the service incident and working to restore service health. We will provide additional information when it becomes available.

Nov 13, 2012 1:18 PM Service interruption A few users are unable to access their email at this time.

Nov 13, 2012 12:55 PM Service interruption A few users are unable to access their email at this time.

Nov 13, 2012 12:08 PM Investigating We are investigating a potential issue. At this time we don't have enough information to identify whether this is an actual service incident. We will provide more information shortly.

If you write a blog post about cloud service outages, please share it here.
Thank you.

Comments

  1. Here’s the update from Microsoft on the recent Office365 outages (Nov. 8 and Nov. 13, 2012).
    http://blogs.office.com/b/microsoft_office_365_blog/archive/2012/11/13/update-on-recent-customer-issues.aspx

