Google's Gmail service suffered its second outage this month on Thursday. The problem was unrelated to Tuesday's two-hour outage at Google News, the company said.
A problem with Google Contacts on Thursday caused many Gmail users to experience slow responsiveness and degraded service for about an hour, Google explained in an e-mail.
"Mail was back to full speed for everyone around 8 a.m. Pacific, and the issue affecting contacts was resolved shortly after," a Google spokesperson said. "We're sorry for the inconvenience. As usual, we'll provide an incident report on the Apps Status Dashboard, where we also gave ongoing status updates as this issue progressed."
Great Public Scrutiny
What makes these Gmail issues notable is not just that they occur, but that they occur under great public scrutiny -- a consequence of running in the cloud, noted Google spokesperson Andrew Kovacs.
"Glitches in the cloud don't happen behind closed doors -- you hear about them, whether they're affecting 10 users or 10,000," Kovacs said. "In contrast, when on-premises servers go down -- which they do every day, and more often than the cloud -- the world doesn't know about it."
Google's visibility certainly makes a big difference, noted Gartner Vice President and Distinguished Analyst John Pescatore. However, any service that starts out targeting consumers and later tries to encompass business customers can expect to run into such problems, he said.
"When they start out as consumer services, delivering a quick response may be more important than availability," Pescatore said. "Google has a world-class infrastructure, but clearly has more to do" when it comes to supporting the needs of big businesses.
Google said late last year that its Gmail service had been available more than 99.9 percent of the time -- for everyone, both consumers and business users -- during the previous 12 months. And despite Thursday's service interruption, Kovacs said this figure is still generally accurate.
Not Good Enough
Pescatore said 99.9 percent availability may be fine for universities and some cities, and even equal to what they had when they ran their own systems internally. "But when you look at big enterprises -- the Fortune 500 types -- this is probably not good enough," Pescatore said.
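To put the 99.9 percent figure in perspective, the short Python sketch below converts an availability percentage into the downtime it permits. It is an illustration only: the function name `downtime_budget_hours`, the 365-day year and the 99.99 and 99.999 percent comparison tiers are assumptions made here, and nothing in Google's or Gartner's statements says availability is measured this way.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap, 365-day year


def downtime_budget_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Return the hours of downtime implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # Downtime is simply the complement of availability over the period.
    return (100 - availability_percent) / 100 * period_hours


if __name__ == "__main__":
    # 99.9 is the figure cited in the article; the other tiers are
    # illustrative comparisons, not targets attributed to anyone.
    for percent in (99.9, 99.99, 99.999):
        hours = downtime_budget_hours(percent)
        print(f"{percent}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Under these assumptions, 99.9 percent availability leaves room for roughly 8.76 hours of downtime a year, so an hour-long slowdown like Thursday's would use up a little more than a tenth of that allowance, if degraded service is counted as downtime at all.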
Google attributed its Gmail outage on Sept. 1 to the inadvertent overloading of the company's request routers -- servers that direct web queries to the appropriate Gmail server for a response.
"We've turned our full attention to helping ensure this kind of event doesn't happen again," said Google Vice President Ben Treynor, who is also Gmail's site reliability czar. "Some of the actions are straightforward and are already done -- for example, increasing request router capacity well beyond peak demands to provide headroom."
However, outages can be caused by a number of other factors, Pescatore observed. "Sometimes it's the way that software updates are done"; in other cases "it will come from denial-of-service attacks," Pescatore said.
Gartner tells businesses that they need to know what their own service-level agreement (SLA) is internally, "so when they start looking to move to an external provider, they'll know exactly what they need," Pescatore said.
If Google wants to meet the SLA requirements of large enterprises, it will have to "look at how much investment it needs to make and then see if the additional revenue it expects to generate from large enterprises justifies the expense," he said.