[Infowarrior] - Google Remakes Online Empire With ‘Colossus’

Richard Forno rforno at infowarrior.org
Tue Jul 10 07:46:32 CDT 2012


Google Remakes Online Empire With ‘Colossus’

	• By Cade Metz
	• July 10, 2012, 6:30 am
	• Categories: Data Centers, Database Software, Software

http://www.wired.com/wiredenterprise/2012/07/google-colossus/

More than a decade ago, Google built a new foundation for its search engine. It was called the Google File System — GFS, for short — and it ran across a sweeping army of computer servers, turning an entire data center into something that behaved a lot like a single machine. As Google crawled the world’s webpages, grabbing data for use in its search engine, it could spread this massive collection of data over all those servers, before using the chips on these machines to crunch everything into a single, searchable index.
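
To make that store-then-crunch pattern concrete, here is a toy Python sketch of the idea: pages are spread across machines as chunks, each machine indexes its own share, and the partial results are merged into one searchable index. Every name and data value below is invented for illustration and bears no relation to Google's actual code.

    from collections import defaultdict

    # Pretend each "chunk" is the share of crawled pages one server holds.
    chunks = [
        {"page1": "the quick brown fox", "page2": "the lazy dog"},
        {"page3": "brown dog jumps", "page4": "quick quick fox"},
    ]

    def index_chunk(chunk):
        """Each server builds a partial word -> pages index for its chunk."""
        partial = defaultdict(set)
        for page, text in chunk.items():
            for word in text.split():
                partial[word].add(page)
        return partial

    def merge(partials):
        """Combine the per-server partial indexes into one searchable index."""
        index = defaultdict(set)
        for partial in partials:
            for word, pages in partial.items():
                index[word] |= pages
        return index

    index = merge(index_chunk(c) for c in chunks)
    print(sorted(index["quick"]))  # ['page1', 'page4']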

GFS was so successful, it soon reinvented the rest of the web. After Google released research papers describing GFS and a sister software platform called MapReduce — the piece that crunches the data — Yahoo, Facebook, and others built their own version of the Google foundation. It was called Hadoop, and this open source platform is now driving a revolution across the world of business software as well.

But Google no longer uses GFS. Two years ago, the company moved its search to a new software foundation based on a revamped file system known as Colossus, and Urs Hölzle — the man who oversees Google’s worldwide network of data centers — tells Wired that Colossus now underpins virtually all of Google’s web services, from Gmail, Google Docs, and YouTube to the Google Cloud Storage service the company offers to third-party developers.

Whereas GFS was built for batch operations — i.e., operations that happen in the background before they’re actually applied to a live website — Colossus is specifically built for “realtime” services, where the processing happens almost instantly. In the past, for instance, Google would use GFS and MapReduce to build a new search index every few days and — as the system matured — every few hours. But with Colossus and its new search infrastructure — known as Caffeine — Google needn’t rebuild the index from scratch. It can constantly update the existing index with new information in real time.
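
The difference is easy to see in miniature. In the sketch below, whose function names and data are invented for illustration, the batch style rebuilds the whole index from every page, while the realtime style folds a single new or changed page into the live index as it arrives.

    def batch_rebuild(all_pages):
        """GFS/MapReduce era: discard the old index, rebuild from every page."""
        index = {}
        for page, text in all_pages.items():
            for word in text.split():
                index.setdefault(word, set()).add(page)
        return index

    def incremental_update(index, page, text):
        """Caffeine era: fold one new or changed page into the live index."""
        for word in text.split():
            index.setdefault(word, set()).add(page)

    index = batch_rebuild({"page1": "colossus replaces gfs"})   # every few hours
    incremental_update(index, "page2", "caffeine updates in realtime")  # constantly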

The move to Colossus foretells a similar move across the rest of the web — and beyond — as is so often the case with the hardware and software that underpins Google’s massively popular web services. Because its services are used by so many people — and it’s juggling so much data — Google is often forced to solve very large problems before the rest of the world encounters them, and others then follow. Colossus is already echoed by recent changes to Hadoop, a platform now used by everyone from Facebook to Twitter and eBay.

So that it’s better suited for realtime applications, Colossus eliminates a “single point of failure” that plagued the original Google File System. With GFS, a “master node” — or master server — oversaw data that was spread across an army of “chunkservers.” These chunkservers stored chunks of data, each about 64 megabytes in size. The problem was that if the master node went down, the whole system went down — at least temporarily. Colossus solved this problem by adding multiple master nodes.
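
As a rough illustration of that division of labor, the sketch below hashes a file's path to one of several masters and records which chunkserver holds each chunk. The hashing scheme, server names, and placement policy are assumptions made for the example; the article does not describe how Colossus actually partitions work across its masters.

    import hashlib

    CHUNK_SIZE = 64 * 1024 * 1024                   # GFS-era 64 MB chunks
    MASTERS = ["master-0", "master-1", "master-2"]  # Colossus: more than one
    CHUNKSERVERS = ["cs-0", "cs-1", "cs-2", "cs-3"]

    def master_for(path):
        """Pick the master that owns a file's metadata by hashing its path,
        so no single machine is a point of failure for the whole namespace."""
        h = int(hashlib.md5(path.encode()).hexdigest(), 16)
        return MASTERS[h % len(MASTERS)]

    def plan_chunks(path, size_bytes):
        """The owning master records which chunkserver holds each chunk."""
        n_chunks = -(-size_bytes // CHUNK_SIZE)     # ceiling division
        placement = [(i, CHUNKSERVERS[i % len(CHUNKSERVERS)])
                     for i in range(n_chunks)]
        return {"master": master_for(path), "chunks": placement}

    print(plan_chunks("/crawl/2012-07/pages.dat", 200 * 1024 * 1024))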

“A single point of failure may not have been a disaster for batch-oriented applications,” Googler Sean Quinlan said, just before Colossus was rolled out, “but it was certainly unacceptable for latency-sensitive applications, such as video serving.”

The new file system also reduces the size of the data chunks down to 1MB. Together with the addition of multiple master nodes, this lets Google store far more files across a far larger number of machines.
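
A back-of-envelope calculation shows why the smaller chunks make a single master untenable. The cluster size here is an invented figure, not a published Google number:

    GFS_CHUNK = 64 * 1024 * 1024       # 64 MB
    COLOSSUS_CHUNK = 1 * 1024 * 1024   # 1 MB

    data = 10 * 1024**4                # suppose one cluster holds 10 TB
    print(data // GFS_CHUNK)           # 163,840 chunk records to track
    print(data // COLOSSUS_CHUNK)      # 10,485,760 records -- 64x the metadata,
                                       # more than one master can comfortably hold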

Hölzle calls Colossus “similar to GFS — but done better after ten years of experience.”

With its search engine, Google has not only dropped GFS. It has dropped MapReduce. Rather than using MapReduce to build a new index every so often, it uses a new platform called “Caffeine” that operates more like a database, where you can read and write data whenever you like.

In similar fashion, Hadoop developers are working to eliminate single points of failure and tweak the platform for use with realtime services. A company called MapR has built a new proprietary version of Hadoop that includes an entirely new file system, while others have worked to remove single points of failure in the open source version of the platform. And in much the same way Google uses distributed databases atop Colossus, Hadoop dovetails with a database called HBase that’s better suited to realtime services.
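
For a feel of the random read/write access HBase layers on top of Hadoop's file system, here is a minimal sketch using the happybase Python client. It assumes an HBase Thrift server running on localhost and a table named 'ads' with a 'bid' column family, all of which are invented for the example:

    import happybase

    connection = happybase.Connection("localhost")  # HBase Thrift endpoint
    table = connection.table("ads")

    # Write a single cell the moment the data arrives...
    table.put(b"impression-42", {b"bid:price": b"0.37"})

    # ...and read it straight back -- no batch job required.
    print(table.row(b"impression-42"))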

Jan Gelin — vice president of technical operations for the Rubicon Project, a realtime trading platform for online ads — recently moved his service to MapR’s proprietary version of Hadoop in part because it eliminated a “single point of failure” that plagued earlier open source versions of the platform. As with GFS, the original incarnation of Hadoop used a single machine — known as the name node — to oversee all other servers in a cluster, and if that one machine went down, the entire process would stop.

“We had a lot of those issues,” Gelin says. “We have roughly a petabyte of data inside of Hadoop, and it was always nerve-wracking when the name node didn’t checkpoint and you’re wondering if you’re going to lose all your storage or all the pointers to where your data is.

“That’s OK if you’re doing research stuff, but if you’re depending on your data in the way we’re going to be now, it’s not.”

During a recent event in Silicon Valley, Mike Olson — the CEO of Cloudera, another Hadoop outfit — said that this problem has also been fixed in the open source version of Hadoop.

Though Google has not open sourced the code behind Colossus, outside developers can still make use of the file system. As Hölzle points out, Colossus underpins Google Cloud Storage, the online storage service Google offers to developers across the globe in much the same way Amazon offers its S3 storage service.
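
For developers, reading and writing through that Colossus-backed service looks like any object store. The sketch below uses the current google-cloud-storage Python client, which postdates this article; the bucket and object names are invented, and it assumes default credentials are already configured:

    from google.cloud import storage

    client = storage.Client()                    # uses ambient credentials
    bucket = client.bucket("example-bucket")     # hypothetical bucket
    blob = bucket.blob("reports/2012-07.txt")

    blob.upload_from_string("hello, colossus")   # write an object
    print(blob.download_as_bytes())              # read it back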

---
Just because I'm near the punchbowl doesn't mean I'm also drinking from it.


