Showing posts with label distributed computing. Show all posts

14 November 2007

Yahoo! Goes Whoop! About Hadoop! (and Pig!)

Now why on earth would Yahoo be doing this?

Yahoo! Inc., a leading global Internet company, today announced that it will be the first in the industry to launch an open source program aimed at advancing the research and development of systems software for distributed computing. Yahoo!'s program is intended to leverage its leadership in Hadoop, an open source distributed computing sub-project of the Apache Software Foundation, to enable researchers to modify and evaluate the systems software running on a 4,000 processor supercomputer provided by Yahoo!. Unlike other companies and traditional supercomputing centers, which focus on providing users with computers for running applications and for coursework, Yahoo!'s program focuses on pushing the boundaries of large-scale systems software research.

Currently, academic researchers lack the hardware and software infrastructure to support Internet-scale systems software research. To date, Yahoo! has been the primary contributor to Hadoop, an open source distributed file system and parallel execution environment that enables its users to process massive amounts of data. Hadoop has been adopted by many groups and is the software of choice for supporting university coursework in Internet-scale computing. Researchers have been eager to collaborate with Yahoo! and tap the company's technical leadership in Hadoop-related systems software research and development.

As a key part of the program, Yahoo! intends to make Hadoop available in a supercomputing-class data center to the academic community for systems software research. Called the M45, Yahoo!'s supercomputing cluster, named after one of the best known open star clusters, has approximately 4,000 processors, three terabytes of memory, 1.5 petabytes of disks, and a peak performance of more than 27 trillion calculations per second (27 teraflops), placing it among the top 50 fastest supercomputers in the world.

M45 is expected to run the latest version of Hadoop and other state-of-the-art, Yahoo!-supported, open-source distributed computing software such as the Pig parallel programming language developed by Yahoo! Research, the central advanced research organization of Yahoo! Inc.

It's cool that Yahoo's backing the open source Hadoop, and doubly cool that one of the projects is called Pig. But it's also shrewd. It's becoming abundantly clear that open beats closed; Google, for all its use of open source software, is remarkably closed at its core. Enter Hadoop, running on a 4,000 processor supercomputer provided by Yahoo, with the real possibility of spawning a truly open rival to Google.... (Via Matt Asay.)

27 May 2006

How to Save the Commons: Compute

There aren't many commons bigger than the atmosphere, nor any whose existence in something near its present state is so critical to our own survival. But in the face of the indisputable scientific consensus that global warming is taking place, it is hard to know what to do.

Well, short of rugby-tackling your elected representatives to the ground and refusing to let go until they do something about the climate crisis, you might at least join this project. It's pretty standard distributed computing stuff: your PC (Windows only, alas) does calculations in the background during idle time, and contributes its bit(s) to the greater whole - in this case making more accurate predictions about climate change.

It hardly requires much commitment from you, just a quick download, plus some electricity (pity that the latter will make global warming worse). In fact, it's worth taking part just to get the ultra-cool screen-saver, which shows your model - your earth - and its climate, evolving before your very eyes.