
08 January 2007

Google Reaches for the Stars

One of the most important shifts in science at the moment is towards dealing with the digital deluge. Whether in the field of genomics, particle physics or astronomy, science is starting to produce data not just in gigabytes, or even terabytes, but in petabytes, exabytes and beyond (zettabytes, yottabytes and so on).

Take the Large Synoptic Survey Telescope, for starters:

The Large Synoptic Survey Telescope (LSST) is a proposed ground-based 8.4-meter, 10 square-degree-field telescope that will provide digital imaging of faint astronomical objects across the entire sky, night after night. In a relentless campaign of 15 second exposures, LSST will cover the available sky every three nights, opening a movie-like window on objects that change or move on rapid timescales: exploding supernovae, potentially hazardous near-Earth asteroids, and distant Kuiper Belt Objects. The superb images from the LSST will also be used to trace billions of remote galaxies and measure the distortions in their shapes produced by lumps of Dark Matter, providing multiple tests of the mysterious Dark Energy.

How much data?

Over 30 thousand gigabytes (30TB) of images will be generated every night during the decade-long LSST sky survey.

Or for those of you without calculators, that's 10 × 365 × 30 × 1,000,000,000,000 bytes, roughly 100 petabytes (109.5 PB, to be precise; see the quick sanity check below).
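As a quick sanity check, here is a minimal Python sketch of that multiplication. The inputs (30 TB per night, a decade of 365 nights) come straight from the quote above; decimal units (1 TB = 10^12 bytes) are an assumption on my part.

```python
# Back-of-the-envelope total for the LSST survey, using the
# figures quoted above: 30 TB of images per night, for ten years.
TB = 10**12                  # one terabyte, in decimal bytes (assumed)
PB = 10**15                  # one petabyte
nightly_volume = 30 * TB     # ~30 TB of images per night
nights = 10 * 365            # a decade-long survey

total = nightly_volume * nights
print(f"{total:.3e} bytes = {total / PB:.1f} PB")  # 1.095e+17 bytes = 109.5 PB
```

And where there's data, there's also information; and where there's information...there's Google: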

Google has joined a group of nineteen universities and national labs that are building the Large Synoptic Survey Telescope (LSST).

...

"Partnering with Google will significantly enhance our ability to convert LSST data to knowledge," said University of California, Davis, Professor and LSST Director J. Anthony Tyson. "LSST will change the way we observe the universe by mapping the visible sky deeply, rapidly, and continuously. It will open entirely new windows on our universe, yielding discoveries in a variety of areas of astronomy and fundamental physics. Innovations in data management will play a central role."

(Via C|net.)

13 December 2005

Driving Hard

Hard discs are the real engines of the computer revolution. More than rising processing speeds, it is constantly expanding hard disc capacity that has made most of the exciting recent developments possible.

This is most obvious in the case of Google, which now not only searches most of the Web and stores its (presumably vast) index on cheap hard discs, but also offers a couple of gigabytes of storage to everyone who uses - or will use - its Gmail. Greatly increased storage has also driven the MP3 revolution. The cheap availability of gigabytes of storage means that people can - and so do - store thousands of songs, and now routinely expect to have every song they want on tap, instantly.

Yet another milestone was reached recently, when even the terabyte (1,000 gigabytes) became a relatively cheap option. For most of us mere mortals, it is hard to grasp what this kind of storage will mean in practice. One person who has spent a lot of time thinking hard about such large-scale storage and what it means is Jim Gray, whom I had the pleasure of interviewing last year.

On his Web site (at Microsoft Research), he links to a fascinating paper by Michael Lesk that asks the question "How much information is there in the world?" (There is also a more up-to-date version available.) It is clear from the general estimates that we are fast approaching the day when it will be possible to have just about every piece of data (text, audio, video) that relates to us throughout our lives and to our immediate (and maybe not-so-immediate) world, all stored, indexed and cross-referenced on a hard disc somewhere.

Google and the other search engines already give us a glimpse of this "Information At Your Fingertips" (now where did I hear that phrase before?), but such all-encompassing exabytes (1,000,000 terabytes) go well beyond this.
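To keep the units straight, here is a minimal sketch of the decimal storage ladder these posts keep climbing. The prefixes and factors are standard SI usage, not anything taken from the sources above.

```python
# The decimal storage ladder, one factor of 1,000 per rung.
prefixes = ["giga", "tera", "peta", "exa", "zetta", "yotta"]
for i, name in enumerate(prefixes):
    print(f"1 {name}byte = 10^{9 + 3 * i} bytes")

# So an exabyte really is 1,000,000 terabytes:
print(f"{10**18 // 10**12:,} terabytes per exabyte")
```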

What is interesting is how intimately this scaling process is related to the opening up of data. In fact, this kind of super-scaling, which takes us to realms several orders of magnitude beyond even the largest proprietary holdings of information, only makes sense if data is freely available for cross-referencing (something that cannot happen if there are isolated bastions of information, each with its own gatekeeper).

Once again, technological developments that have been in train for decades are pushing us inexorably towards an open future - whatever the current information monopolists might want or do.