Saturday, November 8, 2008

Open Cloud Computing

Cloud-computing platforms such as Amazon's Elastic Compute Cloud (EC2), Microsoft's Azure Services Platform, and Google App Engine have given many businesses flexible access to computing resources, ushering in an era in which, among other things, startups can operate with much lower infrastructure costs. Instead of having to buy or rent hardware, users can pay for only the processing power that they actually use and are free to use more or less as their needs change.

However, relying on cloud computing comes with drawbacks, including privacy, security, and reliability concerns. So there is now growing interest in open-source cloud-computing tools, for which the source code is freely available. These tools could let companies build and customize their own computing clouds to work alongside more powerful commercial solutions.

One open-source software-infrastructure project, called Eucalyptus, imitates the experience of using EC2 but lets users run programs on their own resources and provides a detailed view of what would otherwise be the black box of cloud-computing services.
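
To make that compatibility concrete, here is a minimal sketch of what "looks like EC2" means for client code, using the boto3 Python library as a stand-in client. The endpoint URL, credentials, and image ID are hypothetical placeholders for a private Eucalyptus installation, and the URL scheme is an assumption.

    import boto3

    # The same EC2 client code can target either Amazon or a private,
    # EC2-compatible cloud; only the endpoint and credentials change.
    # The endpoint below is a hypothetical local Eucalyptus front end.
    ec2 = boto3.client(
        "ec2",
        endpoint_url="http://cloud.example.internal:8773/services/Eucalyptus",
        region_name="local",
        aws_access_key_id="LOCAL_KEY",
        aws_secret_access_key="LOCAL_SECRET",
    )

    # Launch one small instance from a locally registered machine image.
    response = ec2.run_instances(
        ImageId="emi-12345678",  # placeholder image ID
        InstanceType="m1.small",
        MinCount=1,
        MaxCount=1,
    )
    print(response["Instances"][0]["InstanceId"])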

Another open-source cloud-computing project is the University of Chicago's Globus Nimbus, which is widely recognized as having pioneered the field. And a European cloud-computing initiative coordinated by IBM, called RESERVOIR, features several open-source components, including OpenNebula, a tool for managing the virtual machines within a cloud. Even some companies, such as Enomaly and 10gen, are developing open-source cloud-computing tools.

Rich Wolski, a professor in the computer-science department at the University of California, Santa Barbara, who directs the Eucalyptus project, says that his focus is on developing a platform that is easy to use, maintain, and modify. "We actually started from first principles to build something that looks like a cloud," he says. "As a result, we believe that our thing is more malleable. We can modify it, we can see inside it, we can install it and maintain it in a cloud environment in a more natural way."

Reuven Cohen, founder and chief technologist of Enomaly, explains that an open-source cloud provides useful flexibility for academics and large companies. For example, he says, a company might want to run most of its computing in a commercial cloud such as that provided by Amazon but use the same software to process sensitive data on its own machines, for added security. Alternatively, a user might want to run software on his or her own resources most of the time, but have the option to expand to a commercial service in times of high demand. In both cases, an open-source cloud-computing interface can offer that flexibility, serving as a complement to the commercial service rather than a replacement.
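
A minimal sketch of the "burst" policy Cohen describes, assuming the hypothetical private endpoint above and a simple count of free local slots as the capacity signal:

    import boto3

    # Hypothetical private, EC2-compatible endpoint (e.g., a local Eucalyptus cloud).
    PRIVATE_ENDPOINT = "http://cloud.example.internal:8773/services/Eucalyptus"

    def launch(image_id, instance_type, local_slots_free):
        """Run on private resources when capacity allows; otherwise burst to EC2."""
        if local_slots_free > 0:
            # Sensitive or routine jobs stay on machines we control.
            client = boto3.client("ec2", endpoint_url=PRIVATE_ENDPOINT)
        else:
            # Demand exceeds local capacity: overflow to the commercial cloud.
            client = boto3.client("ec2", region_name="us-east-1")
        return client.run_instances(
            ImageId=image_id, InstanceType=instance_type, MinCount=1, MaxCount=1
        )

Because the interface is identical on both sides, where a job runs becomes a one-line policy decision rather than a porting effort.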

Indeed, Wolski says that Eucalyptus isn't meant to be an EC2 killer (for one thing, it's not designed to scale to the same size). However, he believes that the project can make a productive contribution by offering a simple way to customize programs for use in the cloud. Wolski says that it's easier to assess a program's performance when it's possible to see how it operates both at the interface and from within a cloud.

Wolski says that Eucalyptus will also imitate Amazon's popular Simple Storage Service (S3), which allows users to access storage space on demand, as well as its Elastic IP addresses, which keep the address of a Web resource the same even if its physical location changes.
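
On the storage side, S3 compatibility means ordinary object put/get calls should work unchanged against a private endpoint. The sketch below assumes a hypothetical endpoint for Walrus, Eucalyptus's S3-style storage service; the bucket, key, and credentials are placeholders.

    import boto3

    # Hypothetical S3-compatible endpoint exposed by a private cloud.
    s3 = boto3.client(
        "s3",
        endpoint_url="http://storage.example.internal:8773/services/Walrus",
        aws_access_key_id="LOCAL_KEY",
        aws_secret_access_key="LOCAL_SECRET",
    )

    # Store and retrieve an object exactly as one would against Amazon S3.
    s3.create_bucket(Bucket="experiment-data")
    s3.put_object(Bucket="experiment-data", Key="run-001/results.csv",
                  Body=b"trial,latency_ms\n1,42\n")
    obj = s3.get_object(Bucket="experiment-data", Key="run-001/results.csv")
    print(obj["Body"].read().decode())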

Ignacio Llorente, a professor in the distributed systems architecture group at the Universidad Complutense de Madrid, in Spain, who works on OpenNebula, says that Eucalyptus's main advantage is that it uses the popular EC2 interface. However, he adds that "the open-source interface is only one part of the solution. Their back-end [the system's internal management of physical resources and virtual machines] is too basic. A complete cloud solution requires other components." Llorente says that Eucalyptus is just one example of a growing ecosystem of open-source cloud-computing components.

Wolski expects many of Eucalyptus's users to be academics interested in studying cloud-computing infrastructure. Although he doubts that such a platform would be used as a distributed system for ordinary computer users, he doesn't discount the possibility. "You can argue it both ways," he notes. But Wolski says that he thinks some open-source cloud-computing tool will become important in the future. "If it's not Eucalyptus, I suspect [it will be] something else," he says. "There will be an open-source thing that everyone gets excited about and runs in their environment."

Cracking The Physical Internet

For decades, the physical Internet has been in a state of suspended animation. It was designed in the 1960s to transmit files and e-mail, and even the advent of YouTube, Internet phone calls, streaming music, and networked video games has done little to change it. In part, that's because the only network big enough to provide a test bed for new hardware tricks is the Internet itself; in part, it's because the routers and switches that make up the Internet are closed technologies, sold by a handful of companies.

A project led by Nick McKeown of Stanford University, however, has begun to open up some of the most commonly used network hardware, from companies such as HP, Cisco, NEC, and Juniper. Allowing researchers to fiddle with Internet hardware, McKeown says, will make the Internet more secure, more reliable, more energy efficient, and more pervasive.

"In the last 10 years, there's been no transfer of ideas into the [Internet] infrastructure," says McKeown, a professor of electrical engineering and computer science. "What we're trying to do is enable thousands of graduate students to demonstrate ideas at scale. That could lead to a faster rate of innovation, and ultimately these ideas can be incorporated into products."

Under the auspices of a project called OpenFlow, McKeown's team has secured permission from equipment vendors to write a small amount of code that, essentially, grants access to a critical part of a network router or switch called a flow table. When a packet--a chunk of data--arrives at a switch, for instance, software in the switch looks up instructions in the flow table to decide where to send the packet.

"What OpenFlow does is give you direct access to the flow table, to add and delete instructions," says McKeown. "It's a completely brain-dead idea." But it hasn't been implemented before because the assumption was that vendors wouldn't open up their hardware. "We figured out that there was a minimum amount of access to the flow table that network vendors were okay with allowing that was still extremely useful to us for testing out our ideas," McKeown says.

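To make the flow-table idea concrete, here is a toy match-action table in Python. It mirrors the lookup-and-modify model McKeown describes, but it is a pedagogical sketch, not the OpenFlow wire protocol or any vendor's actual API.

    # Toy flow table: an ordered list of (match_fields, action) rules.
    FLOW_TABLE = []

    def add_flow(match_fields, action):
        """Install a forwarding rule, e.g. add_flow({"dst_ip": "10.0.0.7"}, "forward:port2")."""
        FLOW_TABLE.append((match_fields, action))

    def delete_flow(match_fields):
        """Remove any rule whose match fields equal the given ones."""
        FLOW_TABLE[:] = [(m, a) for (m, a) in FLOW_TABLE if m != match_fields]

    def lookup(packet_headers):
        """Return the action of the first rule whose fields all match the packet."""
        for match_fields, action in FLOW_TABLE:
            if all(packet_headers.get(k) == v for k, v in match_fields.items()):
                return action
        return "send_to_controller"  # table miss: punt to an external controller

    # Steer one destination out port 2; everything else goes to the controller.
    add_flow({"dst_ip": "10.0.0.7"}, "forward:port2")
    print(lookup({"dst_ip": "10.0.0.7"}))  # -> forward:port2
    print(lookup({"dst_ip": "10.0.0.9"}))  # -> send_to_controller
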
At a recent demonstration, McKeown and his team showed off their ability to control the traffic in a network via a simple cartoonlike interface on a PC. One test was designed to let people play a first-person-shooter video game on laptops while moving between wireless access points, without losing any data or experiencing lag. (First-person-shooter games are commonly used in network tests because they are resource intensive, and if the network fails, it's immediately obvious.) In the demonstration, the researchers instructed a server on Stanford's network to find the most efficient connection to the device at any given moment. "It's a good idea for a game, but today you can't do that because you can't control the routing," McKeown says.

In another demonstration, the researchers showed that OpenFlow can enable direct manual control of network traffic: using a mouse cursor, researchers rerouted data traffic from Stanford to a network in Japan. "The goal is not to show that you are controlling your network from a mouse, but that you now have control," McKeown says. "It's not left up to whatever the box vendor decides . . . This infrastructure that's been held close is being opened and democratized."

OpenFlow is creating an entirely new field of research, with benefits that the average person could enjoy within the next couple of years. "This could take over the Internet," says Rick McGeer, a researcher at HP Labs who's working on projects similar to McKeown's. "This actually looks like an elegant, efficient solution that we can use to take all of these ideas that we've been exploring for the past five years and start implementing them, and start putting them in the network."

There could, however, still be some challenges ahead, McGeer warns. First, he says, vendors would need to continue to support the project as it moves out of the lab and onto the live Internet. Second, companies that provide Internet service would need to see the benefits of opening up their networks. "If I had to guess what would happen first," McGeer says, "Comcast might want to offer multicast trees [a way to distribute the burden of data-intensive Web functions] for efficient YouTube videos, and they'll start to put that in for their services."

McKeown sees the potential to completely open up the airwaves, allowing portable devices to access any wireless network that they can detect. In a city, for instance, a Wi-Fi-enabled cell phone can probably recognize dozens of networks, McKeown says--from Wi-Fi access points to the cell networks maintained by different carriers. But if a user needs more bandwidth for a download, or a stronger signal for a clearer call, or if she moves out of range of a wireless transmitter, switching to another network is difficult, if not impossible. "Our goal is seamless mobility," McKeown says. "We'd love to come up with a way to re-architect cellular wireless networks. But that's further out. We're talking 10 years."