Recent Articles

HotCloud 2009: The Case for Enterprise-Ready Virtual Private Clouds

The work I presented at HotCloud was about what enterprise customers need from cloud computing platforms, and how we can build enterprise clouds that are more secure, transparent, and flexible.

You can find a copy of our paper here, and my slides here. Or you can get a summary of our ideas below.

Here are the three key features we feel are lacking from existing cloud platforms:

Security: Enterprises need strong guarantees about the isolation of both the computation and network resources they get from the cloud. Existing systems rely on firewall rules that must be configured on a per-VM basis. While firewalls are a very powerful form of access control, they are extremely fine-grained and need to be carefully configured. This is especially a problem in highly dynamic (i.e., cloud) environments where new VMs are often being created or moved between servers.

Transparency: Another problem with cloud computing is that the resources it gives you are completely separated from the systems an enterprise is already running within its own data centers. This makes it difficult to deploy applications, since your cloud resources do not appear to be seamlessly connected to the existing LANs within the enterprise.

Resource Flexibility: There are two issues here. First, existing cloud platforms grant users very limited control over the network resources connected to their VMs. This means, for example, that it is impossible to reserve a high-bandwidth link between a pair of VMs, let alone between a VM and the enterprise site that is going to be accessing it. Second, cloud platforms are not as flexible as they should be: if you replicate a VM to increase the processing power of an application, you need to deal with these security and transparency issues all over again.

To help provide these three features, we propose the idea of a Virtual Private Cloud, which uses VPNs to securely connect groups of VMs within a cloud data center back to the enterprise sites that will use them. With a VPN, the cloud resources are only accessible by other members of the same VPN. This is a much coarser-grained access control mechanism than firewalls, but it is much cleaner. We use MPLS-based VPNs, which have two benefits: they scale well for enterprises that may run many hundreds or thousands of VMs, and they require no end-host configuration on the VMs, since the VPN is set up entirely at the routers at the cloud and enterprise sites. Finally, there is the option of using layer 2 VPNs (a Virtual Private LAN Service) to bridge the cloud data center and enterprise networks, giving the abstraction that cloud resources are seamlessly connected to the enterprise’s own LAN.

We are building a system that implements this design, and are exploring how it can be used to simplify VM migration over the WAN and to provide high availability services capable of seamlessly failing an application over from one cloud data center to another.

Quick Tips: Adding a Fancy Header in LaTeX

I’ve recently been pumping out a lot of technical report versions of my papers to add to our department’s library. Here is some code I copied from a former student in my lab to produce fancy headers at the top of each page in a LaTeX document. This would probably work well for adding copyright notices as well.


\usepackage{fancyhdr}
\pagestyle{fancy}
\fancyhf{} % clear the default header and footer fields
% Header for pages that use the "plain" style, such as the title page
\fancypagestyle{plain}{%
  \fancyhead[L]{University of XXX, Technical Report 2009-YY}
  \fancyhead[R]{\thepage}
}
% Header for the remaining pages in the document
\fancyhead[L]{University of XXX, Technical Report 2009-YY}
\fancyhead[R]{\thepage}

You can replace the “University of XXX” bit with whatever you want to appear at the top of each page. The first declaration defines the header for the document’s title page, and the second is used for all remaining pages. More details on the fancyhdr package are here.
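For the copyright notice idea mentioned above, one option is to put the notice in the footer. This is only a minimal sketch, assuming the standard article class. The footer wording, the 14pt header height, and the footer rule are illustrative choices of mine, not part of the original snippet.

```latex
\documentclass{article}
\usepackage{fancyhdr}
\setlength{\headheight}{14pt}          % room for the header line (assumed value)
\pagestyle{fancy}
\fancyhf{}                             % clear all header and footer fields
\fancyhead[L]{University of XXX, Technical Report 2009-YY}
\fancyhead[R]{\thepage}
\fancyfoot[C]{\small Copyright notice text goes here}  % placeholder wording
\renewcommand{\footrulewidth}{0.4pt}   % thin rule above the footer
% Pages using the "plain" style (e.g. the title page) get the same layout
\fancypagestyle{plain}{%
  \fancyhf{}%
  \fancyhead[L]{University of XXX, Technical Report 2009-YY}%
  \fancyhead[R]{\thepage}%
  \fancyfoot[C]{\small Copyright notice text goes here}%
}
\begin{document}
\title{Example Report}
\author{A. Author}
\date{2009}
\maketitle
Body text of the report.
\end{document}
```

If you only want the notice on the first page, you could leave the \fancyfoot line out of the main definitions and keep it only inside the plain style.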

Quick Tips: LaTeX QED symbol

Some LaTeX document styles include a definition for a QED symbol (typically a box), but others do not. If you are using a style that doesn’t come with one, you can quickly add your own definition. Just add:

\newcommand{\qed}{\hfill \mbox{\raggedright \rule{.07in}{.1in}}}

to the preamble of your document, and then you can use \qed to make the symbol wherever you need it.
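Here is a small usage sketch, assuming the article class. Note that it swaps \newcommand for \providecommand, which is my own substitution: \newcommand raises an error if the class or a loaded package already defines \qed, while \providecommand only defines it when it is missing. The claim and proof text are just filler.

```latex
\documentclass{article}
% \providecommand defines \qed only if nothing else has defined it already
\providecommand{\qed}{\hfill \mbox{\raggedright \rule{.07in}{.1in}}}
\begin{document}
\textbf{Claim.} The sum of two even integers is even.

\textbf{Proof.} Let $a = 2m$ and $b = 2n$ for integers $m$ and $n$.
Then $a + b = 2(m + n)$, which is even. \qed
\end{document}
```

The \hfill pushes the box to the right margin of the last line of the proof.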

Speeding Up Migration with Page Sharing

Update: Looking at this now, I’ve become fully convinced that it is a good idea. You need to be a little careful that your pages match up at each end, but as long as you keep an intelligent cache at each end, you should be able to significantly reduce the amount of migration traffic due to duplicate memory contents.

Kevin Lawton (author of the old-school virtualization tool Bochs) recently wrote an article (and follow-up) on how you could speed up VM migration in a data center by exploiting things like page duplication between the source and destination machines. The idea is that a lot of VMs have common memory pages that you wouldn’t actually need to copy over. He references some of my own work that looked at the amount of sharing that actually occurs between VMs. It is the first (but hopefully not last) time I’ve ever seen a link to one of my papers in a random blog post in my RSS feed!

The only problem I see has to do with detecting when pages are truly identical. When you do page sharing between VMs on a single host, you detect the similarity by producing a short (32- or 64-bit) hash for each page in memory. If you scan two pages and they produce the same hash, then those pages are very likely identical. I say “very likely” because it could just be a hash collision, and you need to actually compare all the bits in each page before you can truly know that the two pages are identical.

When you do this on a single machine, it’s not too big a deal to scan two pages and compare their bits, but if you are trying to verify that pages on the source and destination of a migration are identical, it is a big problem. Obviously you can’t just copy the page over the network to do the comparison, since that is what you were trying to avoid in the first place. I guess the only solution is to use a longer hash value (thus reducing the chance of collisions) and really hope that you don’t have a malicious VM at the destination that is trying to corrupt your memory by purposefully creating memory pages that will collide with your content.
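To get a rough feel for how much a longer hash helps, here is a back-of-the-envelope estimate. It assumes the hash behaves like a uniformly random b-bit function and that all n pages are distinct. The example size (n = 2^20 pages, i.e. 4 GiB of memory in 4 KiB pages) is a number I picked for illustration, not a measurement. It only covers accidental collisions: a malicious VM crafting pages on purpose is a question of whether the hash is collision resistant, not just of how long it is.

```latex
% Requires amsmath and amssymb. n distinct pages, b-bit hash assumed uniform.
\[
  \Pr[\text{some pair collides}]
  \;\le\; \mathbb{E}[\text{number of colliding pairs}]
  \;=\; \binom{n}{2}\, 2^{-b}
  \;\approx\; \frac{n^{2}}{2^{\,b+1}}
\]
% Example with n = 2^{20} pages (an assumed VM size)
\[
  b = 32:\ \frac{2^{40}}{2^{33}} = 2^{7} = 128,
  \qquad
  b = 64:\ \frac{2^{40}}{2^{65}} = 2^{-25} \approx 3 \times 10^{-8},
  \qquad
  b = 160:\ \frac{2^{40}}{2^{161}} = 2^{-121}
\]
```

Under these assumptions a 32-bit hash is expected to produce on the order of a hundred false matches across that many pages, so it is only usable when a full comparison follows. At 64 bits an accidental collision is already unlikely, and at 160 bits it is negligible.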

I think it is a neat idea that would generally work in practice, but you will need a pretty smart cache at each end to make sure you’re really keeping the pages consistent.

Another idea would be to use a “post-copy” approach that tries to get the VM started on the destination machine as quickly as possible, deferring the copying of most memory pages until after it has already started. You might be able to use this to quickly unload a host that is approaching its overload limit, although the migrated VMs may see a larger performance penalty, since pages that have not yet been copied must be fetched over the network when they are first accessed.

The End of Desktops (and all your applications)

I have to agree with Google’s Eric Schmidt about the importance of cloud services to the future of everyday computing. Desktop-style applications that run entirely on your own computer don’t have much life left because 1) people will have too many different devices, so keeping them all synced with local storage is a pain, 2) it’s easier for application developers to maintain a single online version of an app instead of dealing with pushing out updates and bug fixes to users, 3) forcing users to go online to get an app prevents piracy, 4) applications can get as much or as little computation and storage power as they need from the cloud, and so on. Sadly, I think it might be #3 that is the real motivation in the end for many companies.

Cloud-based services are also better for the environment. If the average everyday computer can be reduced to a basic thin client for accessing remote cloud services, that reduces the cost and energy usage of home devices. The applications running in the cloud can in turn exploit massive degrees of multiplexing to reduce their own energy costs. As it stands today, most people’s computers are far more powerful than they really need to be, and they spend a lot of time sitting around idle. You might as well give consumers devices that are as simple as possible. Make them so simple that they don’t crash all the time. Make it so they don’t require technically adept family members to service them a few times a year. Then people will spend less time being frustrated by their computers, and more time using the applications running on them.

UMass Thesis Proposal Writing

I’ve started writing my Ph.D. proposal recently. It’s pretty exciting because it makes me realize that I actually have accomplished quite a bit since coming here, which is a good thing, since it has been four years already! I also enjoy writing the proposal because you just need to cover the high-level purpose of everything and can ignore all the gritty details. I’m pretty happy with the big picture I’m getting so far.

For future reference, you can get the LaTeX style file and template for the proposal/thesis from this site.

Clean Slate

I started writing on this blog about a year ago, but never got anywhere with it.

Now I am giving it the old “reboot” and hopefully this time around I’ll turn it into something more useful.