<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>TW - Virtualization, Research, Grad School &#187; Virtualization</title>
	<atom:link href="http://www.tim-wood.net/research/category/virtualization/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.tim-wood.net/research</link>
	<description></description>
	<lastBuildDate>Mon, 10 May 2010 19:27:20 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Disasters &amp; Disaster Recovery in the Cloud</title>
		<link>http://www.tim-wood.net/research/2010/05/disasters-disaster-recovery-in-the-cloud/</link>
		<comments>http://www.tim-wood.net/research/2010/05/disasters-disaster-recovery-in-the-cloud/#comments</comments>
		<pubDate>Mon, 10 May 2010 19:23:51 +0000</pubDate>
		<dc:creator>twood</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Fault Tolerance]]></category>
		<category><![CDATA[Papers]]></category>

		<guid isPermaLink="false">http://www.tim-wood.net/research/?p=210</guid>
		<description><![CDATA[This past weekend, Amazon EC2 experienced a power outage that brought down servers for about seven hours.  Amazon has experienced a number of outages over the last few years&#8211;not surprising given the size of their operations.  However, this makes it clear how important disaster recovery and high availability will be as more services are deployed [...]]]></description>
			<content:encoded><![CDATA[<p>This past weekend, Amazon EC2 experienced a power outage that <a href="http://www.datacenterknowledge.com/archives/2010/05/10/amazon-addresses-ec2-power-outages/">brought down servers for about seven hours</a>.  Amazon has <a href="http://blog.centripetalsoftware.com/2009/12/amazon-ec2-downtime-reminds-us-of-need.html">experienced</a> a <a href="http://techcrunch.com/2008/04/07/amazon-web-services-gets-another-hiccup/">number</a> of <a href="http://techcrunch.com/2008/02/15/amazon-web-services-goes-down-takes-many-startup-sites-with-it/">outages</a> over the last few years&#8211;not surprising given the size of their operations.  However, this makes it clear how important disaster recovery and high availability will be as more services are deployed into the cloud, and also suggests that achieving the highest level of reliability may require utilizing redundant services from multiple cloud providers.</p>
<p>This is something I&#8217;ve been thinking about quite a bit lately, and in fact just a few days ago I was happy to learn that our paper, <strong><a href="http://www.cs.umass.edu/~twood/pubs/dr-cloud.pdf">Disaster Recovery as a Cloud Service: Economic Benefits &amp; Deployment Challenges</a></strong>, has been accepted into this year&#8217;s <a href="http://www.usenix.org/events/hotcloud10">Workshop on Hot Topics in Cloud Computing (HotCloud 2010)</a>.  In our paper, we survey why we think cloud computing platforms are going to become increasingly popular for providing cheap disaster recovery services.</p>
<p>Clouds can be used to provide a variety of backup mechanisms, ranging from cold replicas that are periodically synchronized, up to hot standbys that are always in sync and can take over as soon as a failure is detected.  In practice, we think that a middle class of warm replicas is where the cloud can provide the greatest benefit.  A warm replica could be implemented as an EC2 VM that is not always running, but whose disk (an EBS volume) is kept regularly up to date by a replication manager VM.  This replication manager can handle synchronizing the disk state for a large number of applications, but the customer will not have to pay for the active VM costs of those applications until a failure actually occurs and the VMs are booted up.</p>
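<p>To make the warm-replica trade-off concrete, here is a minimal cost sketch. It is not the cost analysis from the paper: the function names, the 30-day month, and every rate passed in are hypothetical placeholders, chosen only to show which terms appear in each case.</p>

```python
HOURS_PER_MONTH = 720  # assumption: a 30-day month


def hot_standby_cost(vm_hourly, storage_monthly):
    # A hot standby keeps its VM running all month.
    return vm_hourly * HOURS_PER_MONTH + storage_monthly


def warm_replica_cost(vm_hourly, storage_monthly, manager_hourly,
                      apps_per_manager, failover_hours):
    # The replication manager VM runs all month, but its cost is
    # split across every application it keeps in sync.
    manager_share = manager_hourly * HOURS_PER_MONTH / apps_per_manager
    # The replica VM is only paid for while it is booted after a failure.
    return storage_monthly + manager_share + vm_hourly * failover_hours


# Placeholder rates, not real EC2 or EBS prices.
hot = hot_standby_cost(vm_hourly=0.10, storage_monthly=10.0)
warm = warm_replica_cost(vm_hourly=0.10, storage_monthly=10.0,
                         manager_hourly=0.10, apps_per_manager=10,
                         failover_hours=0)
print("hot standby:", hot, "warm replica:", warm)
```

<p>In this model the warm replica wins whenever the manager share plus the failover hours cost less than a full month of VM time, which is the intuition behind sharing one replication manager across many applications.</p>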
<p><a href="http://www.cs.umass.edu/~twood/pubs/dr-cloud.pdf">Check the paper</a> for all the details, including a cost analysis of providing DR for various application types.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tim-wood.net/research/2010/05/disasters-disaster-recovery-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Improving Data Center Resource Management, Deployment, and Availability with Virtualization</title>
		<link>http://www.tim-wood.net/research/2009/07/proposal/</link>
		<comments>http://www.tim-wood.net/research/2009/07/proposal/#comments</comments>
		<pubDate>Fri, 24 Jul 2009 15:24:41 +0000</pubDate>
		<dc:creator>twood</dc:creator>
				<category><![CDATA[Grad School]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://www.tim-wood.net/research/?p=145</guid>
		<description><![CDATA[That&#8217;s the title of my thesis proposal, which attempts to cram all the work I&#8217;ve done over the past four years into just a few words. In the end, I&#8217;m pretty happy with the result&#8211;I&#8217;ve been able to tie together the various projects I&#8217;ve worked on to show how virtualization provides powerful new techniques for [...]]]></description>
			<content:encoded><![CDATA[<p>That&#8217;s the title of my thesis proposal, which attempts to cram all the work I&#8217;ve done over the past four years into just a few words. In the end, I&#8217;m pretty happy with the result&#8211;I&#8217;ve been able to tie together the various projects I&#8217;ve worked on to show how virtualization provides powerful new techniques for deploying applications, more efficiently managing resources, and providing high reliability in large data centers.</p>
<p>If you are interested, you can read the <a onclick="javascript: pageTracker._trackPageview('/pubs/proposal.pdf');" href="http://www.cs.umass.edu/~twood/pubs/proposal.pdf">full version</a>, or look through <a onclick="javascript: pageTracker._trackPageview('/pubs/proposal-slides.pdf');" href="http://www.cs.umass.edu/~twood/pubs/proposal-slides.pdf">my slides</a>.  It should make for absolutely <em>thrilling</em> bedtime reading.</p>
<p>Here is the executive summary of what I&#8217;ve worked on:</p>
<h3 style="text-align: center;">Deployment</h3>
<p>I start by looking at the deployment challenges of transitioning to a virtual environment and figuring out where to place VMs. This is an interesting area because virtualization can provide great benefits such as improved server consolidation, but also adds new challenges in the form of virtualization overheads.</p>
<p><strong>MOVE (Modeling Overheads of Virtual Environments)</strong></p>
<p>When you first consider transitioning from running applications natively to using virtual machines, it is important to understand how application resource requirements will change due to the overheads incurred by the virtualization layer. The MOVE project is designed to help predict these resource changes by building a regression model that relates the native and virtual platforms. This was work that I started during an internship at HP Labs in the summer of 2007, working with Lucy Cherkasova.</p>
<ul>
<li><strong><a onclick="javascript: pageTracker._trackPageview('/pubs/middleware');" href="http://www.cs.umass.edu/%7Etwood/pubs/middleware08.pdf">Profiling and Modeling Resource Usage of Virtualized Applications.</a></strong> Timothy Wood, Ludmila Cherkasova, Kivanc Ozonat, and Prashant Shenoy. 	  In proceedings of <em>ACM International Conference on Middleware 2008.</em></li>
</ul>
<p><strong>Memory Buddies &#8211; Guiding VM placement with memory information</strong></p>
<p>Once you know your resource requirements, you need to figure out <em>where</em> to put each of your virtual machines.  The Memory Buddies project tries to place virtual machines in order to maximize the amount of memory sharing that can be achieved &#8212; if VMs are running similar operating systems or applications, then the virtualization layer can share copies of these duplicated pages. In order to make this practical in a data center with many thousands of VMs, we propose an efficient fingerprinting technique that uses Bloom filters to quickly compare virtual machine memory contents.</p>
<ul>
<li><strong><a onclick="javascript: pageTracker._trackPageview('/pubs/membuds');" href="http://lass.cs.umass.edu/papers/pdf/VEE09-membuds.pdf">Memory Buddies: Exploiting Page Sharing for Smart Colocation in Virtualized Data Centers</a></strong>.  Timothy Wood, Gabriel Tarasuk-Levin, Prashant Shenoy, Peter Desnoyers, Emmanuel Cecchet, and Mark Corner.           In proceedings of the <em>International Conference on Virtual Execution Environments, VEE 2009.</em></li>
</ul>
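<p>As a rough illustration of the fingerprinting idea behind Memory Buddies, the sketch below builds a toy Bloom filter over each machine&#8217;s pages and compares filters by counting common bits. The class name, filter size, hash choice (salted SHA-256), and the <code>best_host</code> helper are all assumptions made for this example; they are not taken from the paper.</p>

```python
import hashlib


class PageFingerprint:
    """Toy Bloom filter over a VM's memory pages (sizes are illustrative)."""

    def __init__(self, num_bits=4096, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = set()  # positions of set bits; stands in for a bit vector

    def _positions(self, page):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(bytes([salt]) + page).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, page):
        self.bits.update(self._positions(page))

    def shared_bits(self, other):
        # Bits set in both filters: a rough proxy for pages in common.
        return len(self.bits.intersection(other.bits))


def fingerprint(pages):
    fp = PageFingerprint()
    for page in pages:
        fp.add(page)
    return fp


def best_host(vm_fp, host_fps):
    # Choose the host whose fingerprint overlaps most with the VM.
    return max(host_fps, key=lambda name: vm_fp.shared_bits(host_fps[name]))
```

<p>Comparing two fingerprints costs time proportional to the filter size rather than to the number of pages, which is what makes this kind of comparison plausible across thousands of VMs. The overlap is only an estimate, since unrelated pages can set the same bits.</p>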
<h3 style="text-align: center;">Resource Management</h3>
<p>Making data centers more efficient is a key concern throughout all of my work.  Virtualization&#8217;s greatest benefit comes in the promise of improved server utilization, leading to lower hardware costs and decreased energy consumption.</p>
<p><strong>Sandpiper &#8211; automated VM load balancing</strong></p>
<p>Alright, now we&#8217;ve figured out initial resource allocations and placements for all of our virtual machines, but those initial decisions may not be sufficient (or efficient) if an application&#8217;s workload changes over time. Sandpiper is a system which monitors the resource utilization and performance of a set of VMs and dynamically adjusts their resources or migrates them between hosts in order to prevent servers from becoming overloaded. This was the first project I worked on when I came to grad school, and now there are several commercial products out there doing similar things. We recently revised and extended this paper for a journal.</p>
<ul>
<li><strong><a onclick="javascript: pageTracker._trackPageview('/pubs/sandpiper');" href="http://www.cs.umass.edu/%7Etwood/pubs/NSDI07.pdf">Black-box and Gray-box Strategies for Virtual Machine Migration.</a></strong> Timothy Wood, Prashant Shenoy, Arun Venkataramani, and Mazin Yousif. 	  <em>Proceedings of the Fourth Symposium on Networked Systems Design and Implementation (NSDI), Cambridge, MA, April 2007.</em></li>
<li><strong><a onclick="javascript: pageTracker._trackPageview('/pubs/sandpiper-journal.pdf');" href="http://www.cs.umass.edu/%7Etwood/pubs/sandpiper-journal.pdf">Sandpiper: Black-box and Gray-box Resource Management for Virtual Machines.</a></strong> Timothy Wood, Prashant Shenoy, Arun Venkataramani, and Mazin Yousif.           To appear in <em>Computer Networks Journal Special Issue on Virtualized Data Centers 2009.</em> (Extended version of NSDI 07 paper)</li>
</ul>
<h3 style="text-align: center;">Reliability</h3>
<p>High-performance systems are only useful if they are reliable. The remaining work for my thesis uses virtualization to decrease the cost of highly available and fault-tolerant systems.</p>
<p><strong>ZZ: Cheap Practical Byzantine Fault Tolerance</strong></p>
<p>Byzantine Fault Tolerance is a way of providing very strong reliability guarantees, even in the face of malicious users or application components.  Unfortunately, BFT has a very high cost because each application request must be executed <em>2f+1</em> times in order to handle <em>f</em> simultaneous faults. In ZZ, we try to reduce this cost down to only <em>f+1</em>, by using an additional <em>f</em> sleeping VM replicas which are only woken up after a fault is detected.</p>
<ul>
<li><strong><a onclick="javascript: pageTracker._trackPageview('/pubs/zz-tr');" href="http://lass.cs.umass.edu/projects/virtualization/zz.html">ZZ: Cheap Practical BFT Using Virtualization.</a></strong> Timothy Wood, Rahul Singh, Arun Venkataramani, and Prashant Shenoy.	<em>University of Massachusetts  Technical Report TR14-08, 2008.</em></li>
</ul>
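<p>The replica arithmetic for ZZ is easy to tabulate. This small sketch only restates the counts given above; the function and key names are made up for illustration.</p>

```python
def execution_replicas(f):
    """Execution replicas needed to tolerate f simultaneous faults."""
    return {
        "bft_active": 2 * f + 1,  # conventional BFT execution cost
        "zz_active": f + 1,       # ZZ while no fault has been detected
        "zz_sleeping": f,         # paused VMs, woken only after a fault
    }


for f in (1, 2, 3):
    counts = execution_replicas(f)
    # Once the sleeping replicas wake up, ZZ is back at 2f+1 replicas.
    woken_total = counts["zz_active"] + counts["zz_sleeping"]
    print(f, counts, "after wake-up:", woken_total)
```

<p>The saving comes from the fault-free case: only <em>f+1</em> replicas execute each request, and the full <em>2f+1</em> are running only after a fault has been detected.</p>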
<p><strong>CloudNet: Wide Area Resource Management and Availability</strong></p>
<p>My most recent work was started while at AT&amp;T in Fall 2008, and looks at how VPNs can be combined with cloud computing platforms to make data center resources appear seamlessly connected to an enterprise&#8217;s existing infrastructure. We are further exploring this area to see how we can provide disaster recovery services so that if a data center becomes unavailable, the critical applications running within it can transparently fail over to servers at a different data center.</p>
<ul>
<li><strong><a onclick="javascript: pageTracker._trackPageview('/pubs/hotcloud.pdf');" href="http://www.cs.umass.edu/%7Etwood/pubs/hotcloud.pdf">The Case for Enterprise-ready Virtual Private Clouds.</a></strong> Timothy Wood, Alexandre Gerber, K.K. Ramakrishnan, and Jacobus van der Merwe.           In proceedings of the <em>Workshop on Hot Topics in Cloud Computing, HotCloud 2009.</em></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.tim-wood.net/research/2009/07/proposal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hot Cloud 2009</title>
		<link>http://www.tim-wood.net/research/2009/06/hot-cloud-2009/</link>
		<comments>http://www.tim-wood.net/research/2009/06/hot-cloud-2009/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 00:21:14 +0000</pubDate>
		<dc:creator>twood</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Conferences]]></category>

		<guid isPermaLink="false">http://www.tim-wood.net/research/?p=94</guid>
		<description><![CDATA[Here are my notes on some of the interesting talks at Hot Cloud 2009. The full list of talks and papers is available at the hot cloud site. There were interesting talks on a variety of topics, but my notes here focus mostly on cloud platforms and work around resource provisioning from the first half [...]]]></description>
			<content:encoded><![CDATA[<p>Here are my notes on some of the interesting talks at Hot Cloud 2009. The full list of talks and papers is available at the <a href="http://www.usenix.org/event/hotcloud09/tech/tech.html">hot cloud site</a>. There were interesting talks on a variety of topics, but my notes here focus mostly on cloud platforms and work around resource provisioning from the first half of the day.</p>
<h4>Open Cirrus Cloud Computing Testbed: Federated Data Centers for Open Source Systems and Services Research</h4>
<p>Roy Campbell, Indranil Gupta, Michael Heath, and Steven Y. Ko, <em>University of Illinois at Urbana-Champaign;</em> Michael Kozuch, <em>Intel Research;</em> Marcel Kunze, <em>KIT, Germany;</em> Thomas Kwan, <em>Yahoo!;</em> Kevin Lai, <em>HP Labs;</em> Hing Yan Lee, <em>IDA, Singapore;</em> Martha Lyons and Dejan Milojicic, <em>HP Labs;</em> David O&#8217;Hallaron, <em>Intel Research;</em> Yeng Chai Soh, <em>IDA, Singapore</em></p>
<p>This is a very large (more than 10K nodes spread across 9 sites) testbed being set up by HP and others to study large-scale cloud computing problems. They are focusing on computation provisioning issues, and can provide users with either full physical or virtual resources.</p>
<h4>Nebulas: Using Distributed Voluntary Resources to Build Clouds</h4>
<p>Abhishek Chandra and Jon Weissman, <em>University of Minnesota</em></p>
<p>The idea here is to explore the potential for creating peer-to-peer style cloud computing platforms that use resources provided by volunteers, similar to something like SETI@home.  I like this idea a lot, but there have been many attempts at making things like volunteer-based network file systems which never quite took off, and this seems even harder.  The difficulty will be determining what the basic platform that people are given access to is like (i.e., can you run any app you want within some VM, or is it a specific platform you must develop your app against to make it work), and how to keep the resources that users share from hurting their own application performance.  People are pretty willing to share network bandwidth and disk space, but that is because those are generally over-provisioned resources.  CPU is over-provisioned in a different way &#8212; most of the time desktop users use only a fraction of the power provided by their system, but when they do decide to go do something computation intensive, they expect it to respond quickly.  This also reminds me of the &#8220;<a href="http://portal.acm.org/citation.cfm?id=1267370">transparent memory contribution</a>&#8221; work done by Jim Cipar when he was still at UMass, since it had to deal with similar issues of volunteering resources in as transparent a way as possible.</p>
<h4>The Case for Enterprise-Ready Virtual Private Clouds</h4>
<p>Timothy Wood and Prashant Shenoy, <em>University of Massachusetts Amherst;</em> Alexandre Gerber, K.K. Ramakrishnan, and Jacobus Van der Merwe, <em>AT&amp;T Labs—Research</em></p>
<p>I thought this paper was really great, but maybe I&#8217;m biased since I wrote it.  I&#8217;ve written a separate <a href="http://www.tim-wood.net/research/2009/06/hc09-vpcs/">blog post about my own work</a>, but the gist is that current cloud computing platforms are insufficient for enterprise users, and we propose using network virtualization techniques to make seamless and secure connections between the cloud resources and enterprise sites.</p>
<h4>ElasTraS: An Elastic Transactional Data Store in the Cloud</h4>
<p>Sudipto Das, Divyakant Agrawal, and Amr El Abbadi, <em>University of California, Santa Barbara</em></p>
<p>The idea here is that databases currently don&#8217;t scale well into the cloud.  Instead people are using simpler (but more easily scaled) key-value stores to keep track of data in the cloud.  This doesn&#8217;t work well because key-value stores don&#8217;t provide the transaction and consistency features of real databases. They propose ElasTraS &#8211; a scalable, transactional data store based around the idea of partitioned databases. It wasn&#8217;t clear how difficult the problem of determining how to partition data is in the first place, as it tends to be application specific.</p>
<h4>Reflective Control for an Elastic Cloud Application: An Automated Experiment Workbench</h4>
<p>Azbayar Demberel, Jeff Chase, and Shivnath Babu, <em>Duke University</em></p>
<p>The idea of reflection is to make an application change its behavior based on the available resources.  This could be based on energy or computation resources.  This lets you opportunistically exploit surplus resources and defer work during congestion. An example of a reflective application is a digital experiment (generally has large data sets, can be partitioned, does not have strong time requirements). Seems to me like this is useful for any batch processing style application.  The work focuses on figuring out how to determine the utility of running different experiments depending on what resources are available, which may be very difficult since the experiment design space can be huge. It seems to me that the idea of reflective applications is useful even at a more basic level, both to let applications be aware of what resources are available and to let service providers know what applications desire.</p>
<h4>Colocation Games and Their Application to Distributed Resource Management</h4>
<p>Jorge Londoño, Azer Bestavros, and Shang-Hua Teng, <em>Boston University</em></p>
<p>This paper explores the placement problem within data centers using game theory techniques. In general they find that a Nash Equilibrium will not be reached, but that in restricted environments it will always converge.  I&#8217;ll be interested to look through their results more carefully to better understand how the potential for multiplexing resources in these environments can be reduced based on the self-interests of users.</p>
<h4>Virtual Putty: Reshaping the Physical Footprint of Virtual Machines</h4>
<p>Jason Sonnek and Abhishek Chandra, <em>University of Minnesota</em></p>
<p>The idea here is that the physical footprint required by a VM can vary depending on its environment. For example, VMs colocated together may be able to share memory, or may require far fewer network resources if they can be put on the same LAN. To exploit this, you need to estimate the &#8220;virtual&#8221; footprint of a VM that captures how its physical requirements can change depending on its environment.  The first challenge here is to efficiently capture this model &#8212; you will only be able to get a significant benefit from this kind of technique if it is being applied across a very large number of VMs (my memory sharing work suggests this as well). Second is the issue of determining how to deal with applications changing over time &#8211; memory and network communication patterns may shift, so how often do you need to recompute the footprint?</p>
<h4>Statistical Machine Learning Makes Automatic Control Practical for Internet Datacenters</h4>
<p>Peter Bodík, Rean Griffith, Charles Sutton, Armando Fox, Michael Jordan, and David Patterson, <em>University of California, Berkeley</em></p>
<p>The goal here is to model application performance and automate management online. Models are based on data gathered from the system as it is running, allowing them to be adapted as more data is produced. The system has some automated techniques to detect phase shifts in application type that will require a new model. The problem with these systems is always a question of how well they can deal with data that is outside of their training data.  One of this system&#8217;s benefits is supposed to be that it doesn&#8217;t rely on training data produced from experimental setups, and instead builds the model on the fly as data is gathered. But of course that may mean that the models are only really applicable for &#8220;normal&#8221; operating conditions, and that it will not be able to make reasonable predictions for what will happen after a load spike.</p>
<h4>Other Hot Cloud Reports</h4>
<p>I&#8217;ll add any other hot cloud blogs or reports as I find them (or comment below).</p>
<ul>
<li><a href="http://www.networkworld.com/news/2009/061009-cloud-computing-research-projects.html">NetworkWorld 5 Cool Cloud Computing Research Projects</a> &#8211; sadly does not mention me&#8230; ;]</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.tim-wood.net/research/2009/06/hot-cloud-2009/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Hot Cloud 2009: The Case for Enterprise Ready Virtual Private Clouds</title>
		<link>http://www.tim-wood.net/research/2009/06/hc09-vpcs/</link>
		<comments>http://www.tim-wood.net/research/2009/06/hc09-vpcs/#comments</comments>
		<pubDate>Mon, 15 Jun 2009 22:52:04 +0000</pubDate>
		<dc:creator>twood</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Papers]]></category>

		<guid isPermaLink="false">http://www.tim-wood.net/research/?p=112</guid>
		<description><![CDATA[The work I presented at Hot Cloud was about what enterprise customers need from cloud computing platforms, and how we can go about building enterprise clouds that are more secure, transparent, and flexible.
You can find a copy of our paper here, and my slides here. Or you can get a summary of our ideas below.
Here [...]]]></description>
			<content:encoded><![CDATA[<p>The work I presented at Hot Cloud was about what enterprise customers need from cloud computing platforms, and how we can go about building enterprise clouds that are more secure, transparent, and flexible.</p>
<p>You can find a copy of <a href="http://www.cs.umass.edu/~twood/pubs/hotcloud.pdf">our paper here</a>, and <a href="http://www.cs.umass.edu/~twood/files/hc09-vpcs.ppt">my slides here</a>. Or you can get a summary of our ideas below.</p>
<p>Here are the three key features we feel are lacking from existing cloud platforms:</p>
<p><strong>Security</strong>: Enterprises need strong security guarantees about the isolation of both the computation and network resources they are getting from the cloud. Existing systems rely on firewall rules for security that must be configured on a per-VM basis. While firewalls are a very powerful form of access control, they are incredibly fine-grained and need to be carefully configured. This is especially a problem in highly dynamic (i.e., cloud) environments where new VMs are often being created or moved between servers.</p>
<p><strong>Transparency</strong>: Another problem with cloud computing is that the resources it gives you are completely separated from the systems an enterprise is already running within its data centers. This makes it difficult to deploy applications since you can&#8217;t get the abstraction of having your cloud resources seamlessly connected to your existing LANs within the enterprise.</p>
<p><strong>Resource Flexibility</strong>: There are two issues here. First, existing cloud platforms grant users very limited control over the network resources connected to their VMs.  This means, for example, that it is impossible to do something like reserve a high bandwidth link between a pair of VMs, and certainly not between a VM and the enterprise site that is going to be accessing it. Secondly, cloud platforms are not as flexible as they should be: if you replicate a VM to increase the processing power of an application you need to deal with these security and transparency issues all over again.</p>
<p>To help provide these three features, we propose the idea of a <em>Virtual Private Cloud</em> that uses VPNs to securely connect groups of VMs within a cloud data center back to the enterprise sites that will use them.  VPNs make it so that the cloud resources are only accessible by other members of the same VPN.  This is a much coarser-grained access control mechanism than firewalls, but it is much cleaner.  We use MPLS-based VPNs, which are highly scalable for enterprises that may run many hundreds or thousands of VMs, and which require no end-host configuration on the VMs &#8212; the VPN is set up entirely at the routers at the cloud and enterprise sites.  Finally, there is the option of using layer 2 VPNs (a Virtual Private LAN Service) to bridge the cloud computing data center and enterprise networks, giving the abstraction that cloud resources are seamlessly connected to the enterprise&#8217;s own LAN.</p>
<p>We are building a system that implements this design, and are exploring how it can be used to simplify VM migration over the WAN and to provide high availability services capable of seamlessly failing an application over from one cloud data center to another.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tim-wood.net/research/2009/06/hc09-vpcs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Speeding Up Migration with Page Sharing</title>
		<link>http://www.tim-wood.net/research/2009/05/speeding-up-migration-with-page-sharing/</link>
		<comments>http://www.tim-wood.net/research/2009/05/speeding-up-migration-with-page-sharing/#comments</comments>
		<pubDate>Fri, 08 May 2009 20:24:26 +0000</pubDate>
		<dc:creator>twood</dc:creator>
				<category><![CDATA[Memory]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://www.tim-wood.net/research/?p=77</guid>
		<description><![CDATA[Update: Looking at this now, I&#8217;ve definitely become fully convinced that it is a good idea.  Clearly you need to be a little careful that your pages match up at each end, but as long as you keep an intelligent cache at each end, you definitely should be able to significantly reduce the amount of [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Update</strong>: Looking at this now, I&#8217;ve definitely become fully convinced that it is a good idea.  Clearly you need to be a little careful that your pages match up at each end, but as long as you keep an intelligent cache at each end, you definitely should be able to significantly reduce the amount of migration traffic due to duplicate memory contents.</p>
<p>Kevin Lawton (author of the old-school virtualization tool <a href="http://bochs.sourceforge.net/">Bochs</a>) recently wrote an <a href="http://www.virtualization.info/2009/04/r-accelerating-vms-live-migration-by-4.html">article</a> (and <a href="http://www.trendcaller.com/2009/04/boosting-server-utilization-from-60-to.html">followup</a>) on how you could speed up VM migration in a data center by exploiting things like page duplication between the source and destination machines.  The idea is that a lot of VMs have common memory pages that you wouldn&#8217;t actually need to copy over.  He references <a href="http://lass.cs.umass.edu/papers/pdf/VEE09-membuds.pdf">some of my own work</a> that looked at the amount of sharing that actually occurs between VMs, the first (but hopefully not last) time I&#8217;ve ever seen a link to one of my papers in a random blog post in my RSS feed!</p>
<p>The only problem I see has to do with detecting when pages are truly identical. When you do page sharing between VMs on a single host, you detect the similarity by producing a short (32 or 64 bit) hash for each page in memory. If you scan two pages and they produce the same hash, then those pages are <em>very likely</em> identical. I say &#8220;<em>very likely</em>&#8221;, because it could just be a hash collision, and you need to actually scan all the bits in each page before you can truly know that the two pages are identical.</p>
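<p>The single-host case can be sketched in a few lines: hash every page with a cheap 32-bit hash, then confirm any match with a full byte comparison. CRC32 is used here only as a convenient stand-in for a short hash, and the function name is hypothetical; this is not how any particular hypervisor implements page sharing.</p>

```python
import zlib
from collections import defaultdict


def find_shared_pages(pages):
    """Group identical pages: cheap 32-bit hash first, full compare second."""
    buckets = defaultdict(list)  # short hash -> groups of identical pages
    for index, page in enumerate(pages):
        groups = buckets[zlib.crc32(page)]
        for group in groups:
            # An equal hash only means "very likely" identical,
            # so compare the actual bytes before sharing.
            if pages[group[0]] == page:
                group.append(index)
                break
        else:
            # No true match in this bucket: start a new group.
            groups.append([index])
    return [g for groups in buckets.values() for g in groups if len(g) > 1]
```

<p>Each returned group lists the indexes of pages that could be backed by a single shared copy. The byte comparison is cheap here because both pages are in local memory.</p>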
<p>When you do this on a single machine, it&#8217;s not too big a deal to scan two pages and compare their bits, but if you are trying to verify that pages on the source and destination of a migration are identical, it is a big problem. Obviously you can&#8217;t just copy the page over the network to do the comparison, since that is what you were trying to avoid in the first place. I guess the only solution is to use a longer hash value (thus reducing the chance for collisions) and <em>really</em> hope that you don&#8217;t have a malicious VM at the destination that is trying to corrupt your memory by purposefully creating memory pages that will collide with your content.</p>
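<p>Here is a minimal sketch of the longer-hash approach for migration, assuming SHA-256 as the longer hash and assuming the source can see which digests the destination already holds. Both function names and the <code>dest_cache</code> dictionary are invented for this example; a real protocol would need a digest exchange step and the cache consistency discussed in this post.</p>

```python
import hashlib


def page_digest(page):
    # A 256-bit digest stands in for the "longer hash" suggested above.
    return hashlib.sha256(page).hexdigest()


def plan_migration(source_pages, dest_cache):
    """Split pages into those sent in full and those sent as digests only.

    dest_cache maps digest -> page bytes already present at the destination.
    """
    send_full, send_digest = [], []
    for index, page in enumerate(source_pages):
        digest = page_digest(page)
        if digest in dest_cache:
            send_digest.append((index, digest))  # destination has a copy
        else:
            send_full.append((index, page))      # must cross the network
    return send_full, send_digest


def apply_migration(send_full, send_digest, dest_cache, num_pages):
    # Rebuild the VM's memory image on the destination side.
    memory = [None] * num_pages
    for index, page in send_full:
        memory[index] = page
    for index, digest in send_digest:
        memory[index] = dest_cache[digest]
    return memory
```

<p>Only the pages in <code>send_full</code> cost a full page of network traffic; the rest cost one digest each. Note that nothing here verifies the bytes at the destination, so the result is only as trustworthy as the hash and the cache.</p>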
<p>I think it is a neat idea that would generally work in practice, but you will need a pretty smart cache at each end to make sure you&#8217;re really keeping the pages consistent.</p>
<p>Another idea would be to use a <a href="http://osnet.cs.binghamton.edu/publications/hines09postcopy.pdf">&#8220;post copy&#8221;</a> based approach that tries to get the VM started up on the destination machine as quickly as possible, deferring copying most memory pages until after it has already started.  You might be able to use this to quickly unload a host that is approaching the overload limit, although the migrated VMs may see a larger performance penalty because of how the migration technique works.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tim-wood.net/research/2009/05/speeding-up-migration-with-page-sharing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The End of Desktops (and all your applications)</title>
		<link>http://www.tim-wood.net/research/2009/05/the-end-of-desktops-and-all-your-applications/</link>
		<comments>http://www.tim-wood.net/research/2009/05/the-end-of-desktops-and-all-your-applications/#comments</comments>
		<pubDate>Fri, 08 May 2009 02:25:30 +0000</pubDate>
		<dc:creator>twood</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>

		<guid isPermaLink="false">http://www.tim-wood.net/research/?p=70</guid>
		<description><![CDATA[I have to agree with Google&#8217;s Eric Schmidt about the importance of cloud services on the future of everyday computing. Desktop style applications that run entirely on your own computer don&#8217;t have much life left because 1) people will have too many different devices, so keeping them all synced with local storage is a pain, [...]]]></description>
			<content:encoded><![CDATA[<p>I have to agree with <a href="http://www.techcrunch.com/2009/05/07/eric-schmidt-on-netbooks-forget-android-its-all-about-cloud-services/">Google&#8217;s Eric Schmidt</a> about the importance of cloud services on the future of everyday computing. Desktop style applications that run entirely on your own computer don&#8217;t have much life left because 1) people will have too many different devices, so keeping them all synced with local storage is a pain, 2) it&#8217;s easier for application developers to maintain a single online version of an app instead of dealing with pushing out updates and bug fixes to users, 3) forcing users to go online to get an app prevents piracy, 4) applications can get as much or as little computation and storage power as they need from the cloud, 5) etc.  Sadly, I think it might be #3 that is the real motivation in the end for many companies.</p>
<p>Cloud-based services are also better for the environment.  If the average everyday computer can be reduced down to a basic thin client for accessing remote cloud services, that reduces the cost and energy usage of home devices.  The applications running in the cloud can in turn exploit massive degrees of multiplexing to reduce their own energy costs.  As it stands today, most people&#8217;s computers are far more powerful than they really need to be, and they spend a lot of time sitting around idle.  You might as well give consumers devices that are as simple as possible.  Make them so simple that they don&#8217;t crash all the time.   Make it so they don&#8217;t require technically adept family members to service them a few times a year. Then people will spend less time being frustrated by their computers, and more time able to use the applications running on them.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.tim-wood.net/research/2009/05/the-end-of-desktops-and-all-your-applications/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

