Stalking Your Fellow Researchers

At times it seems like the Internet was designed for stalkers: people are increasingly publicizing their personal details on websites like Facebook or Foursquare. While this means that I can get the latest updates from my friends about what they had for lunch, it generally does not help me get the latest news from researchers I’m interested in. RSS (Really Simple Syndication) can help solve this problem by giving you a feed of updates for a given webpage. Unfortunately, most personal websites (that aren’t blogs) don’t support RSS. Luckily, there are some handy tools out there that can turn any website into an RSS feed so that you’ll be notified when it is updated.

The first thing you need is an RSS feed reader, a desktop or web application that will aggregate the updates from the websites you are watching.  I use Google Reader, but there are quite a few out there.

Next, browse to the website that you are interested in tracking, e.g. the publications page of a researcher in your field (my favorite example).  Copy the URL of the page and then visit Page2Rss.com.  Paste in the URL of interest, and they will magically produce an RSS feed for you that is updated every time the page changes.  The feed contains only the parts of the page that changed, a very handy feature if you are using this on a long publication page and only care about the most recent changes. Another option, if you don’t like RSS, is to use ChangeDetection.com, which will email you whenever a page is modified (I haven’t tried this yet).
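A minimal sketch of the underlying idea, in Python. This is only a guess at the general approach, not how Page2Rss or ChangeDetection.com actually work: keep the previous snapshot of a page's text and report the lines that are new in the latest snapshot. The function name `new_lines` is hypothetical, and fetching the page itself is left out.

```python
import difflib


def new_lines(old_text: str, new_text: str) -> list[str]:
    """Return the lines that were added or changed in new_text."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff marks lines present only in the newer snapshot with "+ "
    return [line[2:] for line in diff if line.startswith("+ ")]


# Invented snapshots of a publications page, taken a day apart
yesterday = "Paper A (2008)\nPaper B (2007)"
today = "Paper C (2009)\nPaper A (2008)\nPaper B (2007)"

print(new_lines(yesterday, today))
```

Each entry returned by `new_lines` could then become one item in a feed, so a long page only generates updates for what actually changed.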

Of course, many websites do provide their own RSS feeds, particularly blogs and news sites.  A number of CS publishers, such as the IEEE and ACM, also provide useful RSS feeds.

Job Application Resources

I’m not going to be applying for jobs until next year, but recently I’ve been helping a few friends with their own applications, which has gotten me interested in the subject. To learn a bit more, I went to the Chronicle of Higher Education’s website earlier today.  I’ve heard my dad (a recently retired sociology professor) refer to the Chronicle many times, but this was the first time I’ve actually read any of it.  I found quite a bit of interesting information, which I will link to here.  This is largely for my own benefit a year from now when I need the info, but hopefully some others will find it useful as well.

  • First Time on the Market – a collection of articles on interviews, teaching statements, and generally what to expect
  • How to Write a Teaching Statement – luckily I got some practice with this when taking a course on teaching in scientific disciplines last year, but otherwise it can be a tricky piece to write when you are coming from a graduate program that focuses almost entirely on research
  • Facing the Truth – an interesting piece on the chances of new grads applying to four-year teaching colleges without any teaching experience

Grad Students Officially Obsolete

Robot Scientist 'Adam' at Aberystwyth University

Adam: The first robotic scientist

So much for job security as an academic researcher… soon we’ll all be replaced with giant robots and monkeys on typewriters…

Scientific Publishing

Pretty good comic…

(not written by me)


Improving Data Center Resource Management, Deployment, and Availability with Virtualization

That’s the title of my thesis proposal, which attempts to cram all the work I’ve done over the past four years into just a few words. In the end, I’m pretty happy with the result: I’ve been able to tie together the various projects I’ve worked on to show how virtualization provides powerful new techniques for deploying applications, managing resources more efficiently, and providing high reliability in large data centers.

If you are interested, you can read the full version, or look through my slides.  It should make for absolutely thrilling bedtime reading.

Here is the executive summary of what I’ve worked on:

Deployment

I start by looking at the deployment challenges of transitioning to a virtual environment and figuring out where to place VMs. This is an interesting area because virtualization can provide great benefits such as improved server consolidation, but also adds new challenges in the form of virtualization overheads.

MOVE (Modeling Overheads of Virtual Environments)

When you first consider transitioning from running applications natively to using virtual machines, it is important to understand how application resource requirements will change due to the overheads incurred by the virtualization layer. The MOVE project is designed to help predict these resource changes by building a regression model that relates the native and virtual platforms. This was work that I started during an internship at HP Labs in the summer of 2007, working with Lucy Cherkasova.
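To illustrate the kind of model this describes, here is a minimal sketch in Python. It assumes a single resource metric and a plain least-squares line, which is a simplification and not necessarily the model MOVE uses; the profiling numbers are invented, and the names `fit_line` and `predict_virtual_cpu` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented profiling data: CPU utilization (%) of the same benchmark
# runs measured natively and inside a VM.
native_cpu = [10.0, 20.0, 30.0, 40.0]
virtual_cpu = [17.0, 29.0, 41.0, 53.0]

slope, intercept = fit_line(native_cpu, virtual_cpu)


def predict_virtual_cpu(native):
    """Predict virtualized CPU utilization from a native measurement."""
    return slope * native + intercept


print(predict_virtual_cpu(50.0))
```

Once fitted on benchmark runs, a model like this can be applied to traces from a real application that has so far only run natively.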

Memory Buddies – Guiding VM placement with memory information

Once you know your resource requirements, you need to figure out where to put each of your virtual machines.  The Memory Buddies project tries to place virtual machines so as to maximize the amount of memory sharing that can be achieved: if VMs are running similar operating systems or applications, then the virtualization layer can keep a single shared copy of their duplicate pages. In order to make this practical in a data center with many thousands of VMs, we propose an efficient fingerprinting technique that uses Bloom filters to quickly compare virtual machine memory contents.
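A rough sketch of Bloom filter fingerprinting, in Python. The filter size, the hash construction, and the `overlap_score` heuristic (counting bits set in both filters) are assumptions made for illustration, not the estimator from the Memory Buddies work; page contents are stood in for by short byte strings.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit vector

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def fingerprint(pages):
    """Summarize a VM's memory pages as one Bloom filter."""
    bf = BloomFilter()
    for page in pages:
        bf.add(page)
    return bf


def overlap_score(a, b):
    """Bits set in both filters: a crude proxy for shareable pages."""
    return bin(a.bits & b.bits).count("1")


vm1 = fingerprint([b"kernel", b"libc", b"app1"])
vm2 = fingerprint([b"kernel", b"libc", b"app2"])
print(overlap_score(vm1, vm2))
```

The appeal is that comparing two VMs costs one bitwise AND over fixed-size filters rather than a comparison of every page.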

Resource Management

Making data centers more efficient is a key concern throughout all of my work.  Virtualization’s greatest benefit comes in the promise of improved server utilization, leading to lower hardware costs and decreased energy consumption.

Sandpiper – automated VM load balancing

Alright, now we’ve figured out initial resource allocations and placements for all of our virtual machines, but those initial decisions may not be sufficient (or efficient) if an application’s workload changes over time. Sandpiper is a system which monitors the resource utilization and performance of a set of VMs and dynamically adjusts their resources or migrates them between hosts in order to prevent servers from becoming overloaded. This was the first project I worked on when I came to grad school, and now there are several commercial products out there doing similar things. We recently revised and extended this paper for a journal.
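The monitor-and-migrate loop can be sketched roughly as follows, in Python. This is a deliberately simplified greedy heuristic over a single CPU metric, not Sandpiper's actual algorithm; the threshold value, the data layout, and the name `find_migration` are all assumptions.

```python
def find_migration(hosts, threshold=0.75):
    """Pick one VM to move off the most overloaded host.

    hosts maps host name -> {vm name: CPU load as a fraction of the host}.
    Returns (vm, source, destination), or None if no move is needed
    or no destination has room.
    """
    load = {h: sum(vms.values()) for h, vms in hosts.items()}
    overloaded = [h for h in hosts if load[h] > threshold]
    if not overloaded:
        return None
    src = max(overloaded, key=lambda h: load[h])
    # Try the heaviest VM first, on the least loaded host that can fit it
    by_load = sorted(hosts[src].items(), key=lambda kv: kv[1], reverse=True)
    for vm, vm_load in by_load:
        for dst in sorted(hosts, key=lambda h: load[h]):
            if dst != src and load[dst] + vm_load <= threshold:
                return vm, src, dst
    return None


# Invented cluster state
cluster = {
    "h1": {"a": 0.5, "b": 0.4},
    "h2": {"c": 0.2},
    "h3": {"d": 0.6},
}
print(find_migration(cluster))
```

A real system would also weigh memory and network load, and the cost of the migration itself, before moving anything.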

Reliability

High-performance systems are only useful if they are reliable. The remaining work for my thesis uses virtualization to decrease the cost of high-availability and fault-tolerant systems.

ZZ: Cheap Practical Byzantine Fault Tolerance

Byzantine Fault Tolerance is a way of providing very strong reliability guarantees, even in the face of malicious users or application components.  Unfortunately, BFT has a very high cost because each application request must be executed 2f+1 times in order to handle f simultaneous faults. In ZZ, we try to reduce this cost to only f+1 executions by keeping an additional f VM replicas asleep and waking them only after a fault is detected.
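The arithmetic is easy to check with a few lines of Python. The function names are hypothetical, and the sketch assumes that all f sleeping replicas are woken once a fault is detected.

```python
def bft_executions(f):
    """Executions per request in a conventional BFT system, per the text."""
    return 2 * f + 1


def zz_executions(f, fault_detected=False):
    """f + 1 replicas execute normally; f more are woken after a fault."""
    active = f + 1
    return active + f if fault_detected else active


for f in (1, 2, 3):
    print(f, bft_executions(f), zz_executions(f), zz_executions(f, True))
```

In the fault-free case the saving is f executions per request, and the cost only returns to 2f+1 while a fault is being handled.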

CloudNet: Wide Area Resource Management and Availability

My most recent work was started while at AT&T in Fall 2008, and looks at how VPNs can be combined with cloud computing platforms to make data center resources appear seamlessly connected to an enterprise’s existing infrastructure. We are further exploring this area to see how we can provide disaster recovery services so that if a data center becomes unavailable, the critical applications running within it can transparently fail over to servers at a different data center.

What Usenix Can Do for Students

After the Usenix ATC welcome session tonight there was a brief Students BoF meeting to discuss what students get out of Usenix (both the organization and its conferences). We talked a fair amount about the idea of student conferences run by students, which I think is a very good idea. The main issue seems to be one of transportation: if it is a national conference, then only people with sufficient funding will be able to get to it, but if it is a regional conference, then only regions with a high density of students (i.e. the Northeast and California) will be capable of gathering a big enough crowd.

I still think this is an idea worth pursuing, although it probably works best at the regional level, which will sadly leave out a lot of people in the middle of the country. I know that many AI students in my department attend NESCAI (the North East Student Colloquium on Artificial Intelligence held at Cornell each year), and find it very useful since it gives them a chance to practice presenting their work and networking with other people in a low-stress environment. It was repeated several times during the discussion that the “hallway track” at Usenix can be the most valuable part, but many students miss out on that because it can be a bit intimidating to strike up conversations, especially with faculty or industry researchers. Giving students opportunities to practice that at a conference just among their peers would be very helpful. Students helping with conference organization would also be exposed to reviewing and to how program committees work, experience which is normally very hard to acquire as a graduate student.  I don’t think that Usenix would have too much trouble finding students to help organize such a venture, and I’d be tempted to volunteer myself.

On a broader note, I feel like Usenix currently does a great job in these areas:

  • Technical research: Usenix ATC provides a forum for the presentation of top quality academic and industrial research. I consider it a great venue for any type of general systems work with strong technical components.
  • Mixing industry and academia: In my (relatively limited) experience, Usenix ATC is the conference that comes closest to an even split between academics and industry professionals. This is good since each side needs the other, but in most other conferences I’ve seen there is a clear majority in one direction or the other.

Other areas that Usenix could expand on to better support students are:

  • Graduate student development: offer tutorials or seminars on topics like research methods or personal organization (i.e. systems like GTD). A professor in my department teaches a research methods course which was incredibly helpful for me, and I know he has given one-hour talks on the subject at other schools to rave reviews. These are the kinds of things that graduate students currently learn on the job through trial and error, and it is much better to just have them taught to you upfront. I’m not sure how well this would fit at something like ATC, but it would definitely be ideal for a student conference, and even just lists of online resources could help.
  • Insights into academia: this would include things like organizing student-run conferences or shadow PCs that give students a better idea of what keeps our advisors busy when they aren’t meeting with us. Learning how to review papers helps us become more critical (in a good way) of all the other papers we read, letting us get more out of them than we would otherwise.
  • Realtime research updates: I wish I had a list of blogs written by systems researchers. Usenix could help organize this by at least setting up a list of links to all blog posts written about their conferences (you can start with my notes from HotCloud!). I want to know what other researchers are thinking about, and I also want to be updated whenever people in my area publish new pieces of work (currently I rely on elaborate mechanisms that automatically check the publication webpages of the top people in my field each day to see if they have changed).  Obviously for this to be fully useful, it needs to support more than just Usenix conferences and workshops, and the updates need to be propagated when papers are accepted, not four months later when they are presented.  Usenix’s push into social networks may help with this too, although I’ll admit that I haven’t “friended” Usenix yet, so I’m not sure…

That’s all I can think of for now, and I’m still on east-coast time, so I need to get to sleep.

UMass Thesis Proposal Writing

I’ve started writing my Ph.D. proposal recently. It’s pretty exciting because it makes me realize that I actually have accomplished quite a bit since coming here, which is a good thing, since it has been 4 years already!  I also enjoy writing the proposal because you just need to cover the high-level purpose of everything and can ignore all the gritty details. I’m pretty happy with the big picture I’m getting so far.

For future reference, you can get the LaTeX style file and template for the proposal/thesis from this site.