My AWS Musings

Cloud computing, EC2, RDS, SQS, S3, Java…

HBase on EC2 using EBS volumes: Lessons Learned

We started using HBase on EC2 back in 2009. Our data was important to us, and we wanted the option of restoring it, so we attached EBS volumes to our HBase nodes and configured the HBase and Hadoop installations to store all of their data on those volumes.

Then came the concept of EBS-backed instances. In those days we were still experimenting, and HBase was releasing new versions very frequently; we were already a few versions ahead of our original Hadoop/HBase AMI, and we were also in the process of tuning the cluster. Documenting every change after the fact, or creating a new image every time something changed, was very cumbersome. Instead, we figured that if we converted our nodes to EBS-backed instances, we wouldn't have to do any of that: we would simply take a snapshot of the root device and restore it in case the volume failed.

And this worked happily for a few months. Then one day it suddenly stopped working.

There are several ways to restore EBS-backed instances from their snapshots. Here are all of the ones I knew (a rough sketch of option 1 follows the list):
1) Register the snapshot as an AMI and start an instance from the image.
2) Create a volume from your snapshot, start a similar EBS-backed instance, stop it, and swap in the new volume as the root device.
3) Create an AMI from the running instance. This causes the instance to reboot immediately, so it wasn't an option for us; there was no way we could afford to reboot our master!
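
For concreteness, here is a minimal sketch of option 1 using boto3; all of the IDs, key pair, and security group names are placeholders, and the kernel/ramdisk arguments only apply to old paravirtual images like the ones we were dealing with.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # All IDs below are placeholders for illustration only.
    image = ec2.register_image(
        Name="hbase-master-restored",
        Architecture="x86_64",
        RootDeviceName="/dev/sda1",
        KernelId="aki-00000000",    # must match the kernel the instance was using
        RamdiskId="ari-00000000",   # and the ramdisk, or it may not boot correctly
        BlockDeviceMappings=[{
            "DeviceName": "/dev/sda1",
            "Ebs": {"SnapshotId": "snap-00000000", "DeleteOnTermination": True},
        }],
    )

    # Launch an instance from the freshly registered AMI.
    ec2.run_instances(
        ImageId=image["ImageId"],
        MinCount=1,
        MaxCount=1,
        InstanceType="c1.xlarge",
        KeyName="my-keypair",
        SecurityGroupIds=["sg-00000000"],
    )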

You have to know the kernel and ramdisk IDs if you want to go with option 1 or 2. You may think it's a no-brainer: just use the metadata query tool and find out the kernel and ramdisk of the running instance. But not all instances have that metadata available to them! Ours did not expose ramdisk metadata at all. When we contacted Amazon support, they told us the instance was so old that there was simply no way to know which ramdisk it was using, which means you have to choose a ramdisk yourself. If the kernel or ramdisk you use to create the AMI from the snapshot is not compatible, the instance will not boot correctly, and this is especially true for Ubuntu images.
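
To make that concrete, this is roughly how you can ask the metadata service for the kernel and ramdisk IDs (a small Python sketch, not any particular tool); on instances like ours the ramdisk key may simply be absent.

    import urllib.error
    import urllib.request

    BASE = "http://169.254.169.254/latest/meta-data/"

    def metadata(key):
        # Returns the metadata value, or None when the key is not exposed
        # for this instance (as was the case for our ramdisk).
        try:
            with urllib.request.urlopen(BASE + key, timeout=2) as resp:
                return resp.read().decode()
        except urllib.error.HTTPError:
            return None

    print("kernel-id: ", metadata("kernel-id"))
    print("ramdisk-id:", metadata("ramdisk-id"))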

That's what happened to us. It stopped working: somehow the kernel files were no longer available. Even though it was the ramdisk information that was missing, it was the kernel that ended up causing the problem. Here is what Amazon support had to say about it:

“Your practice of taking snapshots and starting instances from those machines can work, as it has in the past, but will always be susceptible to kernel/ramdisk mismatches.”

“Our standard practice of creating an image (AMI) from a running instance (option 3 as described above) and launching instances from that AMI would avoid the problem you’re seeing with the mismatched/incompatible kernels.”

When we told Amazon that it was not an option for us because it causes the instance to reboot immediately, here is what they suggested:

“Have you considered writing data to an EBS volume that is separate from your root EBS volume? I’m just wondering if that’s a viable option as it wouldn’t require stopping or rebooting the instance.”

There lies the answer! We need to be able to recreate the cluster if we accidentally delete all of the data or lose our master. A reliable backup for that scenario is only possible if the HDFS data does not reside on the root devices: a reliable backup of a root device cannot be taken without rebooting the instance, and it is stored as an AMI, which means creating a new AMI every day and deleting the old one. So to solve all of our problems, both the HBase installation and the data need to live on attached EBS volumes that are not the root devices.
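
As a sketch of the idea (boto3, placeholder IDs, and a hypothetical mount point; not our exact setup): attach a separate data volume, point the Hadoop/HBase data directories at it, and snapshot only that volume.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Placeholder IDs for an HBase/Hadoop node and its separate data volume.
    INSTANCE_ID = "i-00000000"
    DATA_VOLUME_ID = "vol-00000000"

    # Attach the data volume as a non-root device; it then gets formatted,
    # mounted (say at /mnt/hbase), and the Hadoop/HBase data directories
    # are pointed at that mount point instead of the root filesystem.
    ec2.attach_volume(VolumeId=DATA_VOLUME_ID, InstanceId=INSTANCE_ID, Device="/dev/sdf")

    # Because the data lives off the root device, it can be snapshotted
    # at any time without stopping or rebooting the instance.
    ec2.create_snapshot(VolumeId=DATA_VOLUME_ID, Description="hbase data backup")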

It was news to us.

We had no choice. We decided to invest the time to convert our architecture to use attached EBS volumes, rather than wake up in the middle of the night and realize we could not restore our backup!

Manage EBS snapshots with a Python script

I was looking for a simple script that creates a new EBS snapshot and deletes all previous snapshots except for the few newest. I found a PHP script called manage snapshots at http://www.thecloudsaga.com/aws-ec2-manage-snapshots/, but it only deletes snapshots; it does not create new ones. That is why I decided to write my own script.
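
The gist of such a script looks roughly like this (a boto3 sketch with placeholder IDs, not the actual script from the post): take a new snapshot, then delete everything but the newest few.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    VOLUME_ID = "vol-00000000"   # placeholder: the volume to back up
    KEEP = 5                     # how many of the newest snapshots to keep

    # Take a fresh snapshot of the volume.
    ec2.create_snapshot(VolumeId=VOLUME_ID, Description="nightly backup")

    # Find all snapshots of this volume that we own.
    snapshots = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [VOLUME_ID]}],
    )["Snapshots"]

    # Sort newest first and delete everything beyond the KEEP newest.
    snapshots.sort(key=lambda s: s["StartTime"], reverse=True)
    for snapshot in snapshots[KEEP:]:
        ec2.delete_snapshot(SnapshotId=snapshot["SnapshotId"])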
Click here to continue reading…

Architecting for Cloud

A sunny Sunday morning. I am preparing to go out with my wife. And suddenly a Pingdom alert arrives on my phone. The website is down! I run to my Ubuntu desktop and hurriedly open Ylastic, the Amazon AWS console, Splunk, and so on. The console shows nothing unusual. The instance the website is running on is shown as ‘up’. I try to SSH into the instance using its public DNS name. I can't get to it! The security groups are in place. So what happened? Why is the instance not accessible?
Click here to continue reading…

Web Service version of the EC2 Instance Metadata Query Tool

I think most EC2 users know about EC2's instance metadata query tool. It's an executable you can install on your EC2 instance. Once you save the file and make it executable, you can get instance metadata such as hostname, block-device-mapping, instance-id, and so on.

But there is also a web service version of this metadata tool. It is not advertised much; whenever I google ‘aws metadata query tool’, I never find it. I like the web service version better because I don't have to go through installation steps every time I want to get an instance's metadata.
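
For example, any instance can query http://169.254.169.254 directly; here is a tiny Python sketch (newer instances that enforce IMDSv2 also require a session token, which this sketch omits).

    import urllib.request

    BASE = "http://169.254.169.254/latest/meta-data/"

    def metadata(key=""):
        with urllib.request.urlopen(BASE + key, timeout=2) as resp:
            return resp.read().decode()

    print(metadata())                  # lists the available metadata keys
    print(metadata("instance-id"))     # e.g. i-0123456789abcdef0
    print(metadata("local-hostname"))  # the instance's private DNS name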

Click here to continue reading…

Choosing the right metrics for autoscaling your EC2 cluster

At GumGum we are using autoscaling successfully. Choosing the right metrics for autoscaling is an ongoing process as your cluster and applications change. When we researched which metrics to use, we found very little literature in the blogosphere, so I decided to document our experiences.
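
Just to illustrate what wiring a metric to a scaling action looks like, here is a generic CPU-based sketch with boto3 and placeholder names; it is not necessarily the metric we settled on.

    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")
    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    ASG_NAME = "my-app-asg"   # placeholder auto scaling group name

    # Add one instance whenever the alarm below fires.
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName=ASG_NAME,
        PolicyName="scale-out-on-cpu",
        AdjustmentType="ChangeInCapacity",
        ScalingAdjustment=1,
        Cooldown=300,
    )

    # Fire when average CPU across the group stays above 70% for 10 minutes.
    cloudwatch.put_metric_alarm(
        AlarmName="asg-high-cpu",
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": ASG_NAME}],
        Statistic="Average",
        Period=300,
        EvaluationPeriods=2,
        Threshold=70.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[policy["PolicyARN"]],
    )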

Click here to continue reading…

7 Tips for running HBase in EC2

We are running an HBase cluster (currently 0.20.4) on EC2. I thought it would be useful for others to know some tips about running HBase in EC2.

1) Use private DNS addresses in config files such as hdfs-site.xml and hbase-site.xml. On EC2 Ubuntu instances, Java's getHost() resolves to the private DNS address.

2) Use c1.xlarge or bigger nodes to start with. I have seen Andrew Purtell (an HBase committer) recommend this on the HBase mailing list. We tried m1.large machines, and they worked well while our traffic was small; we hit HBase in real time, and as traffic increased we started maxing out the CPU. Currently we use c1.xlarge machines.

Click here to continue reading…