My AWS Musings

Cloud computing, EC2, RDS, SQS, S3, Java…

Why Amazon’s new AWS java sdk sucks

I am an ardent fan and user of Amazon’s AWS, and that is exactly why I don’t like their new API. Amazon has so far done a great job of making their services intuitive, simple and easy to use. But somehow they forgot their principles while designing the SDK. In this post I am going to state my case.

Let’s discuss an example. The example is in Groovy, but I am sure the code can be understood by everybody:


def registerInstance(String instanceId, String loadBalancerName) {
  // Every SDK call needs its own request object
  def request = new RegisterInstancesWithLoadBalancerRequest()
  def server = new com.amazonaws.services.elasticloadbalancing.model.Instance()
  server.setInstanceId(instanceId)
  request = request.withInstances([server]).withLoadBalancerName(loadBalancerName)
  // awsCredentials and logger are defined elsewhere in the script
  def client = new AmazonElasticLoadBalancingClient(awsCredentials)
  client.registerInstancesWithLoadBalancer(request)
  logger.info "${instanceId} registered with ${loadBalancerName} load balancer"
}

The code above simply registers a given instance with a load balancer. Now let’s try to achieve the same thing using a different API available on Google Code. It’s called Typica and is written by dkavanagh and two other people.


def registerInstance(String instanceId, String loadBalancerName) {
  // accessKey and secretKey are defined elsewhere in the script
  def loadBalancing = new LoadBalancing(accessKey, secretKey)
  loadBalancing.registerInstancesWithLoadBalancer(loadBalancerName, [instanceId])
}

Now which one is easier to read? Which one is faster to code? Which is more intuitive? Obviously the second one. I simply don’t understand the reason for the long class names they have throughout the API, or for the request and result pattern. Every single method in Amazon’s SDK takes a request object and returns a result object. You are forced to create these objects with extra long names! Thank god I did not write the code in Java here, otherwise it would have been even bigger, as in Java you have to repeat a class name twice in a line if you want to create an object. The whole API is full of such examples.
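To make the complaint concrete, here is a minimal Java sketch of the request and result pattern next to a thin one-call wrapper in the style of the second example. Every class in it (`Instance`, `RegisterInstancesRequest`, `RegisterInstancesResult`, `VerboseClient`) is a hypothetical stand-in I made up so the sketch runs without the SDK on the classpath. None of them is the real AWS or Typica API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT the real AWS SDK classes: they only mimic the
// request/result shape so that this sketch compiles and runs on its own.
class LoadBalancerFacadeSketch {

    static class Instance {
        final String instanceId;
        Instance(String instanceId) { this.instanceId = instanceId; }
    }

    static class RegisterInstancesRequest {
        String loadBalancerName;
        List<Instance> instances = new ArrayList<>();

        RegisterInstancesRequest withLoadBalancerName(String name) {
            this.loadBalancerName = name;
            return this;
        }

        RegisterInstancesRequest withInstances(List<Instance> instances) {
            this.instances = instances;
            return this;
        }
    }

    static class RegisterInstancesResult {
        final List<Instance> instances;
        RegisterInstancesResult(List<Instance> instances) { this.instances = instances; }
    }

    // Stand-in client: takes a request object, returns a result object.
    // It just echoes the instances back instead of calling any service.
    static class VerboseClient {
        RegisterInstancesResult registerInstancesWithLoadBalancer(RegisterInstancesRequest request) {
            return new RegisterInstancesResult(request.instances);
        }
    }

    // Thin wrapper: callers pass plain strings and get plain strings back,
    // while the request/result plumbing stays hidden in one place.
    static List<String> registerInstances(VerboseClient client, String loadBalancerName, String... instanceIds) {
        List<Instance> instances = new ArrayList<>();
        for (String id : instanceIds) {
            instances.add(new Instance(id));
        }
        RegisterInstancesRequest request = new RegisterInstancesRequest()
                .withLoadBalancerName(loadBalancerName)
                .withInstances(instances);
        RegisterInstancesResult result = client.registerInstancesWithLoadBalancer(request);

        List<String> registered = new ArrayList<>();
        for (Instance instance : result.instances) {
            registered.add(instance.instanceId);
        }
        return registered;
    }

    public static void main(String[] args) {
        // Placeholder names, used only for illustration
        System.out.println(registerInstances(new VerboseClient(), "my-load-balancer", "i-12345678"));
    }
}
```

The wrapper does not make the verbose code disappear. It only moves it out of the calling scripts, which is roughly what I would have liked the SDK to do for me.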

I am not the only one screaming over the API. Steve Jin has expressed similar concerns about the API in his DoubleCloud Blog. According to Steve, the API lacks consistency and a clear object model, and its structure is flawed.

I hope enough people scream over the Internet so that Amazon can hear it.

Amazon AWS Java SDK released

Amazon recently announced the AWS SDK for Java.

An SDK or a Java API is very much needed, especially if you are writing your automation scripts in Groovy. We have tried multiple Java APIs in our scripts, including JetS3t and Typica. These APIs were really helpful, but they only supported some of the AWS services and were not up to date (for obvious reasons). Having one Java API that can support all of the AWS technologies was definitely the need of the hour. I am sure Amazon will keep it updated as new services are released. They have the necessary resources to do so.

Furthermore, Amazon has also uploaded the SDK to the Maven repository.
You can use the following dependency in your pom.xml:

<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk</artifactId>
    <version>1.0.002</version>
</dependency>

The Javadoc for the SDK is hosted at http://docs.amazonwebservices.com/AWSJavaSDK/latest/javadoc/index.html

Amazon has also opened the SDK source code for all. They have mirrored the SDK code repository at GitHub. You can look at the SDK code at http://github.com/amazonwebservices/aws-sdk-for-java

Amazon Simple Notification Service – an easy messaging system in the cloud

Amazon recently announced a new service called Simple Notification Service.
It provides a cheap publish/subscribe messaging system in the cloud. You can learn how to use it by visiting http://docs.amazonwebservices.com/sns/latest/gsg/

I played with it and found that the service is really easy to use, very robust and very extensible. It basically makes a publish/subscribe messaging service similar to JMS available in the cloud. Having a cheap, robust publish/subscribe messaging system can serve many purposes in the cloud. In an auto-scaled environment, servers go up and down depending upon traffic. Here are some of the uses I can think of in our environment:

Click here to continue reading…

Web serving in the cloud – our experiences with nginx and instance sizes

We have been doing various experiments in our EC2 web serving cluster to serve maximum traffic at the minimum cost. I think our experience will be useful to many other people using EC2.

We have a web application with nginx + Tomcat. We get approximately 200K requests per minute at the peak and about 65K requests per minute at night. Since we host a web service and not a web page, most of our requests are servlet requests (and not the faster, nginx-based file serving requests).

Click here to continue reading…

7 easy tips to reduce your Amazon ec2 cloud costs

Amazon ec2 costs can grow very fast if you are not mindful of the Amazon ec2 billing structure. We came across the following ways to save money at our company.

Keep machines in the same availability zone
Don’t scatter machines that talk to each other across multiple availability zones. You will end up paying for the bandwidth. Of course, this does not apply to people who purposely keep their machines in different availability zones to ensure high availability.

Use CNAMEs instead of A records
In other words, do not map your domain name to an Elastic IP. Map it to the public DNS name of the instance. Let’s say you have a machine named splunk.gumgum.com, and all the web servers (within the same availability zone) send considerable data to this machine. If you set up splunk.gumgum.com as an A record, your data will go out and come back in. But if you map it as a CNAME, your data will always remain within the EC2 cluster, because from inside EC2 the public DNS name of an instance resolves to its private IP address. To read more about this, visit Eric Hammond’s post Using Elastic IP to Identify Internal Instances on Amazon EC2.
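A quick way to check which route a name takes is to resolve it from one of your own EC2 machines and see whether the answer is a private address. Below is a minimal Java sketch of that check. The class name `InternalDnsCheck` and the `10.0.0.5` fallback are placeholders I picked for illustration, and the check uses `InetAddress.isSiteLocalAddress()`, which for IPv4 covers the 10.x, 172.16-31.x and 192.168.x private ranges.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of any AWS tool: resolves a host name and
// reports whether the resulting address is in a private range.
class InternalDnsCheck {

    static boolean resolvesToPrivateAddress(String host) {
        try {
            // A literal IP address is parsed directly, without a DNS lookup
            InetAddress address = InetAddress.getByName(host);
            return address.isSiteLocalAddress();
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass your own host name; the fallback is only a placeholder address
        String host = args.length > 0 ? args[0] : "10.0.0.5";
        if (resolvesToPrivateAddress(host)) {
            System.out.println(host + " resolves to a private address");
        } else {
            System.out.println(host + " resolves to a public address");
        }
    }
}
```

Run it on a web server with the name your application actually uses. If it reports a public address, traffic to that name is probably taking the paid route.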

Click here to continue reading…

Amazon Relational Database Service (RDS) – The Timezone Problem

The default time zone of your RDS database instance is UTC. It simply cannot be changed.

RDS does not give you super privileges. That is why you won’t be able to change the global time zone by simply executing:

SET GLOBAL time_zone = 'US/Pacific';
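Since the statement above is rejected, one option that does not touch the database at all is to leave the stored values in UTC and convert them in the application. The sketch below shows this with the standard `java.time` classes. It is only an illustration under that assumption and not necessarily the approach this post goes on to describe. The class name `UtcToPacific` and the sample timestamp are made up.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: converts a UTC instant (as read from the database)
// into US/Pacific wall-clock time on the application side.
class UtcToPacific {

    static final ZoneId PACIFIC = ZoneId.of("US/Pacific");

    static ZonedDateTime toPacific(Instant utcInstant) {
        // Daylight saving time is handled by the zone rules
        return utcInstant.atZone(PACIFIC);
    }

    public static void main(String[] args) {
        // Sample value standing in for a timestamp column stored in UTC
        Instant stored = Instant.parse("2010-07-01T12:00:00Z");
        System.out.println(toPacific(stored).format(DateTimeFormatter.ISO_OFFSET_DATE_TIME));
    }
}
```

This keeps the data unambiguous in the database, at the price of every reader having to remember the conversion.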

Click here to continue reading…