2013-03-10

Tidying Up After Jclouds

Problem

In our corporate AWS account we had several thousand jclouds-generated key-pairs and security groups in the most frequently used regions, and this was causing us problems. So I created a simple command-line utility to detect and delete the unwanted objects.

The utility is written in Java and uses the jclouds library to talk to EC2, although other mechanisms would work equally well, such as a shell script that calls the EC2 API tools. It is a simple command-line program with a main(String...argv) method that connects to EC2 and performs the required cleanup. It needs to know which AWS credentials to use and which region to connect to. I also decided to make the pattern that names are matched against configurable, defaulting to one that matches jclouds-generated objects.

Quick Fix

If you want to use this immediately, you can access the GitHub repository at grkvlt/ec2cleanup. The commands to clone the repository and build the program on a Linux or OSX system are shown below:

% git clone https://github.com/grkvlt/ec2cleanup.git
...
% cd ec2cleanup
% mvn clean install
...

Once you have built everything, execute the program as follows, substituting your AWS credentials as appropriate:

% java -Daws-ec2.identity=AAAAAAAAAAAAAAAAAAAA \
    -Daws-ec2.credential=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX \
    -jar target/ec2cleanup-0.1.0-SNAPSHOT-jar-with-dependencies.jar us-east-1
[INFO] Cleaning SecurityGroups and KeyPairs in aws-ec2:us-east-1 matching 'jclouds#.*'
[INFO] Found 2585 matching KeyPairs
[INFO] Deleted 2585 KeyPairs
[INFO] Found 2342 matching SecurityGroups
[INFO] Deleted 2342 SecurityGroups

Overview

When you use the jclouds library with Amazon's EC2 compute cloud, by default it creates a security group and a key-pair for your instance. However, if you are not careful to clean up after yourself, or if the instance is long-running and not terminated by jclouds, then these objects will remain in your AWS account long after the instance that used them has been destroyed. In production usage a specific named key-pair and security group will normally be created for use, and the defaults will not be a problem. However, running automated tests or similar processes will often leave many of these unused objects cluttering up your account, possibly slowing down your other jclouds code when they are enumerated, or even triggering the AWS rate-limiter. This is a known problem with jclouds; see issues 364 and 365.

The simplest way to discover these is by looking at their names. In general they will begin with the string jclouds# followed by a unique identifier, for example:

jclouds#0eca9048cc1e47659736c236fbbdb6c2
jclouds#03230716fff7412cb7241c78de86e99d#eu-west-1

To get rid of them, we can simply enumerate all key-pairs and security groups, select those whose names match this pattern, and delete them. This is simple enough, but I worried about deleting objects that were in use. As it turns out, this is not a problem for key-pairs: they are only needed at the time the instance is created, and after that it is the responsibility of the user wishing to connect to retain the correct keys. For security groups, the API will not allow deletion of a group in use by an active instance. This means that, as long as jclouds is not currently creating any VMs, we are free to try to delete things based on name alone.
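The selection step can be tried out without touching AWS at all. The following is a minimal sketch using only the JDK; the class name NameFilterSketch, the method matching and the non-jclouds sample names are made up for illustration and are not part of the utility:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class NameFilterSketch {
    // Returns the names that match the regular expression in full.
    static List<String> matching(Iterable<String> names, String regexp) {
        Pattern pattern = Pattern.compile(regexp);
        List<String> result = new ArrayList<String>();
        for (String name : names) {
            if (pattern.matcher(name).matches()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The first two names are the examples above; the others are hypothetical
        List<String> names = Arrays.asList(
                "jclouds#0eca9048cc1e47659736c236fbbdb6c2",
                "jclouds#03230716fff7412cb7241c78de86e99d#eu-west-1",
                "default",
                "production-web");
        // Prints only the two names that start with jclouds#
        for (String name : matching(names, "jclouds#.*")) {
            System.out.println(name);
        }
    }
}
```

Only the names that survive this step would be candidates for deletion, so it is worth checking a custom pattern this way before pointing it at a real account.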

We will use Maven to manage dependencies, which also lets us generate an executable Jar file that includes all required run-time libraries. The current version of jclouds is 1.5.8, and the AWS EC2 provider is added as a dependency to the POM file like this:

<dependency>
    <groupId>org.jclouds.provider</groupId>
    <artifactId>aws-ec2</artifactId>
    <version>1.5.8</version>
</dependency>

We will use only those libraries included by jclouds; however, that includes both Google Guava 13 and SLF4J, which should be all we need for this project.

Code

To connect to AWS using EC2 in jclouds we need a RestContext object, which is created as follows:

ImmutableSet<Module> modules = ImmutableSet.<Module>of(new SLF4JLoggingModule());
RestContext<EC2Client, EC2AsyncClient> context = ContextBuilder
                .newBuilder("aws-ec2")
                .credentials(identity, credential)
                .modules(modules)
                .build();

This context object allows access to the EC2Client interface which in turn will expose the KeyPairClient and SecurityGroupClient EC2 APIs. Since we are not using any Amazon-specific API calls, we only need the generic EC2 API clients.

The key-pair and security group clean up code will look almost identical, so I will show only the logic used for key-pairs. First, we need to find all key-pairs in a specific region:

Set<KeyPair> keys = keyPairApi.describeKeyPairsInRegion(region);

This gives us a collection of KeyPairs. To delete a key-pair we need to know only its name and region, so we can now use the Guava collections framework to transform this into a set of names:

// Lazily map each KeyPair to its name
Iterable<String> names = Iterables.transform(keys, new Function<KeyPair, String>() {
    @Override
    public String apply(@Nullable KeyPair input) {
        return input.getKeyName();
    }
});

Then, we select the names matching our regular expression:

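// Keep only the names that match the regular expression in full
Iterable<String> filtered = Iterables.filter(names, new Predicate<String>() {
    @Override
    public boolean apply(@Nullable String input) {
        return input.matches(regexp);
    }
});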

This can be done even more succinctly using the Guava Predicates class, and the containsPattern(String) predicate. However, this predicate uses pattern.matcher(input).find() to check if the input contains the specified pattern. To match on the whole string, we must anchor the regular expression to the start and end of the string using ^pattern$ as follows:

Iterables.filter(names, Predicates.containsPattern("^" + regexp + "$"));
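The difference between find() and a whole-string match is easy to see with plain java.util.regex. The sketch below uses only the JDK; the class and method names are hypothetical. It also shows one caveat of simple concatenation: if the configured expression contains a top-level alternation, ^ and $ each bind to only one branch, so wrapping the expression in a non-capturing group is the safer way to anchor it.

```java
import java.util.regex.Pattern;

public class AnchorSketch {
    // Substring search, which is what a find()-based predicate does
    static boolean contains(String regexp, String input) {
        return Pattern.compile(regexp).matcher(input).find();
    }

    // Anchoring by plain concatenation, as in "^" + regexp + "$"
    static boolean containsNaivelyAnchored(String regexp, String input) {
        return Pattern.compile("^" + regexp + "$").matcher(input).find();
    }

    // Anchoring around a non-capturing group, so alternations stay together
    static boolean containsAnchored(String regexp, String input) {
        return Pattern.compile("^(?:" + regexp + ")$").matcher(input).find();
    }

    public static void main(String[] args) {
        String simple = "jclouds#.*";
        String alternation = "jclouds#.*|test-.*";

        System.out.println(contains(simple, "old-jclouds#abc"));                 // true
        System.out.println(containsAnchored(simple, "old-jclouds#abc"));         // false
        System.out.println(containsNaivelyAnchored(alternation, "my-test-1"));   // true
        System.out.println(containsAnchored(alternation, "my-test-1"));          // false
        System.out.println(containsAnchored(alternation, "test-1"));             // true
    }
}
```

For the default jclouds#.* pattern the two anchoring styles behave identically; the group only matters once users supply their own expressions.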

This final filtering gives an Iterable with the names of the key-pairs to delete. We can then iterate through this, deleting each one with another jclouds API call:

for (String name : filtered) {
    keyPairApi.deleteKeyPairInRegion(region, name);
}
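As written, this loop stops at the first exception, for example when a security group is still in use. One way to keep going is to catch failures per item and report them afterwards. This is a sketch of that idea, not the code from the repository: the Deleter interface is a hypothetical stand-in for a call such as keyPairApi.deleteKeyPairInRegion(region, name), and the log format is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SafeDeleteSketch {
    // Hypothetical stand-in for a jclouds delete call
    interface Deleter {
        void delete(String name) throws Exception;
    }

    // Attempts every deletion and returns the names that could not be deleted
    static List<String> deleteAll(Iterable<String> names, Deleter deleter) {
        List<String> failed = new ArrayList<String>();
        for (String name : names) {
            try {
                deleter.delete(name);
            } catch (Exception e) {
                // Record the failure and carry on with the remaining names
                System.err.println("[WARN] Could not delete " + name + ": " + e.getMessage());
                failed.add(name);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("jclouds#a", "jclouds#b", "jclouds#c");
        // Simulate one object that cannot be deleted
        List<String> failed = deleteAll(names, new Deleter() {
            @Override
            public void delete(String name) {
                if (name.endsWith("b")) {
                    throw new IllegalStateException("in use");
                }
            }
        });
        System.out.println("Deleted " + (names.size() - failed.size()) + ", failed " + failed);
    }
}
```

Returning the failed names, rather than just counting them, makes it easy to log exactly which objects need a second look.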

Putting these blocks of code together for both key-pairs and security groups, along with some logging to indicate how things went and some error checking to prevent a single failure stopping the whole process, gives us the Ec2CleanUp.java class. The class also contains the simple command-line parsing to obtain the region and regular expression to use. The defaults are to use the AWS Europe region and match jclouds objects:

private static final String AWS_EUROPE = "eu-west-1";
private static final String JCLOUDS_NAME_REGEXP = "jclouds#.*";
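The argument handling around these defaults might look roughly like the following. This is a sketch under assumptions rather than the code from the repository: the region as the first argument and the aws-ec2.identity and aws-ec2.credential system properties follow the quick fix section, but taking the regular expression as an optional second argument, and the names ArgsSketch and parse, are guesses for illustration.

```java
public class ArgsSketch {
    private static final String AWS_EUROPE = "eu-west-1";
    private static final String JCLOUDS_NAME_REGEXP = "jclouds#.*";

    // Returns {region, regexp}, falling back to the defaults for missing arguments.
    // The argument order is an assumption.
    static String[] parse(String... argv) {
        String region = argv.length > 0 ? argv[0] : AWS_EUROPE;
        String regexp = argv.length > 1 ? argv[1] : JCLOUDS_NAME_REGEXP;
        return new String[] { region, regexp };
    }

    public static void main(String... argv) {
        String[] options = parse(argv);
        // Credentials are passed with -D on the command line
        String identity = System.getProperty("aws-ec2.identity");
        String credential = System.getProperty("aws-ec2.credential");
        if (identity == null || credential == null) {
            System.err.println("Set -Daws-ec2.identity and -Daws-ec2.credential");
            return;
        }
        System.out.println("Region " + options[0] + ", pattern '" + options[1] + "'");
    }
}
```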

Summary

The quick fix section at the start of this post shows how to download and build the program, and the README file in the GitHub repository gives multiple usage examples:

grkvlt/ec2cleanup

The code is licensed under the MIT License, so you can do whatever you want with it. Please fork the repository and issue a pull request if you have any fixes or improvements, and let me know in the comments here.

Hopefully this is a useful utility if you have problems with too many jclouds objects in your AWS account, and will help tidy things up. It should also serve as a template guide for building other jclouds-based programs to accomplish simple tasks in the cloud.