Rain Toolkit: An Open Source toolkit for Amazon EC2 administration

Juliano Viana

LogicStyle

Table of Contents

Introduction
Installation and Basic Concepts
Installation
Managing Virtual Machines
Creating Virtual Machines
Virtual Machine startup
Managing running Virtual Machines
Terminating Virtual Machines
Managing Volumes
Creating or labeling volumes
Attaching and detaching volumes
Describing volumes and snapshots
Working with snapshots and backups
Managing IP access permissions

Introduction

Amazon EC2 is a cloud computing service provided by Amazon.com.

Amazon already provides a set of tools to manage EC2 resources. They allow stopping/starting instances, allocating ip addresses and volumes, attaching volumes etc.The problem is , when one starts using the tools to manage day-to-day operations like starting/stopping servers, it becomes clear that the Amazon tools are too fine-grained therefore not very productive.

For instance, starting a server involves calling many different tools: one for running the server instance, one for assigining ip addresses, one to attach EBS volumes etc.Of course one can script this operation, but it is not as easy as it seems: all EC2 operations are asynchronous, which means requests return immediately but may take effect at some point in the future. So at the end shell scripts that use Amazon command-line tools end up very long and complicated and, worse of all, unreliable.Another problem with the command line tools provided by Amazon is that there is no way to label your EC2 objects (instances, volumes, snapshots etc). You have to deal with object ids, which are not intended to be human-readable ( like i-c4c026ac or vol-a5d83ecc)

RainToolkit solves all that by providing a set of command line tools based on the concept of virtual machine, a concept familiar to anyone who has used any virtualization software (VirtualBox, VMWare etc.). The RainToolkit commands allow you to manipulate virtual machines and its attachable components (volumes and elastic IP addresses).

Data for virtual machine and volume configuration is stored in Amazon Simple DB.

Installation and Basic Concepts

Rain toolkit operates on the basic concept of a Virtual Machine (VM). A VM is a set of related EC2 resources (machine image, elastic ip address, volumes etc.) that for administrative purposes is managed as a unit.

A VM has a name and when it is running it also has an associated EC2 instance id. All operations refer to the VM by name.

Rain Toolkit also enables naming of EBS Volumes. Volume operations can refer to the volume name or to the EC2 volume id.

Installation

The most recent RainToolkit distribution can be downloaded from the toolkit web page (http://www.logicstyle.com/raintoolkit.html). The toolkit will work on any Unix-like operating system which has Java 6 installed.(Windows support is forthcoming).

Make sure the java executable can be located in the command path:

java -version
java version "1.6.0_17"Java(TM) SE Runtime 
Environment (build 1.6.0_17-b04-248-10M3025)Java HotSpot(TM) Client VM (build 14.3-b01-101, mixed mode)

Unzip the distribution file in a directory of your choice. Before the toolkit can be use a few environment variables need to be set:

  • RAIN_HOME: this variable should be set to the directory where the toolkit is installed
  • AWS_ACCESS_ID: this variable should be set to your EC2 account access id.
  • AWS_SECRET_KEY: this variable should be set to your EC2 account secret key

You will also need to make a copy of your AWS ssh access keys to the directory $RAIN_HOME/keys, with the name '<key name>.identity'. For instance, if you have a keypair called 'gsg-keypair' you need to save its private key ot the keys directory with the name 'gsg-keypair.identity'

You can test the installation as follows:

export RAIN_HOME=<install dir> 
export AWS_ACCESS_ID=<your aws access id>
export AWS_SECRET_KEY=<your aws secret key> 
cd $RAIN_HOME
./bin/describe-virtual-machines 

Managing Virtual Machines

Creating Virtual Machines

You can create virtual machines with the create-virtual-machine command. It accepts the following arguments:

-name (-n) [String] Virtual machine name
-newName [String] New name  
-image (-i) [String] Virtual machine image id  -availbilityZone (-z) [String] Availability zone 
-groups (-g) [String[,]] Virtual machine security groups, comma-separated 
-kernel (-k) [String] Kernel id  
-ramDisk (-r) [String] Ramdisk id  
-userData (-u) [String] User data  
-staticIpAddress (-s) [String] Static ip address  
-key (-h) [String] SSH key id  
-currentInstanceId (-c) [String] Current instance id  -modify [flag]  
-instanceType (-t) [String] Instance type

In the simplest possible form a Virtual Machine needs a name and a corresponding Amazon Machine Image (AMI). You can use public or private AMIs. You will also probably want to specify a registered ssh key for the VM to use for authentication purposes and specify an instance type:

create-virtual-machine -n myVM -i ami-xxxxxx -key my-key-pair -t SMALL

If the virtual machine you want to create is already running you can associate it with an existing EC2 instance id using the -c flag. You can also optionally associate an elastic IP address and a security group with the Virtual Machine.

Virtual Machine startup

Starting a VM is as easy as:

start-virtual-machine -n <vm name>

This command will start the virtual machine as follows:

  • It will check if the machine is not already running
  • If needed, it will choose an availability zone for the Virtual Machine to start in (based on the VM configuration or on the attached EBS volumes )
  • It asks EC2 to start an instance of the configured AMI
  • Once the instance is running, it asks EC2 to associate the Elastic IP address if present in the VM configuration. It will wait for the association to complete before proceeding.
  • It attaches any EBS volumes associated with the VM, and waits for the attachment to complete before proceeding
  • If there are EBS volumes associated with the VM it then executes SSH comands to mount the EBS volumes in their mount points (authenticating using the identity file for the key unde the 'keys' directory).
  • If there is an auto run command configured for this VM, it will then execute the command using SSH as root
  • It will print to stdout the DNS name of the newly started instance to the standard output and exit with zero status
  • If there is any error during the startup procedure it will print an error message to the standard error output and exit with a non-zero status

After the command completes successfuly the VM is ready for using. Since the tool prints the EC2 public dns name to stdout your shell scripts can capture this information in order to connect to the machine even if its not using an Elastic IP, for example:

set VM_ADDRES=`start-virtual-machine -n myVM`

Managing running Virtual Machines

The command describe-virtual-mahines can be used to display VM status information:

describe-virtual-machines

The describe-virtual-machines command can also be used with the -a flag , which will show detailed information about a single or all virtual machines:

Name                  Uptime        DnsName                                         InternalIpAddress     InstanceId  
test                  none          none                                            none                  none       
website               205d22:20     ec2-174-129-201-163.compute-1.amazonaws.com     10.248.241.236        i-c4c026ac     

Virtual Machine status can also be queried through the vm-get-status command. This command is intended to be used in shell scripts - it will exit with zero status if the virtual machine is running, and non-zero status if its not. In case the VM is running it will also write the public DNS name associated with the VM to the standard output.

Terminating Virtual Machines

To terminate a virtual machine, use the terminate-virtual-machine command:

terminate-virtual-machine -n <machine name>

This command requests the EC2 instance associated to the machine to be terminated and disassociate the instance from the VM.

Managing Volumes

Rain toolkit enables labeling an management of EC2 volumes. Volumes can be associated with Virtual Machines and can be attached and mounted automatically upon Virtual Machine startup.

Creating or labeling volumes

To create a volume, use the create-volume command. Tt accepts the following arguments

-name (-n) [String] Volume name
-size (-s) [Integer] Volume size in Gb
-volume (-i) [String] Existing volume id
-availabilityZone (-z) [String] Availability zone
-snapshot [String] Snapshot id - populates the volume with the given snapshot data  
-newName [String] New volume name (modify only)

Attaching and detaching volumes

EBS volumes are most useful when they are attached to a Virtual Machine. Volumes can be associated ith Virtual Machines through the attach-volume command, which takes the arguments:

-volume (-v) [String] Volume name
-virtualMachine (-n) [String] Virtual machine name
-device (-d) [String] Attach device name  
-mountPoint (-m) [String] Mount point (if the volume should be automatically mounted)  
-mountDevice (-a) [String] Mount device name

Volumes associated with Virtual Machines can be attached and optionally mounted during the startup of the virtual machine. An attachment created with the -m (mount point) option will be automatically mounted, otherwise the volume will just be attahed to the Virtual Machine. This is useful if your instance startup scripts already take care of mounting volumes.

Volumes can be detached using the detach-volume command.

Both attach-command and detach-command commands do not modify the attachment status of running virtual machines, they only modify the metadata information used during VM startup.

Describing volumes and snapshots

Volumes can be described using the describe-volumes :

describe-volumes
VolumeName            VolumeId         CreationTime                 Size     Status        AvailabilityZone     AttachedSince                AttachedTo     AttachedDevice  
website-data          vol-f63cd69f     2009-08-12T11:26:45.000Z     10       in-use        us-east-1b           2009-08-17T16:34:11.000Z     website        /dev/sdh           
website-data-2010     vol-a376a5ca     2010-03-05T09:09:56.000Z     50       in-use        us-east-1b           2010-03-05T09:10:19.000Z     website        /dev/sdi           
backup                vol-549d793d     2008-12-11T23:05:24.000Z     20       available     us-east-1a           n/a                          n/a                  

Attached volumes are also shown by the describe-virtual-machine command when used with the -a argument:

describe-virtual-machines -n website -a
websiteName        Uptime        DnsName      InternalIpAddress     InstanceId     AvailabilityZone     AMI              KeyPair         Kernel      Ramdisk     InstanceType:     StaticIpAddress     SecurityGroups     UserData     AutoRunCommand
website     205d23:14
ec2-174-129-201-163.compute-1.amazonaws.com     10.248.241.236        i-c4c026ac     us-east-1b           ami-a18262c8     gsg-keypair     default     default     SMALL             174.129.201.163     default            none         none               
Volumes
Name                  VolumeId         Device       MountDevice     MountPoint
website-data-2010     vol-a376a5ca     /dev/sdi     /dev/sdi        /data          

Working with snapshots and backups

The command create-snapshot can be used for creating a volume snapshot. describe-snapshot can then be used to describe the status of existing snapshots:

describe-snapshot
SnapshotId        Volume           Status        Progress     StartTime
snap-6619720f     pair-data        completed     100%         2009-12-22T09:45:48.000Z     
snap-c51a71ac     website-data     completed     100%         2009-12-22T09:44:36.000Z     

The command backup-volume is a convenience command for making periodic backups of EBS volumes. It accepts the following arguments:

-id [String] Volume id
-volume (-n) [String] Volume name
-retentionPeriod (-r) [String] Maximum retention period. For instance: 1h (one hour), 2w (2 weeks), 3d (3 days)

The command works as follows: it immediately creates a snapshot of the specified volume and deletes all snapshots of the volume that are older than the specified retention period.

It is indented to be used in cron jobs. For instance, the command below, if run every day, will create a daily snapshot of the chosen volume and will keep the snapshots for 7 days:

backup-volume -n myVolume -r 7d

Managing IP access permissions

The Amazon EC2 security model includes an inbound firewall that can be used to limit access to running instances. This security model allows virtual machines to be managed securely over the Internet but if the managing end has a dynamic IP address then managing the access permissions can become painful.

To help manage this situation, RainToolkit allows the creation of named dynamic ip addresses. These permissions associated with these named addresses can then be updated with a single command.

To create a named dynamic ip address, use the create-dynamic-ip-address command as in the example below:

create-dynamic-ip-address -n myIpName -v wwww.xxxx.yyyy.zzzz

Existing dynamic addresses can be listed with the describe-dynamic-ip-addresses command.

Once a named ip address is created it can be updated with the update-dynamic-ip-address command. This command will take the new ip address provided (or detect the current IP address automatically using the dyndns.org checkip service), search the security rules in all groups for occurrences of the old value of this ip address and replace it with the new value.

Notice that this scheme will only work if there are security rules where the ip address value for this dynamic address appears as a single value, not as part of a network.