2014-11-15

Custom Sensors in Brooklyn

One of the features of Apache Brooklyn is the ability to retrieve data from a running entity using sensors. These sensors expose data from the entity as attributes containing information like queue depth for a message broker, latency for an HTTP server or CPU usage for a Docker host. The data can also be enriched or aggregated to produce sums or moving averages across multiple entities in a cluster, and used as input to policies to drive scaling and resilience mechanisms. The sources for these sensors are varied, and encompass JMX attributes for Java applications, fields from XML or JSON documents returned by RESTful APIs, parsed output from shell commands and many more.

Sometimes the entities provided by Brooklyn do not have the particular piece of data you need for a policy exposed as a sensor. In these circumstances it is possible to dynamically add sensors, either programmatically in a Java entity class that extends the default Brooklyn code, or in the YAML blueprint used to describe the application. There are three categories of sensor that can be added: SSH command output, JSON data from an HTTP URL, or JMX attributes from a Java entity. Each of these is configured differently, although there is some commonality.

First, we will look at an example using the SshCommandSensor class to add sensors driven by the output of shell commands invoked over SSH to the virtual machine running the software process being managed. The following blueprint shows a TomcatServer entity and a brooklyn.initializers section adding a sensor to it.

name: Tomcat SSH Sensor
services:
- serviceType: brooklyn.entity.webapp.tomcat.TomcatServer
  name: Tomcat
  location: jclouds:aws-ec2:eu-west-1
  brooklyn.initializers:
  - type: brooklyn.entity.software.ssh.SshCommandSensor
    brooklyn.config:
      name: tomcat.cpustats
      command: "mpstat | grep all"
  brooklyn.config:
    pre.install.command: "sudo apt-get install -y sysstat"

The output of this sensor can be seen in the following screenshot. The SSH command mpstat | grep all is executed to generate information on CPU usage, which is then published as the tomcat.cpustats sensor.

Problems using SSH

However, when trying to perform some calculations on the data from mpstat, using the following YAML fragment to add another sensor, my colleague Richard discovered that the code did not behave as expected. Although the SSH commands appeared to run correctly, the sensor data was always empty.

  - type: brooklyn.entity.software.ssh.SshCommandSensor
    brooklyn.config:
      name: tomcat.cpustats.broken
      command: >
        mpstat |
        awk '$2=="all" { print $3+$4+$5+$6+$7+$8+$9+$10 }'

After some investigation, he located the cause of this problem, which is obscure enough that I have decided to document it here for future reference. I will quote from Richard's email on the subject:

The problem is that the LANG environment variable is different between me running SSH in a terminal to test out potential commands, and when Brooklyn is opening SSH sessions to run its commands.

When I was experimenting with finding the right command to run, I would SSH to a Linux box and run variations on my command until I came up with a working version. This SSH session would inherit my workstation's LANG of en_GB.UTF-8. When I ran mpstat, here is a typical line of output:

16:13:15    all   5.40   0.05   2.18   0.52   0.09   0.03   0.00   0.00  91.74

Having got a working command, I plugged it into my blueprint and let Brooklyn run the command. Unfortunately, Brooklyn (probably) does not set the LANG environment variable, and this particular Linux machine chose a default of en_US.UTF-8. When it ran mpstat, here is the equivalent line of output:

04:13:11 PM all   5.40   0.05   2.18   0.52   0.09   0.03   0.00   0.00  91.74

Notice that the date has changed from 24-hour form, to 12-hour with AM/PM suffix. Also note that there is a space before the "PM" suffix - causing all of my awk field numbers to now be off-by-one. D'oh!

So if the output of a command you intend to use with awk, perl or even just cut includes potentially locale-specific items, explicitly set the LANG variable to prevent any surprises in formatting that will throw off your parsing routines. In this case, the fix is to prefix the command in the blueprint with LANG=en_US.UTF-8 and to use awk field numbers that match that locale's output.

  brooklyn.initializers:
  - type: brooklyn.entity.software.ssh.SshCommandSensor
    brooklyn.config:
      name: cpu.load
      command: >
        LANG=en_US.UTF-8 mpstat |
        awk '$3=="all" {print $4+$5+$6+$7+$8+$9+$10+$11}'
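
To see the field shift concretely, here is a small Python sketch that mimics awk's default whitespace splitting on the two mpstat lines quoted in Richard's email. The function names are hypothetical and only illustrate the problem; the last function shows one possible alternative, locating the "all" column rather than assuming its position.

```python
# Illustrative only: the two sample lines are the mpstat outputs quoted above.
EN_GB_LINE = "16:13:15    all   5.40   0.05   2.18   0.52   0.09   0.03   0.00   0.00  91.74"
EN_US_LINE = "04:13:11 PM all   5.40   0.05   2.18   0.52   0.09   0.03   0.00   0.00  91.74"


def awk_fields(line):
    """Split on runs of whitespace, as awk does by default."""
    return line.split()


def busy_percent_fixed_columns(line, cpu_column):
    """Mimic the awk program: require "all" in the 1-based cpu_column,
    then sum the eight fields that follow it."""
    fields = awk_fields(line)
    if len(fields) < cpu_column + 8 or fields[cpu_column - 1] != "all":
        return None  # awk would print nothing, so the sensor stays empty
    values = fields[cpu_column:cpu_column + 8]
    return round(sum(float(v) for v in values), 2)


def busy_percent_locale_proof(line):
    """Alternative: find "all" wherever it is instead of assuming a column."""
    fields = awk_fields(line)
    if "all" not in fields:
        return None
    start = fields.index("all") + 1
    return round(sum(float(v) for v in fields[start:start + 8]), 2)


if __name__ == "__main__":
    print(busy_percent_fixed_columns(EN_GB_LINE, 2))  # column 2 works for en_GB
    print(busy_percent_fixed_columns(EN_US_LINE, 2))  # but finds "PM" for en_US
    print(busy_percent_fixed_columns(EN_US_LINE, 3))  # column 3 works for en_US
    print(busy_percent_locale_proof(EN_US_LINE))
```

Pinning LANG, as in the blueprint above, is the simpler fix because the command output itself becomes predictable.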

Other examples of SSH sensors might return the contents of various files in /proc, or execute an administration command for an entity to return information. The sensor can be configured to poll at specific intervals, and the output can be coerced to different types, as required. To change the poll frequency, set the period configuration key to the time required, either in milliseconds or using a suffix to indicate minutes or seconds, for example 10m for ten minutes or 5s for five seconds. The sensor type is set using the targetType configuration key, and can be either the name of a primitive type or a fully qualified Java class name. The following example returns the available disk space as the disk.available integer sensor, every five minutes.

  brooklyn.initializers:
  - type: brooklyn.entity.software.ssh.SshCommandSensor
    brooklyn.config:
      name: disk.available
      command: "df -P / | awk 'NR==2 { print $4 }'"
      period: 5m
      targetType: Integer
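
As a rough illustration of what the period and targetType keys mean, the Python sketch below converts period strings to milliseconds and coerces raw command output to a target type. This is not Brooklyn's own parser: the helper names, the set of supported type names and the String default are assumptions made for the example, and only the bare-number, s and m forms described above are handled.

```python
import re

# Multipliers for the suffixes mentioned in the text; a bare number is milliseconds.
_UNIT_MILLIS = {"": 1, "s": 1000, "m": 60 * 1000}

# Hypothetical subset of type names, for illustration only.
_COERCERS = {"String": str, "Integer": int, "Double": float}


def parse_period(text):
    """Return the period in milliseconds for values such as 500, "5s" or "10m"."""
    match = re.fullmatch(r"\s*(\d+)\s*([sm])?\s*", str(text))
    if match is None:
        raise ValueError(f"unrecognised period: {text!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_MILLIS[unit or ""]


def coerce_output(raw_output, target_type="String"):
    """Strip the trailing newline from command output and convert it."""
    return _COERCERS[target_type](raw_output.strip())


if __name__ == "__main__":
    print(parse_period("5m"))
    print(coerce_output("46188516\n", "Integer"))
```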

Other Sensor Types

The dynamic sensor addition in Brooklyn is not limited to running SSH commands, and there are currently two other mechanisms available. If you have an entity that is running a Java program with JMX enabled, Brooklyn is able to retrieve attributes and convert them into sensor data. Using the JmxAttributeSensor in the same way as for SSH sensors, we can add a dynamic JMX sensor. For example, this YAML snippet adds the LoadedClassCount attribute from the java.lang:type=ClassLoading JMX object as the loaded.classes sensor, refreshing every thirty seconds.

  - type: brooklyn.entity.software.java.JmxAttributeSensor
    brooklyn.config:
      name: loaded.classes
      objectName: "java.lang:type=ClassLoading"
      attribute: "LoadedClassCount"
      targetType: Integer
      period: 30s

This JMX object is part of the JVM management information beans, and should be available on every Java application. To access application-specific attributes, Brooklyn must already be able to access JMX data on the entity, which will normally be the case if the entity implements the UsesJmx interface. This means that Brooklyn will be able to determine the JMX and RMI or JMXMP ports that are being used, as well as any authentication details that are required, and these are re-used when adding sensors like this. The same period and targetType keys are used to configure the sensor polling and return value coercion, just as for SSH sensors.

Finally, it is possible to access and parse JSON data from an HTTP-based REST API on an entity. This uses the JSONPath expression language to extract parts of a JSON document, which are then returned as the sensor data. An example of the YAML required is shown below. Again, name, period and targetType have the same meanings as for the other sensor types. The uri key configures the endpoint to access, with optional username and password credentials (in future, a map of HTTP headers and other features will be added). If a status of 200 is returned, the content will be assumed to be a JSON document, and the jsonPath key is used to extract some part of the data as the sensor value. Here we are simply retrieving the value of the counter field, but it is possible to perform much more sophisticated queries, although this is beyond the scope of this post.

  - type: brooklyn.entity.software.http.HttpRequestSensor
    brooklyn.config: 
      name: json.sensor
      period: 1m
      targetType: Integer
      jsonPath: "$.counter"
      uri: >
        $brooklyn:formatString("http://%s:%d/info.json",
        component("web").attributeWhenReady("host.name"),
        component("web").attributeWhenReady("http.port"))
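
The flow described above (check the status, parse the body as JSON, apply the path, coerce the result) can be sketched in a few lines of Python. This is a simplified stand-in rather than Brooklyn's implementation: the extract helper only understands plain $.a.b.c paths, whereas real JSONPath supports far more, and returning None for a non-200 status or a missing key is an assumption made for the example.

```python
import json


def extract(document, json_path):
    """Resolve a simple '$.a.b.c' path against parsed JSON (dict keys only)."""
    if not json_path.startswith("$"):
        raise ValueError("path must start with '$'")
    node = document
    # "$.counter" -> ["", "counter"]; drop the empty leading segment.
    for key in filter(None, json_path[1:].split(".")):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def sensor_value(status, body, json_path, coerce=int):
    """Return the coerced value at json_path, or None if nothing usable came back."""
    if status != 200:
        return None  # assumption: anything other than 200 yields no value
    value = extract(json.loads(body), json_path)
    return None if value is None else coerce(value)


if __name__ == "__main__":
    print(sensor_value(200, '{"counter": 42}', "$.counter"))
```

Note that key lookups are case-sensitive, so the path must match the field names in the document exactly.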

To give a (contrived) example of how dynamic sensors might be used in an application, imagine that you have a cluster of Couchbase nodes that you wish to scale. Unfortunately, the current Brooklyn blueprint already exposes all the useful sensor data you might practically need when resizing the cluster, so we must turn to impractical and useless data instead. Imagine you need to resize based on the amount of disk space used, and ignore for the moment that this is not the right way to scale a cluster. The following blueprint is intended to illustrate the ways in which dynamic sensor data can be used as the input to a policy. The following JSON fragment is part of the data returned by a REST call to the /pools/default endpoint, which returns cluster details. We are interested in the usedByData entry, showing hard disk space used by data.

{
    "storageTotals": {
        "hdd": {
            "free": 46188516230, 
            "quotaTotal": 56327458816, 
            "total": 56327458816, 
            "used": 10138942586, 
            "usedByData": 34907796
        }
    }
}

The blueprint below shows how an AutoScalerPolicy might be configured to use this information. We have created a new couchbase.storageTotals.usedByData sensor, which connects to the cluster REST API endpoint specified by the uri, username and password. This returns the pool information, and the jsonPath selector extracts the $.storageTotals.hdd.usedByData path, which is coerced to an integer based on the targetType configuration. The scaling policy uses a $brooklyn:sensor(...) directive to configure its metric key; this looks up our dynamic sensor, whose value is then compared to the lower and upper bounds to decide whether to resize the cluster. This pattern can obviously be used in your own blueprints to much more useful effect!

name: Couchbase Policy Example
location: jclouds:softlayer:lon02
services:
- type: brooklyn.entity.nosql.couchbase.CouchbaseCluster
  id: couchbase
  adminUsername: Administrator
  adminPassword: Password
  initialSize: 3
  createBuckets:
  - bucket: "default"
    bucket-port: 11211
  brooklyn.initializers:
  - type: brooklyn.entity.software.http.HttpRequestSensor
    brooklyn.config:
      name: couchbase.storageTotals.usedByData
      targetType: Integer
      period: 1m
      jsonPath: "$.storageTotals.hdd.usedByData"
      uri: >
        $brooklyn:formatString("%s/pools/default",
        $brooklyn:entity("couchbase").attributeWhenReady("couchbase.cluster.connection.url"))
      username: Administrator
      password: Password
  brooklyn.policies:
  - policyType: brooklyn.policy.autoscaling.AutoScalerPolicy
    brooklyn.config:
      metric: >
        $brooklyn:sensor("brooklyn.entity.nosql.couchbase.CouchbaseCluster",
        "couchbase.storageTotals.usedByData")
      metricLowerBound: 10000000
      metricUpperBound: 50000000
      minPoolSize: 1
      maxPoolSize: 5
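
To make the role of the bounds concrete, here is a deliberately simplified Python sketch of a threshold check using the numbers from the blueprint and the JSON fragment above. The real AutoScalerPolicy logic is not described in this post and is certainly more involved; the function name and the grow/shrink/hold results are hypothetical.

```python
def scaling_decision(metric, lower_bound, upper_bound, current_size, min_size, max_size):
    """Compare a metric with its bounds and suggest a resize direction."""
    if metric > upper_bound and current_size < max_size:
        return "grow"
    if metric < lower_bound and current_size > min_size:
        return "shrink"
    return "hold"


if __name__ == "__main__":
    # Values taken from the example: usedByData from the JSON fragment,
    # bounds and pool sizes from the blueprint.
    used_by_data = 34907796
    print(scaling_decision(used_by_data, 10000000, 50000000,
                           current_size=3, min_size=1, max_size=5))
```

With the sample value of 34907796 sitting between the two bounds, this sketch would leave the three-node cluster as it is.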

Hopefully these examples have given you an idea of the capabilities available when designing Brooklyn blueprints. The aim of Brooklyn is to simplify the autonomic management of applications in the cloud, so the ability to build blueprints from pre-defined components and then extend them without needing to write code is core. It is never possible to anticipate every piece of information that users might need in their business logic while building policies for elasticity, scaling or resilience. When creating a blueprint for an application, you can dynamically add sensors to retrieve information from entities using SSH, JMX or HTTP, and wire those sensors into Brooklyn's policy framework.

More detailed documentation is available, and further information can be found at the main Apache Brooklyn site or on GitHub.