Developers and roboticists alike use a variety of tools to monitor and diagnose remote systems. One such tool is Amazon CloudWatch, a monitoring and management service that collects performance and operational data, in the form of logs and metrics, on a single platform. AWS RoboMaker's CloudWatch extensions are open source Robot Operating System (ROS) packages that upload log and metric data to the cloud from remote robots. Many of these robotic systems operate at the edge, where Internet access is unreliable or inconsistent, resulting in interruptions to network connectivity to the cloud. The new offline data caching capability in the RoboMaker CloudWatch ROS extensions provides resilience to these interruptions and improves data durability.

This blog post introduces the AWS RoboMaker CloudWatch Logs and Metrics ROS nodes, launched last year at re:Invent 2018, and presents the newly added offline data caching capability. We will go over the ROS node implementations, the new offline caching behavior, and how to run the nodes on a ROS system.

How it works

The RoboMaker cloudwatch_logger ROS node enables ROS-generated logs to be sent to Amazon CloudWatch Logs. Out of the box, this node subscribes to the /rosout_agg topic, where all ROS logs are aggregated, and uploads them to the Amazon CloudWatch Logs service. Logs can be sent to Amazon CloudWatch Logs selectively based on log severity. The cloudwatch_logger node can also subscribe to other topics if logs are published elsewhere, and it can unsubscribe from the /rosout_agg topic if that topic is not needed for obtaining logs.
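These options live in the node's YAML configuration. The snippet below is only a sketch of what such a configuration might look like; the parameter names here are illustrative, so check the package README and shipped sample_configuration.yaml for the exact schema:

```yaml
# Illustrative cloudwatch_logger configuration -- parameter names are
# examples only; verify them against the package's sample_configuration.yaml.
sub_to_rosout: true                    # subscribe to /rosout_agg
min_log_verbosity: WARN                # drop logs below this severity
topics: ["/my_node/log"]               # additional log topics to upload
log_group_name: "robot_application_name"
log_stream_name: "device_name"
```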

The RoboMaker cloudwatch_metrics_collector ROS node publishes your robot's metrics to the cloud, enabling you to easily track the health of a fleet of robots with automated monitoring and actions for when a robot's metrics show abnormalities. You can also easily track historical trends and profile behavior such as resource usage. Out of the box, it provides a ROS interface that receives ROS monitoring messages and publishes them to Amazon CloudWatch Metrics. All you need to do to get started is set up AWS credentials and permissions for your robot. The CloudWatch Metrics node can be used with any ROS node that publishes the ROS monitoring message; this message can carry any custom data structure you wish to publish from your own node implementation. For example, the AWS ROS1 Health Metrics Collector periodically measures system CPU and memory usage and publishes the data as metrics using the ROS monitoring message.
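Inferring from the fields used in the rostopic examples later in this post, the monitoring message carried in a ros_monitoring_msgs/MetricList looks roughly like the sketch below; see the ros_monitoring_msgs package for the authoritative definition:

```
# MetricList.msg (sketch -- see ros_monitoring_msgs for the real definition)
MetricData[] metrics

# MetricData.msg (sketch)
std_msgs/Header header
string metric_name
string unit
float64 value
time time_stamp
MetricDimension[] dimensions   # each dimension: string name, string value
```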

Figure 1: Data flow from a robot or edge device to Amazon CloudWatch.

Offline caching

Reliable network connectivity is necessary for the normal operation of services hosted in the cloud. However, for a variety of systems operating on the edge, network connectivity cannot be guaranteed. In order to avoid data loss in such an event, we have added offline data caching for the AWS RoboMaker CloudWatch Logs and Metrics extensions. If a network outage or other issue occurs during uploading, data in flight is saved to disk to be uploaded later (see Figure 1 above for an overview of the data flow).

When network connectivity is restored, the most recent data on disk is uploaded first. When all cached data in a file has been uploaded, that file is deleted. An important part of this feature is that the total disk space used by offline files, their location on disk, and the individual file size are all configurable parameters that can be specified in a ROS YAML configuration file. If the configured disk space limit is reached, the oldest cached data is deleted first. For details, please see the project GitHub READMEs (logs and metrics) and Figure 2 below.
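The upload-order and eviction policy can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the actual implementation, and the class and parameter names are hypothetical:

```python
# Illustrative sketch of the offline cache policy: newest data is retried
# first on reconnect, and when the configured limit is reached, the oldest
# cached data is evicted first. Names here are hypothetical.
from collections import deque

class OfflineCache:
    def __init__(self, max_files):
        self.max_files = max_files  # stands in for the configurable disk-space limit
        self.files = deque()        # oldest on the left, newest on the right

    def store(self, batch):
        """Save a batch that failed to upload."""
        if len(self.files) >= self.max_files:
            self.files.popleft()    # over the limit: drop the oldest cached data
        self.files.append(batch)

    def drain(self, upload):
        """On reconnect, retry the newest batch first; delete each on success."""
        while self.files:
            batch = self.files.pop()        # newest first
            if not upload(batch):
                self.files.append(batch)    # still offline; keep for later
                break

cache = OfflineCache(max_files=3)
for batch in ["b1", "b2", "b3", "b4"]:
    cache.store(batch)                      # "b1" is evicted when "b4" arrives

sent = []
cache.drain(lambda b: sent.append(b) or True)
print(sent)                                 # ['b4', 'b3', 'b2']
```

Note that the real nodes cache to files on disk and bound total disk usage, not a fixed file count; the deque here simply makes the newest-first/oldest-evicted ordering concrete.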

Figure 2: Offline caching data flow to CloudWatch.

Installation and example

Prerequisites

The AWS RoboMaker CloudWatch ROS nodes are currently supported on ROS Kinetic (Ubuntu 16.04) and Melodic (Ubuntu 18.04), with planned support for ROS2 (CloudWatch-Metrics-ROS-2 & CloudWatch-Logs-ROS-2). To run these nodes you'll need a working ROS installation, sourced in your current shell. A Docker image with a working ROS installation can also be used.

AWS credentials

Important: Before you begin, you must have the AWS CLI installed and configured.

The cloudwatch_logger node requires the following IAM permissions:

[logs:PutLogEvents, logs:DescribeLogStreams, logs:CreateLogStream, logs:CreateLogGroup]

To grant these permissions to an IAM role, create a file named cloudwatch-iam-policy.json containing the following JSON policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:PutLogEvents",
        "logs:DescribeLogStreams",
        "logs:CreateLogStream",
        "logs:CreateLogGroup"
      ],
      "Resource": [
        "arn:aws:logs:*:*:*"
      ]
    }
  ]
}

Then use this AWS CLI command to create the policy (remember to also attach it to the IAM role or user your robot uses):

aws iam create-policy --policy-name cloudwatch-logs-policy --policy-document file://cloudwatch-iam-policy.json

If successful, the create-policy command will return a JSON confirmation similar to:

{
    "Policy": {
        "PolicyName": "cloudwatch-logs-policy",
        "PolicyId": "ANPAZOAGRWAWQOE6MYHEE",
        "Arn": "arn:aws:iam::648553803821:policy/cloudwatch-logs-policy",
        "Path": "/",
        "DefaultVersionId": "v1",
        "AttachmentCount": 0,
        "IsAttachable": true,
        "CreateDate": "CURRENT-TIME",
        "UpdateDate": "CURRENT-TIME"
    }
}

The cloudwatch_metrics_collector node requires the cloudwatch:PutMetricData IAM permission.
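Mirroring the logs policy above, a minimal policy document for the metrics node might look like the following (cloudwatch:PutMetricData does not support resource-level restrictions, so the resource is a wildcard):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudwatch:PutMetricData"],
      "Resource": "*"
    }
  ]
}
```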

Install CloudWatch ROS nodes via apt

sudo apt-get update
sudo apt-get install -y ros-$ROS_DISTRO-cloudwatch-logger
sudo apt-get install -y ros-$ROS_DISTRO-cloudwatch-metrics-collector

Note: You can also build from source.

Running the CloudWatch ROS nodes

Logging node instructions

With launch file using parameters in .yaml format (example provided in package source and repo):

roslaunch cloudwatch_logger sample_application.launch --screen

Without a launch file, using default values:

rosrun cloudwatch_logger cloudwatch_logger

Send a test log message:

rostopic pub -1 /rosout rosgraph_msgs/Log '{header: auto, level: 2, name: test_log, msg: test_cloudwatch_logger, function: test_logs, line: 1}'

Verify that the test log message was successfully sent to CloudWatch Logs:

  • Sign in to your AWS account.
  • Open the CloudWatch console.
  • In the upper-right corner, change the region to Oregon if you launched the node using the launch file (region: "us-west-2"), or to N. Virginia if you launched the node without the launch file.
  • Select Logs from the left menu.
  • With the launch file: the log group should be named robot_application_name and the log stream device_name (example with launch file below).
  • Without the launch file: the log group should be named ros_log_group and the log stream ros_log_stream.

After sending a few logs via the command line, CloudWatch’s logs console will look similar to this:

CloudWatch logs console example.

Note: To understand the configuration parameters, please see the parameter descriptions and the sample configuration file.

Sending logs from your own node

As long as your node publishes to /rosout_agg, logs will be automatically picked up.

Here’s an example with a turtlesim node:

After launching the cloudwatch_logger node as directed in the previous example, run the turtlesim nodes below in separate terminals.

rosrun turtlesim turtlesim_node

rosrun turtlesim turtle_teleop_key

TurtleSim CloudWatch test example.

Note: If you have a topic that publishes messages of type rosgraph_msgs::Log, you can also upload them to CloudWatch by adding the topic to the topics list in sample_configuration.yaml.
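For instance, adding a custom log topic to the topics list might look like the sketch below (the topic name is made up; check sample_configuration.yaml for the exact layout):

```yaml
# Upload logs from an additional topic alongside /rosout_agg
topics: ["/rosout_agg", "/my_robot/custom_logs"]
```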

Metrics node instructions

With a launch file using parameters in .yaml format (example provided in package source and repo):

roslaunch cloudwatch_metrics_collector sample_application.launch --screen

Send a test metric:

rostopic pub /metrics ros_monitoring_msgs/MetricList '[{header: auto, metric_name: "ExampleMetric", unit: "sec", value: 42, time_stamp: now, dimensions: []}, {header: auto, metric_name: "ExampleMetric2", unit: "count", value: 22, time_stamp: now, dimensions: [{name: "ExampleDimension", value: "1"}]}]'

Note: See the GitHub repo to understand the configuration file and parameters.

You can also use the bash script below to send a stream of metrics messages:

./metrics-script.sh metric_script_pub_demo 1 24

After publishing 24 sample metrics with the script, the CloudWatch Metrics console will look something like this:

CloudWatch metrics console example.

An example shell script to publish multiple metrics messages:

#!/bin/bash
# Simple script to publish a range of test metrics to the /metrics topic.

if [ $# -eq 0 ] || [ "$1" == "-h" ] || [ "$1" == "--help" ]; then
    echo "This script will publish a ros_monitoring_msgs/MetricList to /metrics."
    echo "Please provide the metric name, metric value lower bound, and metric value upper bound."
    echo "Example: $0 my_demo_metric 1 42"
    exit 0
fi

if [ -z "$1" ]; then
    echo "Please provide the metric name"
    exit 1
fi

if [ -z "$2" ]; then
    echo "Please provide the metric value lower bound"
    exit 1
fi

if [ -z "$3" ]; then
    echo "Please provide the metric value upper bound"
    exit 1
fi

echo "Publishing metrics with metric_name=$1, value start=$2, value end=$3"

for i in $(seq "$2" "$3")
do
  echo "Publishing metric $1 $i"
  rostopic pub -1 /metrics ros_monitoring_msgs/MetricList "[{header: auto, metric_name: '$1', unit: 'sec', value: $i, time_stamp: now, dimensions: [{name: 'Example-Script-Pub-Dimension', value: '$i'}]}]"
done

Summary

In this post, we reviewed how you can publish log and metric data from a ROS node to Amazon CloudWatch, explained how the new offline caching feature works, and walked through hands-on examples of running the nodes. This is a rich set of features that both developers and robot fleet managers can use to log, review, and experiment with robot-generated metrics and logs. We hope you found this post useful. For any improvements or feature requests, please file a ticket in our logs or metrics repositories; if you would like to contribute, open a pull request and we will review it!

from AWS Open Source Blog