
Docker: Ready! Steady! Deploy!


How often have you had to set up a server environment to deploy your application? Probably more often than you would like.

At best, you had a script performing all the operations automatically. At worst, you had to:

  • Install database D of version x.x.x;
  • Install web server N of version x.x;
  • and so on…

Over time, managing an environment configured this way becomes very resource-intensive. Any minor change in configuration means, at a minimum, that:

  • every member of a development team should be aware of this change;
  • this change should be safely added to the production environment.

It is challenging to track changes and manage them without special tools. One way or another, problems with the configuration of environment dependencies arise. As development continues, it becomes more and more challenging to find and fix bottlenecks.

I have described what is called vendor lock-in. This phenomenon becomes quite a challenge in the development of applications, in particular, server-type ones.

In this article, we will consider one of the possible solutions — Docker. You will learn how to create, deploy, and run an application based on this tool.


Disclaimer: This is not a review of Docker. This is a first entry point for developers who plan to deploy Node.js applications using Docker containers.

While developing one of our projects, my team faced a lack of detailed articles on the topic, which greatly complicated our work. This post was written to smooth the path for fellow developers who follow in our footsteps.

What is Docker?

Simply put, Docker is an abstraction over LXC containers. This means that processes launched using Docker will see only themselves and their descendants. Such processes are called Docker containers.

Docker images (docker image) are used to create such containers: you can configure containers and create new ones based on existing Docker images.

There are thousands of ready-made Docker images with pre-installed databases, web servers, and other important elements. Another advantage of Docker is that it is a very economical (in terms of memory consumption) tool since it uses only the resources it needs.

Let’s Get Closer

We will not dwell on the installation. Over the past few releases, the process has been simplified to a few clicks/commands.

In this article, we will analyze the deployment of a Docker application using the example of a server-side Node.js app. Here is its primitive source code:

// index.js

const http = require('http');

const server = http.createServer(function(req, res) {
  res.write('hello world from Docker');
  res.end();
});

server.listen(3000, function() {
  console.log('server in docker container is started on port : 3000');
});
We have at least two ways to package the application in a Docker container:

  • create and run a container using an existing image and the command-line interface (CLI) tool;
  • create your own image based on a ready sample.

The second method is used more often.

To get started, download the official Node.js image:

docker pull node

The “docker pull” command downloads a Docker image. After that, you can use the “docker run” command. This way you will create and run the container based on the downloaded image.

docker run -it -d --rm -v "$PWD":/app -w /app -p 80:3000 node node index.js

This command will launch the index.js file, map a 3000 port to an 80 port, and display the ID of the created container. Much better! But you will not go far with CLI only. Let’s create a Dockerfile for a server.

FROM node

WORKDIR /app

COPY . /app

CMD ["node", "index.js"]

This Dockerfile describes the parent image, the working directory for container commands, and the command for copying files from the directory where the image build is launched. The last line indicates which command will run in the created container.

Next, we need to build an image that we will deploy based on this Dockerfile: docker build -t username/helloworld-with-docker:0.1.0 . (note the trailing dot, which sets the build context). This command creates a new image, marks it with username/helloworld-with-docker, and gives it the 0.1.0 tag.

Our image is ready. We can run a container from it with the docker run command. The vendor lock-in problem is solved. The launch of the application is no longer dependent on the environment. The code is delivered along with the Docker image. These two criteria allow us to deploy the application to any place where we can run Docker.


This first part is not as tricky as the remaining steps.

After we have completed all the instructions above, the deployment process itself becomes a matter of technique and your development environment. We will consider two options for deploying Docker:

  • Manual deployment of Docker image;
  • Deployment using Travis-CI.

In each case, we will consider delivering the image to an independent environment, for example, a staging server.

Manual Deployment

This option should be chosen if you do not have a continuous integration environment. First, you need to upload the Docker image to a place accessible to the staging server. In our case, it is DockerHub. It provides each user with a free private image repository and an unlimited number of public repositories.

Log in to access the DockerHub:

docker login -e [email protected] -u username -p userpass

Upload the image:

docker push username/helloworld-with-docker:0.1.0

Next, go to the staging server (Docker must already be installed on it). To deploy the application on the server, we need to execute only one command: docker run -d --rm -p 80:3000 username/helloworld-with-docker:0.1.0. And that’s all!

Docker first checks the local registry of images. If it does not find the image there, it checks the DockerHub registry for username/helloworld-with-docker. An image with that name must be in the registry, since we have already uploaded it there. Docker will download it, create a container based on it, and launch your application in it.

From this moment, every time you need to update the version of your application, you can make a push with a new tag and just restart the container on the server.
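That update cycle can be sketched in a few commands. This is only a sketch, assuming the container was started with a hypothetical --name helloworld and that 0.2.0 is the new tag:

```shell
# Build and push a new version (tag and image name are illustrative)
docker build -t username/helloworld-with-docker:0.2.0 .
docker push username/helloworld-with-docker:0.2.0

# On the staging server: replace the running container with the new tag
docker stop helloworld
docker run -d --rm --name helloworld -p 80:3000 username/helloworld-with-docker:0.2.0
```

Naming the container on the first run is what makes the later stop/restart a one-liner.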

P.S. This method is not recommended if Travis-CI is available.

Deployment Using Travis-CI

First, we should add our DockerHub credentials to Travis-CI. They will be stored in environment variables.

travis encrypt [email protected]

travis encrypt DOCKER_USER=username

travis encrypt DOCKER_PASS=password

Then we should add the received keys to the .travis.yml file. We should also add a comment to each key in order to distinguish between them in the future.



env:
  global:
    - secure: "UkF2CHX0lUZ...VI/LE=" # DOCKER_EMAIL
    - secure: "Z3fdBNPt5hR...VI/LE=" # DOCKER_USER
    - secure: "F4XbD6WybHC...VI/LE=" # DOCKER_PASS

Next, we need to login and upload the image:


  - docker login -e $DOCKER_EMAIL -u $DOCKER_USER -p $DOCKER_PASS

  - docker build -f Dockerfile -t username/hello-world-with-travis .

  - docker tag username/hello-world-with-travis username/hello-world-with-travis:0.1.0

  - docker push username/hello-world-with-travis

Also, image delivery can be launched from Travis-CI in various ways:

  • manually;
  • via SSH connection;
  • online deploy services (Deploy Bot, deployhq);
  • AWS CLI;
  • Kubernetes;
  • Docker deployment tools.
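As a sketch of the SSH option, the whole delivery can come down to one remote command; the host, user, and container name below are hypothetical:

```shell
# Pull the freshly pushed image on the staging server and restart the container
ssh deploy@staging.example.com "\
  docker pull username/helloworld-with-docker:0.1.0 && \
  (docker stop helloworld || true) && \
  docker run -d --rm --name helloworld -p 80:3000 username/helloworld-with-docker:0.1.0"
```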


We have considered two ways (manual and via Travis-CI) of preparing and deploying Docker images using the example of a simple Node.js server. The knowledge gained should greatly simplify your work. Thanks for your time!

from DZone Cloud Zone

Fault Tolerance And Redundancy For Cloud Computing


Cloud computing is now the foundation of many business solutions for a number of significant business-critical reasons, one of which is the fact that cloud computing—and the cloud ecosystem as a whole—is a lot more reliable (and easier to maintain) than on-premise hardware implementations. Clusters of servers that form today’s best cloud environments are designed from the ground up to offer higher availability and reliability.

AWS, for example, when leveraged to its fullest capabilities, offers high availability. You also have the option to set up redundancies and take the reliability of your cloud environment to a whole other level. It is worth noting, however, that hardware failure is still a risk to mitigate, even with Amazon’s robust AWS ecosystem. This is where fault tolerance and redundancy are needed.

Fault Tolerance vs. Redundancy

Before we can discuss how a cloud environment can be made more reliable, we need to first take a closer look at the two approaches to do so: increasing fault tolerance and redundancy. The two are often seen as interchangeable or completely similar, but fault tolerance and redundancy actually address different things.

Redundancy is more for components or hardware. Adding multiple EC2 instances so that one can continue serving your microservice when the other powers down due to a hardware issue is an example of using redundancy to increase your solution’s availability. Servers, disks, and other components are made with multiple redundancies for better reliability as a whole.

Fault tolerance, on the other hand, focuses more on systems. The cloud computing networks, your S3 storage buckets, and even Amazon’s own services such as Elastic Load Balancing (ELB) are made to be fault-tolerant, but that doesn’t mean all AWS services and components are the same. You still need to treat services like EC2 seriously to increase availability.

AWS Isn’t Perfect

One big advantage of using AWS when it comes to availability and fault tolerance is the fact that you can leverage Amazon’s experience in handling disasters and recovering from them. Just like other cloud service providers, Amazon deals with outages caused by power problems, natural disasters, and human error all the time. They’ve gotten good at it, too.

Amazon’s 99.5% SLA is about as good as it gets. You can still expect several hours of downtime every year, but that’s not a bad standard at all. In fact, Amazon is leading the market with better redundancy and multiple layers of protection to maintain availability.

Nevertheless, it is worth noting that AWS isn’t perfect. Relying entirely on the robustness of AWS and its services isn’t the way to create a reliable and robust cloud ecosystem. You still need to configure the different instances correctly and use services to strengthen your system from the core. Fortunately, there are a number of things you can do.

Improving Reliability

Fault tolerance in cloud computing is a lot like the balance of your car. You can still drive the car—with limitations, of course—if one of the tires is punctured. A fault-tolerant system can do the same thing. Even when one or some of its components stop working due to an error, the system can still deliver its functions or services to a certain extent.

Designing a system for maximum fault tolerance isn’t always easy, but AWS offers a number of tools that can be used to boost reliability, starting with AMI. When you begin setting up the system by creating an AMI that works, you have the option to start a new instance using the same AMI should another fail.

Another way to add fault tolerance is by using EBS, or Elastic Block Store. Problems associated with your drives running out of storage space can be effectively mitigated using EBS. EBS also allows you to attach the same volume to different EC2 instances (one at a time), meaning you can switch from a failing EC2 instance to another without switching storage.
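A minimal sketch of that switch with the AWS CLI, assuming placeholder volume and instance IDs:

```shell
# Detach the EBS volume from the failing instance...
aws ec2 detach-volume --volume-id vol-0abc1234567890
# ...and attach it to the replacement instance under a device name
aws ec2 attach-volume --volume-id vol-0abc1234567890 \
  --instance-id i-0def1234567890 --device /dev/sdf
```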

Since all configurations and data are stored in the same EBS volume, you basically keep the system running despite replacing the EC2 instance it uses. You can take this a step further by introducing an Elastic IP address, which can be remapped from one EC2 instance to another; this eliminates the need for DNS zone updates or reconfiguration of your load-balancing node.
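The remapping itself is a single AWS CLI call; the allocation and instance IDs here are placeholders:

```shell
# Re-point the Elastic IP at the replacement instance; clients keep using the same address
aws ec2 associate-address --allocation-id eipalloc-0abc123 \
  --instance-id i-0def456 --allow-reassociation
```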

A System That Scales

One more thing to note about fault tolerance and redundancy in AWS: you have plenty of auto-scale options to utilize. Many AWS services can now be scaled up (or down) automatically. The others support scaling when triggered.
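As one hedged example of those auto-scale options, an Auto Scaling group that keeps a minimum number of instances running can be created from the CLI (all names and zones below are placeholders):

```shell
# Keep at least 2 instances alive; allow scaling up to 6 across two availability zones
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name app-asg \
  --launch-configuration-name app-lc \
  --min-size 2 --max-size 6 --desired-capacity 2 \
  --availability-zones us-east-1a us-east-1b
```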

Combined with services like EBS, you can create a fault-tolerant system rather easily in AWS. Adding multiple redundancies to further support the system will result in a capable and immensely reliable system.

With the market being as competitive as it is today, offering highly available services to users becomes an important competitive advantage. Downtime is unacceptable when your competitors are always available. Increasing fault tolerance and adding redundancies are how you stay ahead of the market. For more on optimizing your cloud ecosystem on AWS, read our article Optimizing DevOps and the AWS Well-Architected Framework.

This post was originally published here.

from DZone Cloud Zone

Calling Lambda Functions Through AWS API Gateway


In recent times, most people are moving towards FaaS (Functions-as-a-Service). This article shows how to write a Lambda service in AWS and call it through the AWS API Gateway.

How to Write a Lambda Function Using Java

  • Create a Java Maven project.
  • Edit the pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="">

  • Create a class named APIRequestHandler:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;

import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class APIRequestHandler implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent apiGatewayProxyRequestEvent, Context context) {
        APIGatewayProxyResponseEvent apiGatewayProxyResponseEvent = new APIGatewayProxyResponseEvent();
        try {
            String requestString = apiGatewayProxyRequestEvent.getBody();
            JSONParser parser = new JSONParser();
            JSONObject requestJsonObject = (JSONObject) parser.parse(requestString);
            String requestMessage = null;
            String responseMessage = null;
            if (requestJsonObject != null && requestJsonObject.get("requestMessage") != null) {
                requestMessage = requestJsonObject.get("requestMessage").toString();
            }
            // Echo the request message back in the response body
            Map<String, String> responseBody = new HashMap<String, String>();
            responseBody.put("responseMessage", requestMessage);
            responseMessage = new JSONObject(responseBody).toJSONString();
            generateResponse(apiGatewayProxyResponseEvent, responseMessage);
        } catch (ParseException e) {
            context.getLogger().log("Failed to parse request body: " + e.getMessage());
        }
        return apiGatewayProxyResponseEvent;
    }

    private void generateResponse(APIGatewayProxyResponseEvent apiGatewayProxyResponseEvent, String responseMessage) {
        apiGatewayProxyResponseEvent.setHeaders(Collections.singletonMap("timeStamp", String.valueOf(System.currentTimeMillis())));
        apiGatewayProxyResponseEvent.setStatusCode(200);
        apiGatewayProxyResponseEvent.setBody(responseMessage);
    }
}

How to Create a Lambda Function in AWS

Sign in to the AWS account and go to the Lambda service under “Services.”

According to the above image, we can start creating the function from scratch. There are several runtime environments and I chose Java 8 for this example.

We need to have permission for the lambda function. For that, I have created a new role with the basic lambda permission.

In order to create a lambda function, you need to upload the build jar.

  • Go to the project home and build the Maven project using  mvn clean install .
  • Upload the jar located inside the target folder.
  • Fill in the handler info.
  • Save the function.
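The console steps above can also be done from the AWS CLI. This is a sketch: the role ARN, jar path, memory, and timeout are placeholders, and the handler assumes the APIRequestHandler class shown earlier sits in the default package:

```shell
# Create the function from the built jar (values are illustrative)
aws lambda create-function \
  --function-name api-request-handler \
  --runtime java8 \
  --role arn:aws:iam::123456789012:role/lambda-basic-execution \
  --handler APIRequestHandler::handleRequest \
  --memory-size 512 --timeout 30 \
  --zip-file fileb://target/your-artifact.jar
```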

Now the Lambda function is ready to be invoked via the API Gateway. The best way to secure Lambda functions is to call them through a gateway, so we use the AWS API Gateway to secure the Lambda function; it also provides API management.

How to Configure API Gateway

Drag and drop the API Gateway from the left panel. After that, you can configure the API Gateway.

In this configuration, we can define the API as a new API and can provide the security with the API key. After creating a new API you can see the API details with the API key.

Start the serverless function using the AWS API gateway.

The above image shows how testing is done and its integration. This is the default integration of the AWS environment.

The test client, which is provided by AWS, is very good and we can control all input parameters for our function. We can select the method, pass the query params, change the headers, change the request body and so on. They also provide the request and response logs and details related to the response such as HTTP status, response time, response body and headers.

We can call this function using curl or Postman from our local PC (the endpoint URL, elided here, is shown in the API details):

curl '<api-endpoint-url>' \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json' \
  -H 'x-api-key: *****************************' \
  -d '{ "requestMessage": "Sidath"}'

from DZone Cloud Zone

Capital One Data Hack Leads to AWS Lawsuit


The recent hack of Capital One data hosted on Amazon Web Services Inc. (AWS) infrastructure has led to multiple lawsuits, with AWS the target of at least one.

Capital One last month announced the “data security incident” in which a person identified in news reports as a former AWS engineer obtained the personal information of customers and others who applied for credit cards.

The data was stored on AWS infrastructure, a continuing problem for the company even though well-publicized data breaches and exposures are typically found to be caused primarily by user misconfigurations, rather than any inherent cloud platform flaws.

For example, even though the Capital One breach was actively instigated by an individual, a “firewall misconfiguration” was partially blamed for exposing the data to attack.

GeekWire last week reported on the resulting lawsuits, noting that a plaintiff group “also named Amazon Web Services, Capital One’s cloud provider, alleging the tech giant is also culpable for the breach.”

GitHub, the open source code repository and development platform, was also named in a suit for allegedly failing to monitor and respond to hacked data on its site, GeekWire said, providing this summary:

The new lawsuit, filed this week in federal court in Seattle, is unique because it includes Amazon as a defendant. It argues that Amazon knew about a vulnerability allegedly exploited by the hacker, Seattle-based engineer Paige Thompson, to pull off the attack and “did nothing to fix it.” The alleged attacker, a former AWS employee, hacked into a misconfigured Web application firewall.

“The single-line command that exposes AWS credentials on any EC2 system is known by AWS and is in fact included in their online documentation,” according to the complaint. “It is also well known among hackers.”

Newsweek reported that AWS denied any responsibility for the hack. “An Amazon Web Services spokesperson told Newsweek: ‘AWS was not compromised in any way and functioned as designed. The perpetrator gained access through a misconfiguration of the Web application and not the underlying cloud-based infrastructure. As Capital One explained clearly in its disclosure, this type of vulnerability is not specific to the cloud.’ ”

The Capital One hack was nevertheless bad news for AWS, which for years has been plagued by reports of exposed data usually caused by misconfigurations, no matter how much security guidance it publishes to avoid such issues.

About the Author

David Ramel is the editor of Visual Studio Magazine.

from News

AWS Revamps Slow-Loading PowerShell Tools


Slow load times for the growing number of Windows PowerShell cmdlets available on the Amazon Web Services Inc. (AWS) cloud platform have resulted in a revamp, currently in preview.

AWS Tools for PowerShell let PowerShell-using AWS customers manage their cloud resources the same way they manage Windows, Linux and MacOS environments.

AWS provides two modules for PowerShell: AWSPowerShell for PowerShell v2 through v5.1 on Windows; and AWSPowerShell.NetCore for PowerShell v6 or higher on Windows, macOS and Linux.

However, after the number of cmdlets in each AWS PowerShell module increased from about 550 in the 2012 debut to nearly 6,000, problems began to appear.

“First, the import time of the modules has grown significantly,” wrote Steve Roberts, senior technical evangelist at AWS, in a recent blog post. “On my 8th Gen Core i7 laptop the time to import either module has grown beyond 25 seconds. Second, the team discovered an issue with listing all of the cmdlets in the module manifests and subsequently had to revert to specifying ‘*’ for the CmdletsToExport manifest property. This prevents PowerShell from determining the cmdlets in the modules until they are explicitly imported, impacting tab completion of cmdlet names.

“In my shell profile I use the Set-AWSCredential and Set-DefaultAWSRegion cmdlets to set an initial scope for my shells. Thus I have to first explicitly import the module and then wait for the shell to become usable. This slow load time is obviously unsustainable, even more so when writing AWS Lambda functions in PowerShell when we particularly want a fast startup time.”

The answer was a new set of modules offered in preview in order to solicit developer feedback before the new scheme takes effect. That new scheme places each individual AWS service in its own PowerShell module, with all of them depending on a common shared module called AWS.Tools.Common. This is the same modular approach used for AWS SDK for .NET.

Roberts said the benefits of this approach include:

  • Instead of downloading and installing a single large module for all services, users can now install only the modules for the services they actually need. The common module will be installed automatically when a service-specific module is installed.
  • Users no longer need to explicitly import any of the preview modules before use, as the CmdletsToExport manifest property for each module is now properly specified.
  • The versioning strategy for the new modules currently follows the AWSPowerShell and AWSPowerShell.NetCore modules. The strategy is detailed on the team’s GitHub repository notice for the preview and AWS welcomes feedback on it.
  • Shell startup time is fast! On the same system Roberts noted earlier the load time for his command shells is now between 1 and 2 seconds on average. “The only change to my shell profile was to remove the explicit module import,” he said.

Also, some old or obsolete cmdlets have been removed from the tools package. The older modules will remain operative and will be updated along with the preview modules while the AWS team ensures backwards compatibility, but the two module sets can’t be used at the same time.

The new modules are available in the PowerShell Gallery. More details can be found in a GitHub issue.

About the Author

David Ramel is the editor of Visual Studio Magazine.

from News

AWS Step Functions 101


We’ll begin with a simple explanation of what Step Functions are. AWS Step Functions is a service from Amazon Web Services whose primary goal is to solve the problems that often arise in orchestrating complex flows of Lambda functions.

In most cases, a process is made up of several different tasks. To run the whole process serverless, you need to create a Lambda function for every task, and after you’ve created them, you need to run them via your orchestrator.

You need to write code that orchestrates these functions, and writing such code well is not an easy task at all, because of the optimization and debugging it requires.

The AWS Step Functions service makes your life easier by removing the need for all this with its simple design, while also implementing complex flows for your tasks or functions.

In short, AWS Step Functions will make it easy to manage and coordinate the distributed application’s components along with microservices all thanks to the utilization of the visual workflow.

Why Should You Use Orchestration for Your Application Design?

Consider this — breaking your application into many pieces (service components) allows your system to keep on going even though a single element has failed. Each of these components can scale independently, and there’s no need to redeploy the entire system after every change because every component can be updated on its own.

Scheduling, managing execution dependencies, and concurrency within the logical flow of the application are all involved in the coordination of service components. In applications like this, the developers are able to use service orchestration to handle failures as well.

How Does Step Functions Work?

By using Step Functions, you define state machines that describe your workflow as a series of steps, their relationships, and their inputs and outputs. State machines consist of a number of states, each representing an individual step of a workflow diagram. States can pass parameters, make choices, perform work, manage timeouts, terminate your workflow with success or failure, and initiate parallel execution. The visual console automatically graphs every state in execution order.

All of the above makes it very easy to design multi-step apps, because the console highlights the status of each step in real time and provides a detailed history of every execution.
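Under the hood, a state machine is defined in Amazon States Language (JSON). A minimal sketch with a single Pass state, using a placeholder name and role ARN:

```shell
# Create a trivial one-state machine (name and ARN are placeholders)
aws stepfunctions create-state-machine \
  --name hello-workflow \
  --role-arn arn:aws:iam::123456789012:role/step-functions-role \
  --definition '{
    "StartAt": "SayHello",
    "States": {
      "SayHello": { "Type": "Pass", "Result": "Hello from Step Functions", "End": true }
    }
  }'
```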

Connecting Step Function to Other AWS Services

By utilizing service tasks, you can connect and coordinate the workflow you’ve created via Step Functions with other AWS services. What else can you do? Here are some examples:

  • You can run an Amazon Fargate or Amazon ECS task;

  • You can submit an AWS Batch job and wait for it to complete;

  • You can publish a message to an SNS topic;

  • You can create an Amazon SageMaker job to train a machine learning model;

  • You can invoke an AWS Lambda function;

  • You can send a message to an Amazon SQS queue;

  • You can also place a new item in a DynamoDB table or obtain an existing item from it.

Step Functions — Monitoring & Logging

How does monitoring work with Step Functions? Step Functions sends metrics to AWS CloudTrail and Amazon CloudWatch for monitoring applications. CloudTrail captures all API calls for Step Functions as events, including calls from the Step Functions console and from code calls to the Step Functions APIs.

CloudWatch can collect and track metrics, set alarms, and automatically react to any changes that occur in Step Functions. Step Functions supports CloudWatch Events managed rules for every integrated service inside your workflow, and it creates and manages those CloudWatch Events rules within your Amazon Web Services account.
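For instance, here is a sketch of a CloudWatch alarm on the ExecutionsFailed metric; the state machine ARN and SNS topic are placeholders:

```shell
# Alert an SNS topic whenever any execution of the state machine fails
aws cloudwatch put-metric-alarm \
  --alarm-name step-functions-failures \
  --namespace AWS/States \
  --metric-name ExecutionsFailed \
  --dimensions Name=StateMachineArn,Value=arn:aws:states:us-east-1:123456789012:stateMachine:hello-workflow \
  --statistic Sum --period 300 --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts
```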

The Most Frequent Step Functions’ Use Cases

You can use Step Functions to create an end-to-end workflow which will allow you to manage jobs with interdependencies. The whole idea of Step Functions is to help you solve any business process or computational problem which can be divided into a series of steps. Here are some of the examples:

  • Implementing complex sign-on authentication and user registration processes for web applications;

  • Creating event-driven apps that automatically respond to infrastructure changes, and building tools for continuous deployment and integration. These are crucial for IT automation and DevOps;

  • Processing data: building unified reports by consolidating data from multiple databases, reducing and refining large data sets into a more manageable file format, or coordinating multi-step analytics and machine learning workflows.

Let’s Wrap Step Functions Up

Primarily, Step Functions can help us achieve higher performance rates by allowing us to break down our application into service components and manipulate them all independently.

The AWS Step Functions service is relatively new, and like everything else from AWS, it will only get better as time goes by and surely become more useful. Discover the Step Functions service for yourself. If you have experience with AWS Step Functions, share it with our readers and us in the comment section below. Feel free to start a discussion!

from DZone Cloud Zone

End-to-End Tests: Managing Containers in Kubernetes


Our infrastructure is more complex than ever, and there is greater pressure to deliver quality features to customers on time. To meet these needs, automated end-to-end tests play an important role in our continuous integration and delivery process. Let’s look at how we can execute these tests in a container within a Kubernetes cluster on Google Cloud Kubernetes Engine.

As software applications transition towards a microservices architecture and platforms become more cloud-native, these shifts have changed how development teams build and test software. Each component of the application is packaged in its own container.

To scale and manage these containers, organizations are turning to orchestration platforms like Kubernetes. Kubernetes is a well-known open-source orchestration engine for automating deployment, scaling, and management of containerized applications at scale, whether they run in a private, public, or hybrid cloud.


Building the Testing Container Image

We start by building our tests, which are written in TestNG using Selenium WebDriver, into a container image. The image includes all the test files, libraries, drivers, and a properties file, as well as the Shell script to start the tests.

Below, you’ll find some sample code that should give you a sense of how you can structure and configure your testing, including snippets from the following files:


FROM centos:7.3.1611

# Directory where the automation tests are installed (inferred from WORKDIR below)
ENV test_dir /opt/automation

RUN yum install -y java-1.8.0-openjdk
ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk/

# Set locale to UTF-8
RUN localedef -i en_US -f UTF-8 en_US.UTF-8

# Install automation tests
COPY ./build/libs ${test_dir}/bin
COPY ./build/lib ${test_dir}/lib
COPY ./testfiles ${test_dir}/testfiles
COPY ./build/resources ${test_dir}/resources
COPY ./build/drivers ${test_dir}/drivers
RUN chmod -R 755 ${test_dir}/drivers
COPY ./*.properties ${test_dir}/

WORKDIR /opt/automation
USER root

CMD [ "./", "TestSuite" ]


#!/bin/bash

# run test suite
function listOfSuites() {
  for i in $(echo $1 | sed "s/,/ /g"); do
    # call your procedure/other scripts here below
    echo "$i"
    fullPath=`find . -type f -name "$i.xml"`
    #echo "fullList=$fullList"
    #echo "fullPath=$fullPath"
    fullList="$fullList$fullPath "
  done
}

listOfSuites $1
echo $fullList

java -Dlog4j.configuration=resources/main/ -cp "./lib/*:./bin/*:." \
  org.testng.TestNG $fullList

java -cp "./lib/*:./bin/*:." com.gcp.UploadTestData


buildscript {
    repositories { maven { url "${nexus}" } }
    dependencies {
        classpath 'com.bmuschko:gradle-docker-plugin:3.6.2'
    }
}

apply plugin: com.bmuschko.gradle.docker.DockerRemoteApiPlugin

import com.bmuschko.gradle.docker.tasks.image.DockerBuildImage

def gcpLocation = ''
def gcpProject = 'automation'
def dockerRegistryAndProject = "${gcpLocation}/${gcpProject}"

// Provide a lazily resolved $project.version to get the execution time value, which may include -SNAPSHOT
def projectVersionRuntime = "${-> project.version}-${buildNumber}"
def projectVersionConfigurationTime = "${project.version}-${buildNumber}"
def projectRpm = "${projectName}-${projectVersionConfigurationTime}.${arch}.rpm"

// We want to use the branch name as part of the GCR tag. However, we don't want the raw branch name,
// so we strip out symbols and non-alphanumerics. We also strip out git branch text that contains
// remotes/origin or origin/, since we don't care about that.
def sanitize = { input ->
    return input.replaceAll("[^A-Za-z0-9.]", "_").toLowerCase().replaceAll("remotes_origin_", "").replaceAll("origin_", "")
}

def gitbranchNameRev = 'git name-rev --name-only HEAD'.execute().text.trim()
def gcpGitbranch = System.env.GIT_BRANCH ?: (project.hasProperty('gitbranch')) ? "${gitbranch}" : "${gitbranchNameRev}"
def gitbranchTag = sanitize(gcpGitbranch)

def projectVersionRuntimeTag = sanitize("${-> project.version}")
def dockerTag = "${dockerRegistryAndProject}/${projectName}:${projectVersionRuntimeTag}-${buildNumber}-${gitbranchTag}-${githash}"
def buildType = System.env.BUILD_NUMBER ? "JENKINS" : "LOCAL"

// Create a file containing build information. This is for the build environment to pass on to
// other upstream callers that are unable to figure out this information on their own.
task versionProp() {
    onlyIf { true }
    doLast {
        new File("$project.buildDir/").text = """APPLICATION=${projectName}
VERSION=${-> project.version}
TIMESTAMP=${new Date().format('yyyy-MM-dd HH:mm:ss')}
"""
    }
}

// make sure the version file generation is always run after build
build.finalizedBy versionProp

task dockerPrune(type: Exec) {
    description 'Run docker system prune --force'
    group 'Docker'

    commandLine 'docker', 'system', 'prune', '--force'
}

task buildDockerImage(type: DockerBuildImage) {
    description 'Build docker image locally'
    group 'Docker'
    dependsOn buildRpm
    inputDir project.buildDir

    buildArgs = [
        'rpm': "${projectRpm}",
        'version': "${projectVersionConfigurationTime}"
    ]

    doFirst {
        // Copy the Dockerfile to the build directory so we can limit the context provided to the docker daemon
        copy {
            from 'Dockerfile'
            into "${project.buildDir}"
        }
        copy {
            from 'docker'
            into "${project.buildDir}/docker"
            include "**/*jar"
        }
        println "Using the following build args: ${buildArgs}"
    }

    // This block will get the execution time value of $project.version, which may include -SNAPSHOT
    tag "${dockerTag}"
}

task publishContainerGcp(type: Exec) {
    description 'Publish docker image to GCP container registry'
    group 'Google Cloud Platform'
    dependsOn buildDockerImage

    commandLine 'docker', 'push', "${dockerTag}"
}

Selenium Grid Infrastructure Setup

Our end-to-end tests use Selenium WebDriver to execute browser-related tests, and we have a scalable container-based Zalenium Selenium grid deployed in a Kubernetes cluster (you can see setup details here).

You can configure the grid URL and browser in the file included in the test container image:

# webDriver settings
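The properties file itself is truncated in the original; a minimal sketch of what such a file might contain follows (the property names and the grid URL here are illustrative assumptions, not taken from the post):

```properties
# webDriver settings (hypothetical property names)
selenium.grid.url=http://zalenium.zalenium.svc.cluster.local:4444/wd/hub
browser=chrome
```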

Deploying Test Containers in Kubernetes Cluster

Now that we have the container image built and pushed to the Google Cloud Platform, let’s deploy the container in Kubernetes. Here’s a snippet of the template for the manifest file to execute tests as a Kubernetes job:

apiVersion: batch/v1
kind: Job
metadata:
  labels:
    jobgroup: runtest
  name: runtest
  namespace: automation
spec:
  template:
    metadata:
      labels:
        jobgroup: runtest
    spec:
      containers:
      - name: testcontainer
        # (the container image reference is omitted in the original)
        command: ["./", "${TEST_SUITE}"]
        env:
        - name: SELENIUM_GRID
          value: "${SELENIUM_GRID}"
        - name: BROWSER_TYPE
          value: "${BROWSER_TYPE}"
        - name: TARGET_URL
          value: "${TARGET_URL}"
        - name: GCP_CREDENTIALS
          value: "${GCP_CREDENTIALS}"
        - name: GCP_BUCKET_NAME
          value: "${GCP_BUCKET_NAME}"
        - name: BUCKET_FOLDER
          value: "${BUCKET-FOLDER}"
      restartPolicy: Never

And, here’s the command to create the job:

kubectl apply -f ./manifest.yaml --namespace automation

Publishing Test Results

Once test execution is complete, you can upload test results to your Google Cloud Storage bucket using a code snippet similar to this:

public static void main(String... args) throws Exception {

    String gcp_credentials = readEnvVariable(GCP_CREDENTIALS);
    String gcp_bucket = readEnvVariable(GCP_BUCKET_NAME);
    String bucket_folder_name = readEnvVariable(BUCKET_FOLDER);

    // authentication on gcloud

    // define source folder and destination folder
    String source_folder = String.format("%s/%s", System.getProperty("user.dir"), TEST_RESULTS_FOLDER_NAME);
    String destination_folder = "";
    if (!bucket_folder_name.isEmpty()) {
        destination_folder = bucket_folder_name;
    } else {
        // (the timestamp construction on the next line is truncated in the original)
        String timeStamp = new
        destination_folder = TEST_RESULTS_FOLDER_NAME + "-" + timeStamp;
        System.out.println("destination folder: " + destination_folder);
    }

    // get gcp bucket for automation-data
    Bucket myBucket = null;
    if (!gcp_bucket.isEmpty()) {
        myBucket = storage.get(gcp_bucket);
    }

    // upload files
    List<File> files = new ArrayList<File>();
    GcpBucket qaBucket = new GcpBucket(myBucket, TEST_RESULTS_FOLDER_NAME);
    if (qaBucket.exists()) {
        qaBucket.createBlobFromDirectory(destination_folder, source_folder, files);
        System.out.println(files.size() + " files are uploaded to " + destination_folder);
    }
}

Next Steps

We’re looking into publishing test results to a centralized results database fronted by an API service. This will allow users to easily post test result data for test result monitoring and analytics, which I will cover in a future blog post about building a centralized test results dashboard. Until then, I hope this post has helped you put all the pieces together for executing automated end-to-end tests using Kubernetes!

from DZone Cloud Zone

Serverless Fraud Detection Using Amazon Lambda, Node.js, and Hazelcast Cloud


Recently, an interesting paper was published by UC Berkeley with their review of serverless computing and quite reasonable predictions:

“…Just as the 2009 paper identified challenges for the cloud and predicted they would be addressed and that cloud use would accelerate, we predict these issues are solvable and that serverless computing will grow to dominate the future of cloud computing.”

So, why should the industry go serverless at all? A simple answer is that we, as software engineers, are eager to be effective and want to focus on the result. Serverless computing is working towards that:

  • Your cloud provider bill gets lower since you pay only for what you use
  • You get more elastic scalability due to the more compact computation units (Functions)
  • You do not have to write much of the infrastructure code, which makes the overall system far less complex
  • In the end, it is cloud-native, i.e. things like discovery, fault tolerance, and auto-scaling come out of the box

Sounds like something worth adopting, right?

In this tutorial, we will build the complete serverless solution using Amazon Lambda, Node.js, and Hazelcast Cloud – a managed service for fast and scalable in-memory computing.

Problem Overview

Every time your credit or debit card initiates a transaction, the underlying payment system applies various checks to verify transaction validity. There are simple checks, such as verifying available funds. However, others can be far more advanced, such as looking around the history of card payments to identify a personalized pattern and validate the given transaction against it.

For this demo, we are going to use a simplified approach. We will check the validity of a limited subset of transactions: card payments performed at airports, with the responsible bank being, in our case, the Bank of Hazelcast. You can also find this problem mentioned initially by my colleague Neil Stevenson in his blog post.

So, let’s go over the entire algorithm step by step.

  1. Consider two transactions, A and B, which take place in different airports: Image title
  2. We compare two dimensions:
  • The time between transactions A and B
  • The distance between the given airports (we use their coordinates to calculate it)

By comparing them, we determine whether the person could move from one location to the other within the given time frame. More specifically, if transaction A is performed in London and transaction B in Paris, and the time between them is not less than two hours, then we consider this a valid scenario: Image title

3. The opposite example below will be identified as suspicious, because the person could not have reached Sydney within that time:

Image title
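The check described above can be sketched in a few lines of self-contained JavaScript. The coordinates and the speed threshold below are illustrative assumptions (the actual tutorial code later uses the haversine npm package):

```javascript
// Great-circle distance between two points, in meters (haversine formula)
function distanceMeters(a, b) {
    const R = 6371e3; // Earth radius in meters
    const toRad = deg => deg * Math.PI / 180;
    const dLat = toRad(b.latitude - a.latitude);
    const dLon = toRad(b.longitude - a.longitude);
    const h = Math.sin(dLat / 2) ** 2 +
        Math.cos(toRad(a.latitude)) * Math.cos(toRad(b.latitude)) * Math.sin(dLon / 2) ** 2;
    return 2 * R * Math.asin(Math.sqrt(h));
}

// A transaction is suspicious if the implied travel speed exceeds a plane's speed
function isSuspicious(prevAirport, prevTimeMillis, nextAirport, nextTimeMillis) {
    const minutes = (nextTimeMillis - prevTimeMillis) / 60000;
    const speed = distanceMeters(prevAirport, nextAirport) / minutes; // meters per minute
    return speed > 13000; // ~800 km/hr
}

// Illustrative (approximate) airport coordinates
const london = { latitude: 51.47, longitude: -0.45 };
const paris = { latitude: 49.0, longitude: 2.55 };
const sydney = { latitude: -33.95, longitude: 151.18 };

const twoHours = 2 * 60 * 60 * 1000;
console.log(isSuspicious(london, 0, paris, twoHours));  // plausible trip
console.log(isSuspicious(london, 0, sydney, twoHours)); // not plausible
```

London to Paris in two hours implies roughly 3,000 m/min, well under the threshold; London to Sydney in the same window implies over 140,000 m/min, which no airliner can do.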

Serverless Solution Design

Now, we will explore the high-level architecture of our serverless solution:

Image title

As you can see, it’s composed of the following components:

  1. Amazon S3 handles uploads of the mostly static dataset with information about airports.
  2. Lambda Function “AirportsImportFn” is triggered by the upload into S3 buckets. Once triggered, it imports the dataset into the Hazelcast Cloud cluster so that it can be queried with the lowest latency.
  3. Amazon API Gateway and Lambda Function “ValidateFn” to serve the HTTP traffic. This Lambda function written in Node.js implements the actual fraud detection logic to validate the received requests. It communicates with the Hazelcast Cloud cluster to manage its state.
  4. Hazelcast Cloud is a managed service for Hazelcast IMDG — an open-source, in-memory data grid for fast, scalable data processing. Minimal latency, auto-scaling, and developer-oriented experience — this is why we chose it for our solution. 

We call it the Complete Serverless Solution since it employs both kinds of the Serverless Components – Function-as-a-Service (FaaS) provided by Amazon Lambda and Backend-as-a-Service (BaaS), a Hazelcast IMDG cluster managed by Hazelcast Cloud. The whole stack is co-located in the same AWS region to ensure the shortest network paths across the components.

Node.js Implementation

As mentioned at the beginning, software engineers want to apply an effective solution to the problem. We must be flexible in choosing an appropriate tool while avoiding limitations on programming language, technology, framework, etc. This is how we do it in 2019, right? And this is why we chose Node.js for the Lambda function implementation, one of the modern serverless runtimes supported by the major FaaS providers.

We all know talk is cheap, so let’s explore the source code.

Lambda Function “AirportsImportFn”

const hazelcast = require('./hazelcast');
const aws = require('aws-sdk');

let sharedS3Client = null;

let getS3Client = () => {
    if (!sharedS3Client) {
        console.log('Creating S3 client...');
        sharedS3Client = new aws.S3();
    }
    return sharedS3Client;
};

exports.handle = async (event, context, callback) => {
    console.log('Got event: ' + JSON.stringify(event));
    context.callbackWaitsForEmptyEventLoop = false;
    let hazelcastClient = await hazelcast.getClient();
    let map = await hazelcastClient.getMap('airports');
    if (await map.isEmpty() && event.Records.length > 0) {
        let srcBucket = event.Records[0];
        console.log('Handling upload into bucket \'' + srcBucket + '\'...');
        let srcKey = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, " "));
        let s3Client = getS3Client();
        let object = await s3Client.getObject({Bucket: srcBucket, Key: srcKey}).promise();
        let airports = JSON.parse(object.Body);
        await map.putAll( => ([airport.code, airport])));
        console.log('Imported data about ' + airports.length + ' airports');
        return callback(null, true);
    }
    return callback(null, false);
};

So, this is what we implemented:

  • Defined a global variable to reuse the S3 client instance between the function invocations (later, we will discuss why it’s important)
  • Exported function “handle” implements the actual business logic of our Lambda function. It processes the incoming S3 event, reads the JSON contents of the uploaded object, deserializes it into an array, and then re-maps it to key-value pairs before storing in the Hazelcast map
  • Finally, we call an Amazon Lambda callback to return the result

Hazelcast Map (IMap) is a distributed hash map. Through the Node.js client, you can perform operations like reading and writing from/to a Hazelcast Map with the well-known get and put methods. For details, see the Map section in the Hazelcast IMDG Reference Manual.

Lambda Function “ValidateFn”

const hazelcast = require('./hazelcast');
const haversine = require('haversine');
const moment = require('moment');

exports.handle = async (request, context, callback) => {
    console.log('Got request: ' + JSON.stringify(request));
    context.callbackWaitsForEmptyEventLoop = false;
    let userId = request.userId;
    let requestTimestampMillis = moment(request.transactionTimestamp).utc().valueOf();
    let hazelcastClient = await hazelcast.getClient();
    let airports = await hazelcastClient.getMap('airports');
    if (await airports.isEmpty()) {
        return callback('Airports data is not initialized', null);
    }
    let users = await hazelcastClient.getMap('users');
    let user = await users.get(userId);
    if (!user) {
        await users.set(userId, {
            userId: userId,
            lastCardUsePlace: request.airportCode,
            lastCardUseTimestamp: requestTimestampMillis
        });
        return callback(null, {valid: true, message: 'User data saved for future validations'});
    }
    let [lastAirport, nextAirport] = await Promise.all([airports.get(user.lastCardUsePlace),
        airports.get(request.airportCode)]);
    if (lastAirport.code === nextAirport.code) {
        return callback(null, {valid: true, message: 'Transaction performed from the same location'});
    }
    let speed = getSpeed(lastAirport, user.lastCardUseTimestamp, nextAirport, request.transactionTimestamp);
    let valid = speed <= 13000; // 800 km/hr == ~13000 m/min
    let message = valid ? 'Transaction is OK' : 'Transaction is suspicious';
    // Update user data
    user.lastCardUsePlace = request.airportCode;
    user.lastCardUseTimestamp = requestTimestampMillis;
    await users.set(userId, user);
    return callback(null, {valid: valid, message: message});
};

let getSpeed = (lastAirport, lastUseTimestamp, nextAirport, requestTimestamp) => {
    // Time
    let minutes = moment(requestTimestamp).diff(lastUseTimestamp, 'minutes');
    // Distance
    let meters = haversine(nextAirport, lastAirport, {unit: 'meter'});
    // Speed
    return meters / minutes;
};

Let’s look at what we need to do step by step:

  1. First, we set this:
    context.callbackWaitsForEmptyEventLoop = false;

This is the Amazon Lambda-specific setting to prevent waiting until the Node.js runtime event loop is empty. Here, you can find more info about the given setting.

2. Then our validation logic checks whether we already have any data for a user associated with the incoming request. If it’s a new user, we save it for the future validations and return a corresponding result:

{valid: true, message: 'User data saved for future validations'}

3. If there is available data about the previous transaction, we proceed by retrieving info about the current and prior airports:

let [lastAirport, nextAirport] = await Promise.all([airports.get(user.lastCardUsePlace),
    airports.get(request.airportCode)]);

And skip the validation if the airports are the same:

{valid: true, message: 'Transaction performed from the same location'}

4. After that, we use the haversine formula to calculate a “user speed” between two transactions. If it’s bigger than an average plane’s speed, we conclude that the transaction is suspicious: 

let valid = speed <= 13000; // 800 km/hr == ~13000 m/min
let message = valid ? 'Transaction is OK' : 'Transaction is suspicious';

At the end of our algorithm, we store the data from the request for the future validations:

await users.set(userId, user);

Hazelcast Client Module

To better organize our codebase, we’ve extracted the Node.js Hazelcast Client setup into a separate module within hazelcast.js:

const Client = require('hazelcast-client').Client;
const ClientConfig = require('hazelcast-client').Config.ClientConfig;

let sharedHazelcastClient = null;

let createClientConfig = () => {
    let cfg = new ClientConfig(); = process.env.GROUP;
    cfg.groupConfig.password = process.env.PASSWORD;
    cfg.networkConfig.cloudConfig.enabled = true;
    cfg.networkConfig.cloudConfig.discoveryToken = process.env.DISCOVERY_TOKEN;['hazelcast.client.statistics.enabled'] = true;['hazelcast.client.statistics.period.seconds'] = 1;['hazelcast.client.heartbeat.timeout'] = 3000000;
    return cfg;
};

module.exports.getClient = async () => {
    if (!sharedHazelcastClient) {
        console.log('Creating Hazelcast client...');
        sharedHazelcastClient = await Client.newHazelcastClient(createClientConfig());
    }
    return sharedHazelcastClient;
};

The main idea here is to re-use the Hazelcast Client instance between invocations. This is an optimization that is still available even though you deal with the ephemeral and stateless Function containers. Learn more about the Amazon Lambda function lifecycle in this corresponding article from AWS Compute Blog. Also, something worth mentioning is that Hazelcast settings are configured via environment variables — this is one of the suggested ways to configure the Lambda Function instance. Later, we will see how to set them.


And the final spurt — we need to deploy the whole solution to your AWS account. Here are the steps:

1. We will start by creating our Lambda functions and the first candidate will be “ImportAirportsFn” — the function triggered by upload into the S3 bucket. A minimal deployment package for Amazon Lambda Node.js environment consists of the actual JS file with function handler and the required libraries for your code. Both should be zipped and passed as an argument to AWS CLI create-function command:

$ zip -r import.js hazelcast.js node_modules
$ aws lambda create-function --function-name ImportAirportsFn --role lambda_role_arn --zip-file fileb:// --handler import.handle --description "Imports Airports from S3 into Hazelcast Cloud" --runtime nodejs8.10 --region us-east-1 --timeout 30 --memory-size 256 --publish --profile aws_profile

Let’s quickly review the arguments:

  •  ImportAirportsFn – a logical function name
  •  lambda_role_arn – beforehand, you should create a Lambda Execution Role with the given policy attached; it gives basic permission to upload the logs and to read data from S3
  •  aws_profile – here, we use AWS Profile-based access to work with AWS CLI
  •  us-east-1 – the region that we’re going to use for our deployment
  •  import.handle – this is for Amazon Lambda to lookup the function handler; it’s set in the format  js_file_name.exported_function_name
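The js_file_name.exported_function_name convention can be sketched as follows. This is only an illustration of how the handler string maps onto exported code, not Lambda's actual internals:

```javascript
// Suppose import.js exports "handle" (modeled here as a plain object)
const loadedModules = {
    'import': { handle: async (event) => ({ imported: true }) }
};

// Resolve a handler string like "import.handle" into a callable function
function resolveHandler(handlerString) {
    const [fileName, fnName] = handlerString.split('.');
    return loadedModules[fileName][fnName];
}

(async () => {
    const handler = resolveHandler('import.handle');
    const result = await handler({});
    console.log(result.imported); // true
})();
```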

After this, we should run just two more commands to have our function triggered by uploads into the S3 bucket:

$ aws lambda add-permission --function-name ImportAirportsFn --action lambda:InvokeFunction --principal --source-arn s3_bucket_arn --statement-id ncherkas-js-demo-lambda-permission --region us-east-1 --profile aws_profile

Here, s3_bucket_arn is the ARN of the S3 bucket; it has a simple format like “arn:aws:s3:::your_bucket_name”. And “ncherkas-js-demo-lambda-permission” is a logical ID for the permission instance.

$ aws s3api put-bucket-notification-configuration --bucket your_bucket_name --notification-configuration file://s3_notification.json --profile aws_profile --region us-east-1

For this last one, you can use a sample notification config here. Just replace “LambdaFunctionArn” inside of it with a valid ARN of the Lambda Function.

Please note that the S3 bucket should be created beforehand. I’d suggest placing it in the same AWS region i.e. us-east-1.

2. Now, let’s set up the Hazelcast Cloud cluster so that we can start using it for the low-latency state management required by our Lambda Functions. Go and create your account by clicking “Launch For Free.” Once you’ve entered a console, proceed by creating a new Hazelcast cluster:

Image title

Before pressing “+ Create Cluster,” click the “+Add” button next to the “Custom Map Configuration” setting to specify an expiration time for the user data we’re going to store. Since we don’t need to keep the user data for more than a few days, this is an excellent chance to optimize our storage!

Image title

After this, press “+Create Cluster,” and let’s have a brief overview of the options we specified for our Hazelcast Cluster:

  • Cloud Provider – a cloud provider that you want to use, which is AWS in the case of our tutorial
  • Region – AWS region that you’re going to use; let’s set the same region that we use for our Lambda Functions, i.e. us-east-1.
  • Cluster Type – a bunch of options are available here; let’s leverage the $50 trial credit and go with the Small type
  • Enable Auto-Scaling – when it’s ON, our cluster will automatically increase its capacity as the amount of stored data increases
  • Enable Persistence – backed with the Hazelcast Hot Restart Persistence; this avoids losing our data between the Hazelcast Cluster restarts
  • Enable Encryption – this is what you’d like to enable if you process the real payment transactions. Under the hood, it leverages the Hazelcast Security Suite
  • Enable IP Whitelist – this is, again, what you’d want to switch ON when you set up a production environment

Now that we have our Hazelcast Cluster up and running, what’s next? How do we connect to it? This is quite easy to do — go and click “Configure Client” to see what’s available:

Image title

Here, you have two options. You can download a ready code sample that is working with your cluster. Or, as we will do in the case of Lambda Function, you can configure your client manually. As you remember from the source code, we configure our client using the environment variables, but how do they propagate to our Lambda Function? Here is a command that we need to run:

$ aws lambda update-function-configuration --function-name ImportAirportsFn --environment Variables="{GROUP=<cluster_group_name>,PASSWORD=<password>,DISCOVERY_TOKEN=<token>}" --profile aws_profile --region us-east-1

In this command, we take  cluster_group_name,  password, and  token from the “Configure Client” dialog that we were exploring above.

Now, let’s have some fun and test our Lambda Function “ImportAirportsFn”. To do this, upload the given sample file with the airports data into the S3 bucket that we created earlier in this tutorial. Once complete, go to the Amazon CloudWatch Console, click “Logs,” and open the stream called “/aws/lambda/ImportAirportsFn”. Inside the stream, the most recent logs appear at the top. Let’s explore them:

Image title

As you can see, our dataset with the airports data was successfully copied into the corresponding Hazelcast map. Now we can query it to validate payment transactions; this is what we want to have in the end: serverless fraud detection. Let’s move on and complete the last deployment step.

3. Creating Lambda Function “ValidateFn” and setting up the Amazon API Gateway.

Run the AWS CLI command to create a new function:

$ zip -r validate.js hazelcast.js node_modules
$ aws lambda create-function --function-name ValidateFn --role lambda_role_arn --zip-file fileb:// --handler validate.handle --description "Validates User Transactions" --runtime nodejs8.10 --environment Variables="{GROUP=<cluster_group_name>,PASSWORD=<password>,DISCOVERY_TOKEN=<token>}" --region us-east-1 --timeout 30 --memory-size 256 --publish --profile aws_profile

As you can see, we establish the Hazelcast Cluster settings right away since our cluster is already up and running.

Before proceeding with the API gateway, we can quickly test our function within the AWS console:

1) Click “Select a test event” and configure a test JSON payload:

Image title

2) Click “Test” and explore the results:

Image title

Congrats! You’re now very close to getting the complete solution up and running!

And the last, yet crucial step — setting up the API gateway endpoint.

Go to Amazon API gateway and create a new API:

Image title

Then, create a new API Resource:

Image title

And add a POST mapping for our API:

Image title

Choose OK when prompted with Add Permission to Lambda Function.

After this, we need to configure a request body mapping. Go to “Models” under the newly created API and create a new model:

Image title

Then click on POST method and press “Method Request” to apply this mapping model:

Image title

We are now ready to deploy and test the Validate API. To do this, go to Actions – Deploy API and create a new stage:

Image title

After it’s deployed, you will get the Invoke URL that exposes our newly created API. Let’s use this URL and do final testing (here, I use HTTPie, which is a good alternative to cURL):

$ http POST <invoke_url>/validate userId:=12345 airportCode=FRA transactionTimestamp=2019-03-18T17:55:40Z

{
    "message": "Transaction performed from the same location",
    "valid": true
}

Now, let’s try to emulate a suspicious transaction. The previous request was performed at Mon Mar 18 2019, 13:51:40 in Frankfurt, so let’s send another request for the transaction performed in New York, a few minutes later:

$ http POST <invoke_url>/validate userId:=12345 airportCode=EWR transactionTimestamp=2019-03-18T18:02:10Z

{
    "message": "Transaction is suspicious",
    "valid": false
}

We get “valid”: false since it’s unrealistic for the person to move from Frankfurt to New York in that short period. And one more example, a valid transaction:

$ http POST <invoke_url>/validate userId:=12345 airportCode=LCY transactionTimestamp=2019-03-19T02:20:30Z

{
    "message": "Transaction is OK",
    "valid": true
}

Here, we have “valid”: true because the given transaction was performed from London later at night, which seems reasonable. 

Also, it makes sense to inspect the performance of our solution. Below, you can see a graph of the average latency after running a simple 10-minute test, continuously sending validate requests to the API Gateway URL:

Image title

As you can see, the average duration of the ValidateFn function was mostly under 10 ms during this test. This confirms the choice we made to solve our problem with an in-memory computing solution. Note that in the case of Amazon Lambda, the lower your latency, the less you pay for usage, so an in-memory computing solution also helps reduce processing costs.

Impressive! At this point, Serverless Fraud Detection is up and running! Thanks for joining me on this journey!

Now, for a summary…


This tutorial has shown that serverless computing is not just hype but is something worth learning and adopting. With reasonable efforts, you get a scalable, fault-tolerant, and performant solution with in-memory computing enhancing all of these attributes. Check out our community site to learn about more cool things you can do with the Hazelcast Node.js Client and Hazelcast Cloud.

from DZone Cloud Zone

Canary deployments with Consul Service Mesh


This is the fourth post of the blog series highlighting new features in Consul service mesh.

Last month at HashiConf EU we announced Consul 1.6.0. This release delivers a set of new Layer 7 traffic management capabilities including L7 traffic splitting, which enables canary service deployments.

This blog post will walk you through the steps necessary to split traffic between two upstream services.

Canary Deployments

A Canary deployment is a technique for deploying a new version of a service, while avoiding downtime. During a canary deployment you shift a small percentage of traffic to a new version of a service while monitoring its behavior. Initially you send the smallest amount of traffic possible to the new service while still generating meaningful performance data. As you gain confidence in the new version you slowly increase the proportion of traffic it handles. Eventually, the canary version handles 100% of all traffic, at which point the old version can be completely deprecated and then removed from the environment.

The new version of the service is called the canary version, as a reference to the “canary in a coal mine”.

To determine the correct function of the new service, you must have observability into your application. Metrics and tracing data will allow you to determine that the new version is working as expected and not throwing errors. In contrast to Blue/Green deployments, which involve transitioning to a new version of a service in a single step, Canary deployments take a more gradual approach, which helps you guard against service errors that only manifest themselves with a particular load.


The steps in this guide use Consul’s service mesh feature, Consul Connect. If you aren’t already familiar with it you can learn more by following this guide.

We created a demo environment for the steps we describe here. The environment relies on Docker and Docker Compose. If you do not already have Docker and Docker Compose, you can install them from docker’s install page.


The demo architecture you’ll use consists of three services (a public Web service and two versions of the API service) plus a Consul server. The services make up a two-tier application; the Web service accepts incoming traffic and makes an upstream call to the API service. You’ll imagine that version 1 of the API service is already running in production and handling traffic, and that version 2 contains some changes you want to ship in a canary deployment.
Consul Traffic splitting

To deploy version 2 of your API service, you will:
1. Start an instance of the v2 API service in your production environment.
2. Set up a traffic split to make sure v2 doesn’t receive any traffic at first.
3. Register v2 so that Consul can send traffic to it.
4. Slowly shift traffic to v2 and away from v1 until the new version is handling all of the traffic.

Starting the Demo Environment

First, clone the repo containing the source and examples for this blog post.

$ git clone git@github.com:hashicorp/consul-demo-traffic-splitting.git

Change directories into the cloned folder, and start the demo environment with docker-compose up. This command will run in the foreground, so you’ll need to open a new terminal window after you run it.

$ docker-compose up

Creating consul-demo-traffic-splitting_api_v1_1 … done
Creating consul-demo-traffic-splitting_consul_1 … done
Creating consul-demo-traffic-splitting_web_1 … done
Creating consul-demo-traffic-splitting_web_envoy_1 … done
Creating consul-demo-traffic-splitting_api_proxy_v1_1 … done
Attaching to consul-demo-traffic-splitting_consul_1, consul-demo-traffic-splitting_web_1, consul-demo-traffic-splitting_api_v1_1, consul-demo-traffic-splitting_web_envoy_1, consul-demo-traffic-splitting_api_proxy_v1_1

The following services will automatically start in your local Docker environment and register with Consul:

  • Consul Server
  • Web service with Envoy sidecar
  • API service version 1 with Envoy sidecar

You can see Consul’s configuration in the consul_config folder, and the service definitions in the service_config folder.

Once everything is up and running, you can view the health of the registered services by looking at the Consul UI at http://localhost:8500. All services should be passing their health checks.

Curl the Web endpoint to make sure that the whole application is running. You will see that the Web service gets a response from version 1 of the API service.

$ curl localhost:9090
Hello World

Upstream Data: localhost:9091

Service V1

Initially, you will want to deploy version 2 of the API service to production without sending any traffic to it, to make sure that it performs well in a new environment. To prevent traffic from flowing to version 2 when you register it, you will preemptively set up a traffic split to send 100% of your traffic to version 1 of the API service, and 0% to the not-yet-deployed version 2. Splitting the traffic makes use of the new Layer 7 features built into Consul service mesh.

Configuring Traffic Splitting

Traffic splitting uses configuration entries (introduced in Consul 1.5 and 1.6) to centrally configure the services and Envoy proxies. There are three configuration entries you need to create to enable traffic splitting:

  • Service Defaults for the API service to set the protocol to HTTP.
  • Service Splitter which defines the traffic split between the service subsets.
  • Service Resolver which defines which service instances are version 1 and 2.

Configuring Service Defaults

Traffic splitting requires that the upstream application uses HTTP, because splitting happens on layer 7 (on a request by request basis). You will tell Consul that your upstream service uses HTTP by setting the protocol in a “service defaults” configuration entry for the API service. This configuration is already in your demo environment at l7_config/api_service_defaults.json. It looks like this.

{
  "kind": "service-defaults",
  "name": "api",
  "protocol": "http"
}

The kind field denotes the type of configuration entry which you are defining; for this example, service-defaults. The name field defines which service the service-defaults configuration entry applies to. (The value of this field must match the name of a service registered in Consul, in this example, api.) The protocol is http.

To apply the configuration, you can either use the Consul CLI or the API. In this example we’ll use the configuration entry endpoint of the HTTP API, which is available at http://localhost:8500/v1/config. To apply the config, use a PUT operation in the following command.

$ curl localhost:8500/v1/config -XPUT -d @l7_config/api_service_defaults.json
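If you would rather apply configuration entries from code than from curl, the same PUT can be sketched with Python's standard library. The `build_config_put` helper below is hypothetical (not part of any Consul client); it only constructs the request, and the final `urlopen` call is left commented out because it needs a running Consul agent.

```python
import json
import urllib.request

def build_config_put(entry, consul_addr="http://localhost:8500"):
    # Mirrors `curl -XPUT -d @file localhost:8500/v1/config`: serialize the
    # configuration entry and target Consul's config endpoint with a PUT.
    body = json.dumps(entry).encode("utf-8")
    return urllib.request.Request(
        f"{consul_addr}/v1/config",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

defaults = {"kind": "service-defaults", "name": "api", "protocol": "http"}
req = build_config_put(defaults)
# urllib.request.urlopen(req)  # uncomment with a Consul agent running
```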

For more information on service-defaults configuration entries, see the documentation.

Configuring the Service Resolver

The next configuration entry you need to add is the Service Resolver, which allows you to define how Consul’s service discovery selects service instances for a given service name.

Service Resolvers allow you to filter for subsets of services based on information in the service registration. In this example, we are going to define the subsets “v1” and “v2” for the API service, based on its registered metadata. API service version 1 in the demo is already registered with the tags v1 and service metadata version:1. When you register version 2 you will give it the tag v2 and the metadata version:2. The name field is set to the name of the service in the Consul service catalog.
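To make the subset definitions concrete, here is an illustrative Python sketch of what a resolver's metadata filter does. This mimics the effect of `Service.Meta.version == 1` in plain Python; it is not Consul's actual filter engine, and the instance records are made up for the example.

```python
# Hypothetical instance records, shaped like registrations in the demo:
# version 1 carries the tag v1 and metadata version:1, and so on.
instances = [
    {"id": "api-1", "tags": ["v1"], "meta": {"version": "1"}},
    {"id": "api-2", "tags": ["v2"], "meta": {"version": "2"}},
]

def subset(instances, version):
    # The spirit of the resolver filter "Service.Meta.version == <version>":
    # keep only instances whose registered metadata matches.
    return [i for i in instances if i["meta"].get("version") == str(version)]

v1_instances = subset(instances, 1)
v2_instances = subset(instances, 2)
```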

The service resolver is already in your demo environment at l7_config/api_service_resolver.json and it looks like this.

{
  "kind": "service-resolver",
  "name": "api",
  "subsets": {
    "v1": {
      "filter": "Service.Meta.version == 1"
    },
    "v2": {
      "filter": "Service.Meta.version == 2"
    }
  }
}

Apply the service resolver configuration entry using the same method you used in the previous example.

$ curl localhost:8500/v1/config -XPUT -d @l7_config/api_service_resolver.json

For more information about service resolvers see the documentation.

Configure Service Splitting – 100% of traffic to Version 1

Next, you’ll create a configuration entry that will split percentages of traffic to the subsets of your upstream service that you just defined. Initially, you want the splitter to send all traffic to v1 of your upstream service, which prevents any traffic from being sent to v2 when you register it. In a production scenario, this would give you time to make sure that v2 of your service is up and running as expected before sending it any real traffic.

The configuration entry for service splitting is of kind service-splitter. Its name specifies which service the splitter will act on. The splits field takes an array that defines the different splits; in this example there are only two, but it is possible to configure more complex scenarios. Each split has a weight that defines the percentage of traffic to distribute to that service subset. The weights of all splits must total 100. For the initial split, you will configure all traffic to be directed to the service subset v1.
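The total-weight rule can be checked client-side before a splitter is applied. The `validate_splitter` helper below is a hypothetical sketch, not part of Consul's tooling; it simply reproduces the "weights must total 100" constraint.

```python
def validate_splitter(entry):
    # Consul rejects a service-splitter whose weights don't total 100;
    # this client-side check reproduces that rule before you apply it.
    total = sum(s["weight"] for s in entry["splits"])
    if total != 100:
        raise ValueError(f"split weights must total 100, got {total}")

splitter = {
    "kind": "service-splitter",
    "name": "api",
    "splits": [
        {"weight": 100, "service_subset": "v1"},
        {"weight": 0, "service_subset": "v2"},
    ],
}
validate_splitter(splitter)  # passes: 100 + 0 == 100
```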

The service splitter configuration already exists in your demo environment at l7_config/api_service_splitter_100_0.json and looks like this.

{
  "kind": "service-splitter",
  "name": "api",
  "splits": [
    {
      "weight": 100,
      "service_subset": "v1"
    },
    {
      "weight": 0,
      "service_subset": "v2"
    }
  ]
}
Apply this configuration entry by issuing another PUT request to the Consul’s configuration entry endpoint of the HTTP API.

$ curl localhost:8500/v1/config -XPUT -d @l7_config/api_service_splitter_100_0.json

This scenario is the first stage in our Canary deployment; you can now launch the new version of your service without it immediately being used by the upstream load balancing group.

Start and Register API Service Version 2

Next you’ll start the canary version of the API service (version 2), and register it with the settings that you used in the configuration entries for resolution and splitting. Start the service, register it, and start its connect sidecar with the following command. This command will run in the foreground, so you’ll need to open a new terminal window after you run it.

$ docker-compose -f docker-compose-v2.yml up

Check that the service and its proxy have registered by looking for new v2 tags next to the API service and API sidecar proxy in the Consul UI.

Configure Service Splitting – 50% Version 1, 50% Version 2

Now that version 2 is running and registered, the next step is to gradually increase traffic to it by changing the weight of the v2 service subset in the service splitter configuration. Increase the weight of the v2 subset to 50%. Remember, the total weight of all splits must equal 100, so you will also reduce the weight of the v1 subset to 50. The configuration file is already in your demo environment at l7_config/api_service_splitter_50_50.json and it looks like this.

{
  "kind": "service-splitter",
  "name": "api",
  "splits": [
    {
      "weight": 50,
      "service_subset": "v1"
    },
    {
      "weight": 50,
      "service_subset": "v2"
    }
  ]
}
Apply the configuration as before.

$ curl localhost:8500/v1/config -XPUT -d @l7_config/api_service_splitter_50_50.json

Now that you’ve increased the percentage of traffic to v2, curl the web service again. You will see traffic equally distributed across both of the service subsets.

$ curl localhost:9090
Hello World

Upstream Data: localhost:9091

Service V1%
$ curl localhost:9090
Hello World

Upstream Data: localhost:9091

Service V2%
$ curl localhost:9090
Hello World

Upstream Data: localhost:9091

Service V1%
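The roughly even alternation above can be reproduced with a quick simulation. This is an illustration of weighted random selection with a fixed seed, not Envoy's actual load-balancing algorithm.

```python
import random

random.seed(42)  # fixed seed so the simulation is repeatable

# Simulate 1,000 requests through a 50/50 service-splitter: each request
# picks a subset with probability proportional to its weight.
weights = {"v1": 50, "v2": 50}
picks = random.choices(list(weights), weights=list(weights.values()), k=1000)
share_v2 = picks.count("v2") / len(picks)
print(f"share routed to v2: {share_v2:.1%}")  # close to 50%
```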

If you were actually performing a canary deployment you would want to choose a much smaller percentage for your initial split: the smallest possible percentage that would give you reliable data on service performance. You would then slowly increase the percentage by iterating over this step as you gained confidence in version 2 of your service. Some companies may eventually choose to automate the ramp up based on preset performance thresholds.
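One way to structure such a ramp-up is to precompute the splitter entries for each stage and apply them one at a time as confidence grows. The percentages below are an arbitrary example schedule, not a Consul recommendation, and `ramp_steps` is a hypothetical helper.

```python
def ramp_steps(service="api", percentages=(5, 10, 25, 50, 100)):
    # Each step is a complete service-splitter entry shifting more traffic
    # to v2; apply each with a PUT to /v1/config, observe, then continue.
    return [
        {
            "kind": "service-splitter",
            "name": service,
            "splits": [
                {"weight": 100 - p, "service_subset": "v1"},
                {"weight": p, "service_subset": "v2"},
            ],
        }
        for p in percentages
    ]

steps = ramp_steps()
```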

Configure Service Splitting – 100% Version 2

Once you are confident that the new version of the service is operating correctly, you can send 100% of traffic to the version 2 subset. The configuration for a 100% split to version 2 looks like this.

{
  "kind": "service-splitter",
  "name": "api",
  "splits": [
    {
      "weight": 0,
      "service_subset": "v1"
    },
    {
      "weight": 100,
      "service_subset": "v2"
    }
  ]
}

Apply it with a call to the HTTP API config endpoint as you did before.

$ curl localhost:8500/v1/config -XPUT -d @l7_config/api_service_splitter_0_100.json

Now when you curl the web service again, 100% of traffic is sent to the version 2 subset.

$ curl localhost:9090
Hello World

Upstream Data: localhost:9091

Service V2%
$ curl localhost:9090
Hello World

Upstream Data: localhost:9091

Service V2%
$ curl localhost:9090
Hello World

Upstream Data: localhost:9091

Service V2%

Typically in a production environment, you would now remove the version 1 service to release capacity in your cluster. Congratulations, you’ve now completed the deployment of version 2 of your service.

Clean up

To stop and remove the containers and networks that you created, run docker-compose down twice: once for each of the docker-compose commands you ran. Because the containers you created in the second compose command run on the network you created in the first command, you will need to bring down the environments in the opposite order from the one in which you created them.

First you’ll stop and remove the containers created for v2 of the API service.

$ docker-compose -f docker-compose-v2.yml down
Stopping consul-demo-traffic-splitting_api_proxy_v2_1 ... done
Stopping consul-demo-traffic-splitting_api_v2_1 ... done
WARNING: Found orphan containers (consul-demo-traffic-splitting_api_proxy_v1_1, consul-demo-traffic-splitting_web_envoy_1, consul-demo-traffic-splitting_consul_1, consul-demo-traffic-splitting_web_1, consul-demo-traffic-splitting_api_v1_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Removing consul-demo-traffic-splitting_api_proxy_v2_1 ... done
Removing consul-demo-traffic-splitting_api_v2_1 ... done
Network consul-demo-traffic-splitting_vpcbr is external, skipping

Then, you’ll stop and remove the containers and the network that you created in the first docker compose command.

$ docker-compose down
Stopping consul-demo-traffic-splitting_api_proxy_v1_1 ... done
Stopping consul-demo-traffic-splitting_web_envoy_1 ... done
Stopping consul-demo-traffic-splitting_consul_1 ... done
Stopping consul-demo-traffic-splitting_web_1 ... done
Stopping consul-demo-traffic-splitting_api_v1_1 ... done
Removing consul-demo-traffic-splitting_api_proxy_v1_1 ... done
Removing consul-demo-traffic-splitting_web_envoy_1 ... done
Removing consul-demo-traffic-splitting_consul_1 ... done
Removing consul-demo-traffic-splitting_web_1 ... done
Removing consul-demo-traffic-splitting_api_v1_1 ... done
Removing network consul-demo-traffic-splitting_vpcbr


In this blog, we walked you through the steps required to perform Canary deployments using traffic splitting and resolution. For more in-depth information on Canary deployments, Danilo Sato has written an excellent article on Martin Fowler's website.

The advanced L7 traffic management in 1.6.0 is not limited to splitting. It also includes HTTP-based routing and new settings for service resolution. In combination, these features enable sophisticated traffic routing and service failover. All the new L7 traffic management settings can be found in the documentation. If you’d like to go further, combine it with our guide on L7 Observability to implement some of the monitoring needed for new service deployments.

Please keep in mind that Consul 1.6 RC isn’t suited for production deployments. We’d appreciate any feedback or bug reports you have in our GitHub issues, and you’re welcome to ask questions in our new community forum.

from Hashicorp Blog

AWS Enhances, Expands Reach of IoT Device Defender


Amazon Web Services Inc. (AWS) has beefed up the functionality and expanded the reach of its AWS IoT Device Defender security service for the Internet of Things space.

Launched a year ago, AWS IoT Device Defender is a fully managed service that helps organizations secure their fleets of IoT devices by continually auditing IoT configurations to ensure that they aren’t deviating from security best practices. IoT configurations are organization-specified technical controls that help keep information secure when devices are communicating with one another and with the cloud.

AWS last week announced that IoT Device Defender’s reach has expanded to two additional regions: EU (Paris) and EU (Stockholm). That brings the total number of AWS regions in which IoT Device Defender is available to 15.

The regional expansion came just two days after a functionality enhancement.

“AWS IoT Device Defender now supports the ability for customers to apply mitigation actions to audit findings,” AWS said. “This feature enables customers to use predefined mitigation actions or customize them and apply them at scale. With this release, customers can choose from the following set of predefined mitigation actions to automate a response to findings from an audit: add things to thing group, enable IoT logging, publish to SNS topic, replace default policy version, update CA certificate, and update device certificate. You can use mitigation actions by using the AWS IoT Console, AWS Command Line Interface (CLI) or APIs.”

The cloud giant has been busy updating IoT Device Defender this year after last year’s general availability debut, having added statistical anomaly detection and data visualization in February and incorporating support for monitoring the behavior of unregistered devices in May.

About the Author

David Ramel is the editor of Visual Studio Magazine.

from News