
Authenticate applications through facial recognition with Amazon Cognito and Amazon Rekognition

With the increased use of applications, social networks, financial platforms, email, and cloud storage solutions, managing different passwords and credentials can become a burden. In many cases, sharing one password across all these applications and platforms is simply not possible, because each may impose different security standards, such as numeric-only passwords, password renewal policies, or security questions.

But what if you could let users authenticate in a more convenient, simpler, and above all more secure way? In this post, I show how to use Amazon Cognito user pools to customize your authentication flows and let users log in to your applications with facial recognition powered by Amazon Rekognition, using a sample application.

Solution Overview

We will build a mobile or web application that lets users sign in with an email address and requires them to upload a document containing their photo. We will use the AWS Amplify Framework to integrate the front-end application with Amazon S3 and store this image in a secure, encrypted bucket. The solution triggers a Lambda function for each new image uploaded to this bucket so that we can index the image in Amazon Rekognition and save its metadata in a DynamoDB table for later queries.

For authentication, this solution uses Amazon Cognito User Pools combined with Lambda functions to customize the authentication flows together with the Amazon Rekognition CompareFaces API to identify the confidence level between user photos provided during Sign Up and Sign In. Here is the architecture of the solution:

Here’s a step-by-step description of the data-flow architecture diagram above:

  1. User signs up into the Cognito User Pool.
  2. User uploads – during Sign Up – a document image containing his/her photo and name, to an S3 Bucket (e.g. Passport).
  3. A Lambda function is triggered containing the uploaded image as payload.
  4. The function first indexes the image in a specific Amazon Rekognition Collection to store these user documents.
  5. The same function then persists the indexed image metadata in a DynamoDB table, together with the email registered in the Amazon Cognito User Pool, for later queries.
  6. User enters an email in the custom Sign In page, which makes a request to Cognito User Pool.
  7. Amazon Cognito User Pool triggers the “Define Auth Challenge” trigger that determines which custom challenges are to be created at this moment.
  8. The User Pool then invokes the “Create Auth Challenge” trigger. This trigger queries the DynamoDB table for a record matching the given email to retrieve the ID of the user's indexed photo in the Amazon Rekognition Collection.
  9. The User Pool invokes the “Verify Auth Challenge” trigger. This trigger verifies whether the challenge was successfully completed; if it finds an image, it compares it with the photo taken during Sign In and measures the confidence level between the two images.
  10. The User Pool invokes the “Define Auth Challenge” trigger once more, which verifies whether the challenge was answered. If it can verify the user-supplied answer, no further challenges are created. The trigger response back to the User Pool includes an “issueTokens: true” attribute, and the User Pool finally issues the user a JSON Web Token (JWT) (see step 6).

Serverless Application and the different Lambdas invoked

The following solution is available as a Serverless application. You can deploy it directly from AWS Serverless Application Repository. Core parts of this implementation are:

  • Users are required to use a valid email as user name.
  • The solution includes a Cognito App Client configured to “Only allow custom authentication”. Because Amazon Cognito requires a password for user sign up, we generate a random password for these users, since we don't want them to sign in with a password later.
  • We use two Amazon S3 Buckets: one to store document images uploaded during Sign Up and one to store user photos taken when Signing In for face comparisons.
  • We use two different Lambda runtimes (Python and Node.js) to demonstrate how the AWS Serverless Application Model (SAM) handles multiple runtimes in the same project and development environment from the developer's perspective.

The following Lambda functions are triggered to implement the image indexing in Amazon Rekognition and to customize the Amazon Cognito User Pool custom authentication challenges:

  1. Create Rekognition Collection (Python 3.6) – This Lambda function gets triggered only once, at the beginning of deployment, to create a Custom Collection in Amazon Rekognition to index documents for user Sign Ups.
  2. Index Images (Python 3.6) – This Lambda function gets triggered for each new document upload to Amazon S3 during Sign Up and indexes the uploaded document in the Amazon Rekognition Collection (mentioned in the previous step) and then persists its metadata into DynamoDB.
  3. Define Auth Challenge (Node.js 8.10) – This Lambda function tracks the custom authentication flow, which is comparable to a decider function in a state machine. It determines which challenges are presented, in what order, to the user. At the end, it reports back to the user pool if the user succeeded or failed authentication. The Lambda function is invoked at the start of the custom authentication flow and also after each completion of the “Verify Auth Challenge Response” trigger.
  4. Create Auth Challenge (Node.js 8.10) – This Lambda function gets invoked, based on the instruction of the “Define Auth Challenge” trigger, to create a unique challenge for the user. We will use this function to query DynamoDB for an existing user record and check whether its metadata is valid.
  5. Verify Auth Challenge Response (Node.js 8.10) – This Lambda function gets invoked by the user pool when the user provides the answer to the challenge. Its only job is to determine if that answer is correct. In this case, it compares the images provided during Sign Up and Sign In using the Amazon Rekognition CompareFaces API and considers an API response with a confidence level equal to or greater than 90% as a valid challenge response.

In the sections below, let’s step through the code for the different Lambda functions we described above.

1. Create an Amazon Rekognition Collection

As described above, this function creates a Collection in Amazon Rekognition that will later receive user photos uploaded during Sign Up.

import boto3
import os

def handler(event, context):

    maxResults=1
    collectionId=os.environ['COLLECTION_NAME']
    
    client=boto3.client('rekognition')

    #Create a collection
    print('Creating collection:' + collectionId)
    response=client.create_collection(CollectionId=collectionId)
    print('Collection ARN: ' + response['CollectionArn'])
    print('Status code: ' + str(response['StatusCode']))
    print('Done...')
    return response

2. Index Images into Amazon Rekognition

This function receives the images users upload during sign up, indexes them in the Amazon Rekognition Collection created by the Lambda function above, and persists their metadata in an Amazon DynamoDB table.

from __future__ import print_function
import boto3
from decimal import Decimal
import json
import urllib
import os

dynamodb = boto3.client('dynamodb')
s3 = boto3.client('s3')
rekognition = boto3.client('rekognition')

# --------------- Helper Functions ------------------

def index_faces(bucket, key):

    response = rekognition.index_faces(
        Image={"S3Object":
            {"Bucket": bucket,
            "Name": key}},
            CollectionId=os.environ['COLLECTION_NAME'])
    return response
    
def update_index(tableName,faceId, fullName):
    response = dynamodb.put_item(
        TableName=tableName,
        Item={
            'RekognitionId': {'S': faceId},
            'FullName': {'S': fullName}
            }
        ) 
    
# --------------- Main handler ------------------

def handler(event, context):

    # Get the object from the event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(
        event['Records'][0]['s3']['object']['key'])

    try:

        # Calls Amazon Rekognition IndexFaces API to detect faces in S3 object 
        # to index faces into specified collection
        
        response = index_faces(bucket, key)
        
        # Commit faceId and full name object metadata to DynamoDB
        
        if response['ResponseMetadata']['HTTPStatusCode'] == 200:
            faceId = response['FaceRecords'][0]['Face']['FaceId']
            ret = s3.head_object(Bucket=bucket,Key=key)
            email = ret['Metadata']['email']
            update_index(os.environ['COLLECTION_NAME'],faceId, email) 
        return response
    except Exception as e:
        print("Error processing object {} from bucket {}. ".format(key, bucket))
        raise e

3. Define Auth Challenge Function

This is the decider function that manages the authentication flow. In the session array that’s provided to this Lambda function (event.request.session), the entire state of the authentication flow is present. If it’s empty, it means the custom authentication flow just started. If it has items, the custom authentication flow is underway, i.e. a challenge was presented to the user, the user provided an answer, and it was verified to be right or wrong. In either case, the decider function has to decide what to do next:

exports.handler = async (event, context) => {

    console.log("Define Auth Challenge: " + JSON.stringify(event));

    if (event.request.session &&
        event.request.session.length >= 3 &&
        event.request.session.slice(-1)[0].challengeResult === false) {
        // The user provided a wrong answer 3 times; fail auth
        event.response.issueTokens = false;
        event.response.failAuthentication = true;
    } else if (event.request.session &&
        event.request.session.length &&
        event.request.session.slice(-1)[0].challengeResult === true) {
        // The user provided the right answer; succeed auth
        event.response.issueTokens = true;
        event.response.failAuthentication = false;
    } else {
        // The user did not provide a correct answer yet; present challenge
        event.response.issueTokens = false;
        event.response.failAuthentication = false;
        event.response.challengeName = 'CUSTOM_CHALLENGE';
    }

    return event;
}

4. Create Auth Challenge Function

This function queries DynamoDB for a record containing the given e-mail, retrieves the ID of the corresponding indexed image in the Amazon Rekognition Collection, and defines a challenge requiring the user to provide a photo of the same person.

const aws = require('aws-sdk');
const dynamodb = new aws.DynamoDB.DocumentClient();

exports.handler = async (event, context) => {

    console.log("Create auth challenge: " + JSON.stringify(event));

    if (event.request.challengeName == 'CUSTOM_CHALLENGE') {
        event.response.publicChallengeParameters = {};

        let answer = '';
        // Querying for Rekognition ids for the e-mail provided
        const params = {
            TableName: process.env.COLLECTION_NAME,
            IndexName: "FullName-index",
            ProjectionExpression: "RekognitionId",
            KeyConditionExpression: "FullName = :userId",
            ExpressionAttributeValues: {
                ":userId": event.request.userAttributes.email
            }
        }
        
        try {
            const data = await dynamodb.query(params).promise();
            data.Items.forEach(function (item) {
                
                answer = item.RekognitionId;

                event.response.publicChallengeParameters.captchaUrl = answer;
                event.response.privateChallengeParameters = {};
                event.response.privateChallengeParameters.answer = answer;
                event.response.challengeMetadata = 'REKOGNITION_CHALLENGE';
                
                console.log("Create Challenge Output: " + JSON.stringify(event));
                return event;
            });
        } catch (err) {
            console.error("Unable to query. Error:", JSON.stringify(err, null, 2));
            throw err;
        }
    }
    return event;
}

5. Verify Auth Challenge Response Function

This function uses Amazon Rekognition to check whether it can find a face match, with a confidence level equal to or above 90%, for the photo uploaded during Sign In, and whether that match corresponds to the user identified by the given e-mail address.

var aws = require('aws-sdk');
var rekognition = new aws.Rekognition();

exports.handler = async (event, context) => {

    console.log("Verify Auth Challenge: " + JSON.stringify(event));
    let userPhoto = '';
    event.response.answerCorrect = false;

    // Searching existing faces indexed on Rekognition using the provided photo on s3

    const objectName = event.request.challengeAnswer;
    const params = {
        "CollectionId": process.env.COLLECTION_NAME,
        "Image": {
            "S3Object": {
                "Bucket": process.env.BUCKET_SIGN_UP,
                "Name": objectName
            }
        },
        "MaxFaces": 1,
        "FaceMatchThreshold": 90
    };
    try {
        const data = await rekognition.searchFacesByImage(params).promise();

        // Evaluates if Rekognition was able to find a match with the required 
        // confidence threshold

        if (data.FaceMatches[0]) {
            console.log('Face Id: ' + data.FaceMatches[0].Face.FaceId);
            console.log('Similarity: ' + data.FaceMatches[0].Similarity);
            userPhoto = data.FaceMatches[0].Face.FaceId;
            if (userPhoto) {
                if (event.request.privateChallengeParameters.answer == userPhoto) {
                    event.response.answerCorrect = true;
                }
            }
        }
    } catch (err) {
        console.error("Unable to query. Error:", JSON.stringify(err, null, 2));
        throw err;
    }
    return event;
}

The Front End Application

Now that we’ve stepped through all the Lambdas, let’s create a custom Sign In page, in order to orchestrate and test our scenario. You can use AWS Amplify Framework to integrate your Sign In page to Amazon Cognito and the photo uploads to Amazon S3.

The AWS Amplify Framework allows you to implement your application using your favorite framework (React, Angular, Vue, plain HTML/JavaScript, etc.), and you can customize the snippets below to fit your requirements. The following snippet demonstrates how to import and initialize the AWS Amplify Framework in React:

import Amplify from 'aws-amplify';

Amplify.configure({
  Auth: {
    region: 'your region',
    userPoolId: 'your userPoolId',
    userPoolWebClientId: 'your clientId',
  },
  Storage: { 
    region: 'your region', 
    bucket: 'your sign up bucket'
  }
});

Signing Up

For users to be able to sign themselves up, as mentioned above, we “generate” a random password on their behalf, since Amazon Cognito requires one for user sign up. However, because we configured our Cognito User Pool Client to only allow custom authentication, users never sign in with a username and password.

import { Auth } from 'aws-amplify';

signUp = async event => {
  const params = {
    username: this.state.email,
    password: getRandomString(30),
    attributes: {
      name: this.state.fullName
    }
  };
  await Auth.signUp(params);
};

function getRandomString(bytes) {
  const randomValues = new Uint8Array(bytes);
  window.crypto.getRandomValues(randomValues);
  return Array.from(randomValues).map(intToHex).join('');
}

function intToHex(nr) {
  return nr.toString(16).padStart(2, '0');
}
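
The indexing Lambda above reads the user's e-mail from the S3 object metadata (ret['Metadata']['email']), so the sign-up flow also needs to upload the document image with that metadata attached. The following is a minimal sketch of such an upload using Amplify Storage; the key name is illustrative, and the metadata option is an assumption about your aws-amplify version. If it isn't supported, you can attach the metadata by uploading with the S3 SDK directly.

import { Storage } from 'aws-amplify';

uploadDocument = async file => {
  // The key is illustrative; the indexing Lambda only cares about the
  // 'email' entry in the object's S3 metadata.
  await Storage.put(`${this.state.email}.png`, file, {
    contentType: 'image/png',
    // Assumption: your aws-amplify version supports the metadata option
    metadata: { email: this.state.email }
  });
};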

Signing in

This starts the custom authentication flow for the user.

import { Auth } from "aws-amplify";

signIn = async () => {
    try { 
        const user = await Auth.signIn(this.state.email);
        this.setState({ user });
    } catch (e) { 
        console.log('Oops...');
    } 
};

Answering the Custom Challenge

In this step, we open the camera through the browser to take a user photo and then upload it to Amazon S3, so we can start the face comparison.

import Webcam from "react-webcam";

// Instantiate and set webcam to open and take a screenshot
// when user is presented with a custom challenge

/* Webcam implementation goes here */
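
// A minimal sketch of the webcam wiring (an assumption, not the original
// sample code): keep a ref to the Webcam component so sendChallengeAnswer
// below can call this.webcam.getScreenshot() on it.
setRef = webcam => {
  this.webcam = webcam;
};

// Rendered while the user is answering the custom challenge, for example:
// <Webcam audio={false} ref={this.setRef} screenshotFormat="image/png" />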


// Converts the webcam screenshot (a data URL) into a File object
// that can be uploaded to S3 and used as the custom challenge answer
dataURLtoFile = (dataurl, filename) => {
  var arr = dataurl.split(','), mime = arr[0].match(/:(.*?);/)[1],
      bstr = atob(arr[1]), n = bstr.length, u8arr = new Uint8Array(n);
  while(n--){
      u8arr[n] = bstr.charCodeAt(n);
  }
  return new File([u8arr], filename, {type:mime});
};

sendChallengeAnswer = async () => {

    // Capture image from user camera and send it to S3
    const imageSrc = this.webcam.getScreenshot();
    const attachment = await s3UploadPub(dataURLtoFile(imageSrc, "id.png"));
    
    // Send the answer to the User Pool
    const answer = `public/${attachment}`;
    const user = await Auth.sendCustomChallengeAnswer(this.state.user, answer);
    this.setState({ user });
    
    try {
        // This will throw an error if the user is not yet authenticated:
        await Auth.currentSession();
    } catch {
        console.log('Apparently the user did not enter the right code');
    }
    
};

Conclusion

In this blog post, we implemented a facial recognition authentication mechanism using the custom authentication flows provided by Amazon Cognito combined with Amazon Rekognition. Depending on your organization's and workload's security criteria and requirements, this scenario can work well from both a security and a user experience point of view. Additionally, you can strengthen the security factor by chaining multiple auth challenges based not only on the user photo, but also on liveness detection, the document numbers used for signing up, and other additional MFA factors.

Since this is an entirely Serverless-based solution, you can customize it as your requirements arise using AWS Lambda functions. You can read more on custom authentication in our developer guide.

Resources

  • All the resources from the implementation mentioned above are available at GitHub. You can clone, change, deploy and run it yourself.
  • You can deploy this solution directly from the AWS Serverless Application Repository.

About the author

Enrico is a Solutions Architect at Amazon Web Services. He works in the Enterprise segment, helping customers from different businesses with their cloud journey. With more than 10 years of experience in solutions architecture, engineering, and DevOps, Enrico has worked directly with many customers designing, implementing, and deploying enterprise solutions.

 

from AWS Developer Blog https://aws.amazon.com/blogs/developer/authenticate-applications-through-facial-recognition-with-amazon-cognito-and-amazon-rekognition/

The AWS CLI and AWS SDK for Python will require Python 2.7+ or 3.4+ as their Python runtime

On January 10, 2020, in order to continue supporting our customers with tools that are secure and maintainable, AWS will release version 1.17 of AWS CLI and AWS SDK for Python version 1.13 (Botocore) and 1.10 (Boto3). These versions will require Python 2.7+ or Python 3.4+ runtime.

Per PSF (Python Software Foundation), Python 2.6.9 was “the final security-only source-only maintenance release of the Python 2.6 series”. With its release on October 29, 2013, PSF states that “all official support for Python 2.6 ended and was no longer being maintained for any purpose”. Per PSF, as of September 29, 2017, Python 3.3.x also reached end-of-life status.

Until this year, many Python projects and packages across the industry continued to support Python 2.6 and Python 3.3 as runtimes. However, these projects and package owners have now stopped supporting Python 2.6 and Python 3.3. Additionally, the Python Windows installers for Python 2.6/3.3 have not updated their bundled OpenSSL since those versions reached end of life and cannot support TLSv1.2+, which many AWS APIs require to access their services.

I’m currently using Python 2.6 or Python 3.3 as my runtime for AWS CLI or AWS SDK for Python. What should I do?

We recommend moving to a newer version of the Python runtime, either 2.7+ or 3.4+. These can be found at https://www.python.org/downloads.

If you are using the AWS CLI with Python 2.6 or 3.3 and are not ready to upgrade to a newer Python version, then you will need to take one of the below actions depending upon your installation method.

MSI Installer
If you install the AWS CLI using the Windows MSI Installer, you are not impacted by this deprecation and no changes are required.

Pip
If you install the AWS CLI or the AWS SDK for Python using pip, ensure that your pip invocation or requirements.txt file installs "awscli<1.17", such as:

$ pip install --upgrade --user "awscli<1.17"
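
If you pin versions in a requirements.txt file instead, the pin could look like the following; the botocore and boto3 bounds are inferred from the version numbers mentioned above rather than stated in the original announcement.

# requirements.txt: stay on releases that still support Python 2.6/3.3
awscli<1.17
botocore<1.13   # inferred: Botocore 1.13 requires Python 2.7+/3.4+
boto3<1.10      # inferred: Boto3 1.10 requires Python 2.7+/3.4+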

Bundled Installer
If you install the AWS CLI using the bundled installer, ensure that you download a copy of the bundled installer that still supports the Python 2.6 or 3.3 runtime. You can do this by downloading the file from "https://s3.amazonaws.com/aws-cli/awscli-bundle-{VERSION}.zip", replacing "{VERSION}" with the desired version of the CLI. For example, to download version 1.16.188 use:

$ curl https://s3.amazonaws.com/aws-cli/awscli-bundle-1.16.188.zip -o awscli-bundle.zip

Then continue following the installation instructions found in https://docs.aws.amazon.com/cli/latest/userguide/install-bundle.html, starting with step 2.

For additional help or questions go to the CLI user guide.

from AWS Developer Blog https://aws.amazon.com/blogs/developer/deprecation-of-python-2-6-and-python-3-3-in-botocore-boto3-and-the-aws-cli/

The AWS SDK for Java will no longer support Java 6

The AWS SDK for Java currently maintains two major versions: 1.11.x and 2.x. Customers on Java 8 or newer may use either 2.x or 1.11.x, and customers on Java 6 or newer may use 1.11.x.

Free updates to the Java 6 virtual machine (JVM) were stopped by Oracle in April 2013. Users that don't pay for extended JVM support need to upgrade their JVM to continue to receive any updates, including security updates. As of December 2018, Oracle no longer provides extended support for Java 6.

Additionally, Jackson, a popular JSON serialization library used by the AWS SDK for Java, stopped supporting Java 6 in July 2016 in the portions that the SDK uses. Therefore, as of November 15, 2019, new versions of the AWS SDK for Java 1.11.x will be released without support for Java 6 and will instead require Java 7 or newer. After this date, customers on Java 6 that upgrade their version of the AWS SDK for Java will receive “Java version mismatch” errors at runtime.

I’m currently using Java 6 and the AWS SDK for Java 1.11.x. What should I do?

We recommend moving to a newer Java runtime that still supports free updates. Here are some popular choices:

  1. Amazon Corretto 8 or 11
  2. Red Hat OpenJDK 8 or 11
  3. OpenJDK 11
  4. AdoptOpenJDK 8 or 11

If you are not ready to update to a newer Java version, then you can pin your AWS SDK for Java version to one that supports Java 6, which will continue to work. However, you will no longer receive new service updates, bug fixes or security fixes.

Why will Java 6 no longer be supported?

As noted previously, free updates to the Java 6 virtual machine (JVM) were stopped by Oracle in April 2013. Free users need to upgrade their JVM to continue to receive security updates. As of December 2018, Oracle no longer provides extended support for Java 6.

The AWS SDK for Java uses a small number of industry-standard dependencies. These dependencies provide the SDK with a larger feature set than would be possible if the functionality provided by these dependencies were to be developed in-house. Because Java 6 is now generally considered “unsupported”, many third party libraries have stopped supporting Java 6 as a runtime.

For example, Jackson, a popular library for JSON serialization, is used by the AWS SDK for Java as well as many other libraries in the Java ecosystem. In July 2016, portions of the Jackson library that are used by the AWS SDK stopped supporting Java 6. At the time, too many AWS customers would have been broken by removal of support for Java 6, and Jackson was too ingrained in the SDK’s public APIs to be removed without breaking a different set of customers. The AWS SDK for Java team froze the version of Jackson that they used and made sure that the Jackson features used by the SDK were not affected by known security issues.

Many things have changed since 2016, including: Java 6 is now generally considered “unsupported”, very few AWS customers use Java 6 on the updated AWS SDK for Java, and customers have begun to report that the old version of Jackson in their dependency graph is an issue. To maintain our customer focus, we will be raising the minimum Java version to Java 7 for the AWS SDK for Java and upgrading to use a newer version of Jackson.

from AWS Developer Blog https://aws.amazon.com/blogs/developer/the-aws-sdk-for-java-will-no-longer-support-java-6/

Amplify CLI simplifies starting from existing Amplify projects and adds new command for extending CLI capabilities

The Amplify Framework is an open source project for building cloud-enabled mobile and web applications. Today, we are happy to announce new features in the Amplify CLI that simplify the developer experience by enabling developers to get started from an existing Amplify project hosted on a Git repo by using a single command.

The ‘amplify init’ command with a new ‘--app’ parameter clones the git repo, initializes the project, deploys the backend, and configures the frontend (whether a web, iOS, or Android application) to use the backend. For web applications, the command also launches the application in your default browser. This helps speed up the process of setting up, developing, and testing existing Amplify projects.

In addition, the Amplify CLI now makes it easy to add custom capabilities via plugins. A new ‘amplify plugin init’ command generates a template for a plugin based on a few simple questions you answer. You can then extend these generated templates to add commands, events, and configuration for your plugin.

Getting started from an existing Amplify project

We will walk through an example of how you can use this new feature with an existing web application that was built using Amplify. We will use the Journey app hosted on a GitHub repo in our sample walkthrough.

Install and configure the Amplify CLI

$ npm install -g @aws-amplify/cli
$ amplify configure

The configure step will guide you through steps for creating a new IAM user. Select all default options.
If you already have the CLI configured, you do not need to run the configure command again.

Choose the Amplify app

For this blog post, we are using the Journey app which is built using React, AWS Amplify, GraphQL, AWS AppSync, and hosted on GitHub. You can use any other application that was built using Amplify. The repo for the application you use must include the following for this feature to work:

1. amplify/, .config/, and backend/ folders
2. ‘project-config.json’ in .config/ folder
3. ‘backend-config.json’ in backend/ folder
4. CloudFormation files in the backend/ folder, for example ‘schema.graphql’ for the API and a CloudFormation template for auth.

We pass the GitHub URL for the Journey app in the amplify init command as shown below:

$ amplify init --app https://github.com/kkemple/journey

The init command clones the GitHub repo, initializes the CLI, creates a ‘sampledev’ environment in CLI, detects and adds categories, provisions the backend, pushes the changes to the cloud, and starts the app.

If you already have an AWS profile set up on your local machine, choose “Yes” when prompted by the CLI and select the profile you would like to use.

? Do you want to use an AWS profile? Yes
? Please choose the profile you want to use (Use arrow keys)
❯ default
  triggers
  getstarted

If you do not have an AWS profile set up on your local machine, you will be prompted by the CLI to set up a new AWS profile.

Provisioning the frontend and backend

Next, the CLI will detect the ‘amplify’ folder in the source code of your app and start provisioning the backend resources for your app using your AWS account and the profile selected in the previous step. It will also detect and install the frontend package dependencies that the web application needs by reading the package.json file.

Once the process is complete, the CLI will automatically open the app in your default browser.

Auth

The Journey app has the auth category (sign-in) added to its backend. You can add users who can log in by using the amplify auth console command. This opens the Amazon Cognito console in your browser, where you can add users and groups.

$ amplify auth console
? Which console
❯ User Pool
Identity Pool
Both

To add a user, go to General Settings -> Users and groups -> Create user. In the create user dialog box, uncheck the "Send an invitation to this new user?" option as shown below:

Restart your app after you add the users in the Cognito console using:

$ yarn start

 


As you can see, this simplifies the entire process of setting up an existing app built using Amplify. You no longer have to download the source code, figure out the categories, provision resources, push changes to the cloud, and then start your app. This helps speed up the development and testing process, enabling you to focus more on the frontend.

Custom capabilities in the Amplify CLI

The Amplify CLI has a pluggable architecture which allows you to write plugins that are one of the following types: category, provider, frontend, and util. To learn more about the types of plugins, refer to our documentation.

Before this release, creating a plugin for the CLI required a series of manual steps, including creating directories for the plugin source code, adding code for commands and events, creating extensions as needed, testing your plugin, and publishing it to NPM. These manual steps made the overall process time-consuming.

Now, the Amplify CLI plugin platform enables you to create and configure the entire skeleton of a plugin package with a few commands.

In the example below, we will create a simple util plugin called my-amplify-plugin and add a version command to it that outputs the version of our plugin. A util plugin typically does not manage any backend resources in the cloud but can be used by other plugins.

Add a new plugin

First, run the amplify plugin init command from an empty directory. Alternatively, you can use its alias, amplify plugin new. This command asks you a few simple questions, which are used as input to create the skeleton of a new plugin package. Once created, this plugin will show up in your local installation of the Amplify CLI. You can view all of the active plugins by using the amplify plugin list command.

$ amplify plugin init
? What should be the name of the plugin: my-amplify-plugin
? Specify the plugin type util
? What Amplify CLI events do you want the plugin to handle? (Press <space> to select, <a> to toggle all, <i> to invert selection)
❯◉ PreInit
 ◉ PostInit
 ◉ PrePush
 ◉ PostPush
 ◯ Learn more
The plugin package my-amplify-plugin has been successfully setup.

Once the above command completes, it creates a directory called ‘my-amplify-plugin’ in the folder where you ran the command. Let's take a look at what's inside this folder:

my-amplify-plugin/
    - amplify-plugin.json
    - commands/
    - event-handlers/
    - index.js
    - package.json

amplify-plugin.json is the plugin manifest file which contains the plugin name, type, commands, and event handlers.

commands/ folder contains the code for each command you add to the plugin.

event-handlers/ folder contains the event handlers you specify in the amplify-plugin.json file.

index.js is the entry point of the plugin invoked by the Amplify CLI.

package.json contains the package information about the plugin.
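
To make this concrete, here is a rough sketch of what the generated amplify-plugin.json for this plugin might contain; the field names follow the description above and may differ slightly in your generated file.

{
  "name": "my-amplify-plugin",
  "type": "util",
  "commands": ["version", "help"],
  "eventHandlers": ["PreInit", "PostInit", "PrePush", "PostPush"]
}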

Update version.js

The init command automatically creates templates for a few commands, such as help and version. These templates can be found at commands/help.js and commands/version.js.

Modify the version.js file to say “Version 1.0.0”

async function run(context) {
  // print out the version of your plugin package
  context.print.info('Version 1.0.0');
}

module.exports = {
  run,
};

Install the plugin package

Once you have updated the version.js file, install the plugin package locally by running the following command from the root of the plugin package folder, which in our case is my-amplify-plugin:

$ npm install -g

Run the plugin

To use the plugin, run the following command:

$ amplify my-amplify-plugin version
Version 1.0.0

As seen above, it prints out the version of the plugin.

Publish and use the plugin

You can publish the plugin to NPM using the instructions here.
Once your plugin is published, other developers can install and use it using the following commands:

$ npm install -g my-amplify-plugin
$ amplify plugin add my-amplify-plugin
$ amplify my-amplify-plugin version

To learn more about the complete plugin platform and commands, visit our documentation.

Feedback

We hope you like these new features! Let us know how we are doing, and submit any feedback in the Amplify Framework GitHub repository. You can read more about AWS Amplify on the AWS Amplify website.

from AWS Mobile Blog

Removing the vendored version of requests from Botocore

We'd like to give additional visibility to an upcoming change to Botocore, a dependency of Boto3, the AWS SDK for Python. Starting 10/21/19, we will be removing the vendored version of the requests library in Botocore. In this post, we'll cover the key details.

In August of last year, we made significant improvements to the internals of Botocore to allow for pluggable HTTP clients. A key part of the internal refactoring was changing the HTTP client library from the requests library to urllib3. As part of this change, we also decided to unvendor our HTTP library, which allows us to support a range of urllib3 versions instead of depending on a specific one. This meant that we no longer used the vendored version of requests in Botocore and could remove this unused code. See the GitHub pull request for more information.

If you’re using the vendored version of requests in Botocore, you’ll see the following warning:

./botocore/vendored/requests/api.py:67: DeprecationWarning: You are using the get() function from 'botocore.vendored.requests'.
This is not a public API in botocore and will be removed in the future. Additionally, this version of requests is out of date.
We recommend you install the requests package, 'import requests' directly, and use the requests.get() function instead.
DeprecationWarning

You can migrate away from this by installing requests into your python environment and importing requests directly:

Before

from botocore.vendored import requests
response = requests.get('https://...')

After

$ pip install requests
import requests
response = requests.get('https://...')

The associated PR https://github.com/boto/botocore/pull/1829 has a branch you can use to test this change before it's merged into an official release.

Please let us know if you have any questions or concerns in the GitHub pull request.

from AWS Developer Blog https://aws.amazon.com/blogs/developer/removing-the-vendored-version-of-requests-from-botocore/

Working with the AWS Cloud Development Kit and AWS Construct Library

The AWS Cloud Development Kit (CDK) is a software development framework for defining your cloud infrastructure in code and provisioning it through AWS CloudFormation. The AWS CDK allows developers to define their infrastructure in familiar programming languages such as TypeScript, Python, C# or Java, taking advantages of the features those languages provide.

As an AWS Solutions Architect working with Digital Native Businesses in the UK, I work directly with many companies that tend to build their own solutions to the problems they encounter and commonly embrace Infrastructure-as-Code practices.

When I speak with these customers about the AWS CDK, the most common question I get is, “how much of AWS CloudFormation is covered by the AWS CDK?” The short answer is all of it. The long answer, as we explore in this post, is more nuanced and requires understanding the different layers of abstraction in the AWS Construct Library.

Layers in the AWS Construct Library

The AWS CDK includes the AWS Construct Library, a broad set of modules that expose APIs for defining AWS resources in CDK applications. Each module in this library contains constructs, the basic building blocks for CDK apps, that encapsulate everything CloudFormation needs to create AWS resources. There are three different levels of CDK constructs in the library: CloudFormation Resource Constructs, AWS Constructs, and Pattern Constructs. CloudFormation Resource Constructs and AWS Constructs are packaged together in the same module and named after the AWS service they represent, aws-s3 for example. Pattern Constructs are packaged in their own module and have the patterns suffix, like aws-ecs-patterns.

CloudFormation Resource Constructs are the lowest-level constructs. They mirror the AWS CloudFormation Resource Types and are updated with each release of the CDK. This means that you can use the CDK to define any resource that is available to AWS CloudFormation and can expect them to be up-to-date. When you use CloudFormation Resources, you must explicitly configure all of the resource’s properties, which requires you to completely understand the details of the underlying resource model. You can quickly identify this construct layer by looking for the ‘Cfn’ prefix. If it starts with those three letters, then it is a CloudFormation Resource Construct and maps directly to the resource type found in the CloudFormation reference documentation.

AWS Constructs also represent AWS services and leverage CloudFormation Resource Constructs under-the-hood, but they provide a higher-level, intent-based API. They are designed specifically for the AWS CDK and handle much of the configuration details and boilerplate logic required by the CloudFormation Resources. AWS Constructs offer proven default values and provide convenient methods that make it simpler to work with the resource, reducing the need to know all the details about the CloudFormation resources they represent. In some cases, these constructs are contributed by the open source community and reviewed by the AWS CDK team for inclusion in the library. A good example of this is the Amazon Virtual Private Cloud (VPC) construct which I cover in more detail below.

Finally, the library includes even higher-level constructs, which are called Pattern Constructs. These constructs generally represent reference architectures or design patterns that are intended to help you complete common tasks in AWS. For example, the aws-ecs-patterns.LoadBalancedFargateService construct represents an architecture that includes an AWS Fargate container cluster that sits behind an Elastic Load Balancer (ELB). The aws-apigateway.LambdaRestApi construct represents an Amazon API Gateway API that’s backed by an AWS Lambda function.
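
To illustrate how compact a Pattern Construct can be, here is a hedged sketch of the aws-apigateway.LambdaRestApi pattern mentioned above, written as it would appear inside a stack's constructor. It assumes the @aws-cdk/aws-apigateway and @aws-cdk/aws-lambda modules are installed, and the asset path is a placeholder.

import apigateway = require('@aws-cdk/aws-apigateway');
import lambda = require('@aws-cdk/aws-lambda');

// One pattern construct wires up the REST API, the proxy integration,
// and the permissions to invoke the Lambda function.
const handler = new lambda.Function(this, 'ApiHandler', {
  runtime: lambda.Runtime.NODEJS_10_X,
  handler: 'index.handler',
  code: lambda.Code.fromAsset('lambda'), // placeholder asset path
});

new apigateway.LambdaRestApi(this, 'Api', { handler });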

I expect you to use the highly abstracted AWS Constructs and Pattern Constructs whenever possible because of the convenience and time savings they provide. However, the CDK is new and AWS service coverage at these upper layers is not yet complete. What do you do when high-level coverage is absent for your CDK use cases? In the remainder of this post, I will show you how to use the CloudFormation Resource layer when AWS Constructs are not available. I will also show you how to use “overrides” for situations where a high-level AWS Construct is available, but a specific underlying CloudFormation property you want to configure is not directly exposed in its API.

Prerequisites

You will need an active AWS account if you want to use the examples detailed in this post. You’ll be using only a few items that have an hourly billing figure – specifically NAT Gateways. Please check the pricing pages for this feature to understand any costs that may be incurred.

A basic understanding of a Terminal/CLI environment is recommended to run through everything here, but even without it, following along to learn the concepts should be fine.

First, follow the Getting Started guide to set up your computer or AWS Cloud9 environment to use the AWS CDK. It will also guide you on initializing new templates which you’ll be doing a few times in this post.

You will also need a text editor. If you're working within AWS Cloud9, you have one provided within the console. I used Visual Studio Code, which has TypeScript support built into the default install, to write this post. If using Visual Studio Code, I recommend the EditorConfig for VS Code extension, as it will automatically handle different file formats and their space/tab requirements for you.

Building a VPC with the CloudFormation Resource Construct layer

You will start by building a very basic VPC using CFN Resource constructs only.

Walkthrough

  1. Open up a Terminal in the environment you configured in Getting Started
  2. You will call this Stack vpc, so make a folder named `vpc` and change to this folder in your terminal
  3. Then initialize a new AWS CDK project using the following command:

cdk init app --language=typescript

4. Now, install the aws-ec2 library which contains both the CloudFormation Resource layer and AWS layer constructs for VPCs.

npm install @aws-cdk/aws-ec2

5. In your favorite text editor, open up the vpc-stack.ts file in the lib folder

  6. The structure is pretty straightforward. You have imports at the top; these are similar to imports in Python or Java, where you include different libraries and features. The AWS CDK template also marks a convenient place to start defining your stack, highlighted by a comment.

7. Import the ec2 library by adding the import statement on the second line:

import ec2 = require('@aws-cdk/aws-ec2');

8. Now define your VPC using the following code where it says ‘The code that defines your stack goes here’:

    new ec2.CfnVPC(this, "MyVPC", {
      cidrBlock: "10.0.0.0/16",
    });

9. Your code should now look like this:

10. Next you need to compile your TypeScript code to JavaScript for the CDK tool to then deploy it. You’ll first run the build command to do this, and then go ahead and deploy your stack. So, using your Terminal window again:

npm run build

> vpc@0.1.0 build /Users/leepac/Code/_blog/vpc
> tsc

cdk deploy

11. Head to the AWS Console’s CloudFormation Section, and select VpcStack and then Resources. If the stack doesn’t appear, you might have the wrong region selected, and you can use the Region drop down to select the right one.

12. Click on the VPC resource Physical ID link and you get taken to the VPC Dashboard section of the Console. You can then filter the VPCs listed in the console by selecting the same Physical ID in the filter by VPC box:

13. In this filtered view, take a look at your VPC and click on Subnets in the left hand pane to see your subnets. You’ll find it empty – why?

Because you used a CloudFormation Resource Construct, the CfnVPC resource deploys only an empty VPC rather than the resources needed to start deploying things like Amazon EC2 instances. You could start adding subnets and other resources to your VPC using the ‘Cfn*’ constructs available, but you would need to work out things like the CIDRs and references yourself, as shown in the sketch below.
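
Here is a hedged sketch of what that manual wiring looks like with CfnSubnet; the CIDR block and Availability Zone values are placeholders that you would have to choose and keep consistent yourself.

const vpc = new ec2.CfnVPC(this, "MyVPC", {
  cidrBlock: "10.0.0.0/16",
});

// Every subnet needs its own CIDR slice, an Availability Zone, and an
// explicit reference back to the VPC; none of this is derived for you
// at the CloudFormation Resource layer.
new ec2.CfnSubnet(this, "PublicSubnetA", {
  vpcId: vpc.ref,                  // manual reference to the VPC
  cidrBlock: "10.0.0.0/24",        // placeholder CIDR you carve out yourself
  availabilityZone: "eu-west-1a",  // placeholder AZ
  mapPublicIpOnLaunch: true,
});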

Next you will move up to the higher-level VPC construct and see what it does.

Moving to the higher-level AWS Construct layer

The Amazon EC2 module of the AWS CDK provides a way to make a usable VPC with the same amount of code you wrote just to deploy a VPC object with no subnets. The CDK documentation covers all the various options; by default, it creates a well-architected VPC with both private and public subnets in up to three Availability Zones within a region, with NAT Gateways in each AZ.

Walkthrough

1. Change the ‘CfnVPC’ to just `Vpc` and `cidrBlock` to `cidr`:
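
The updated stack definition would then look roughly like this (a sketch of the one-line change described above):

    new ec2.Vpc(this, "MyVPC", {
      cidr: "10.0.0.0/16",
    });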

2. Go back to the Terminal, run the build, and then look at what this will do by using the CDK's `diff` command:

npm run build

> vpc@0.1.0 build /Users/leepac/Code/_blog/vpc
> tsc

cdk diff

3. That's a lot more stuff! The great thing about the ‘diff’ command is that you can quickly see what will change when the stack is deployed. Go ahead and deploy using the `cdk deploy` command; this will take up to 15 minutes to fully deploy, as the NAT Gateways need to be fully provisioned.

4. Head over to the AWS Console’s CloudFormation section again and select the VpcStack and Resources – you’ll see there’s a lot more now!

5. Feel free to explore the VPC section of the console again by clicking on the VPC Physical ID link and having a browse. You now have a VPC that can be used for deploying resources.

As you can see, it’s worth looking at higher-level AWS Constructs! However, as an engineer, I always find high level abstractions are just that, abstractions. It can feel like I’m giving up the flexibility I get with lower level resources.

In the last section, you’ll find out how the AWS CDK gives you the flexibility to override the high-level abstractions when necessary.

Overriding parts of AWS Constructs

The AWS CDK provides a way to break out the AWS CloudFormation Constructs that make up AWS Construct resources to quickly extend functionality. This is best illustrated in a code example. In the lib folder of your CDK project, create a new file called s3-stack.ts, then copy or type the following code:

import cdk = require('@aws-cdk/core');
import s3 = require('@aws-cdk/aws-s3');

export class S3Stack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Create a logging bucket
    const loggingBucket = new s3.Bucket(this, "LoggingBucket", {
      bucketName: "leerandomexamplelogging",
    });

    // Create my bucket to be logged
    const higherLevelBucket = new s3.Bucket(this, "MyBucket", {
      bucketName: "leerandomexamplebucket",
    });

    // Extract the CfnBucket from the L2 Bucket above
    const bucketResource = higherLevelBucket.node.findChild('Resource') as s3.CfnBucket;

    // Override logging configuration to point to my logging bucket
    bucketResource.addPropertyOverride('LoggingConfiguration', {
      "DestinationBucketName": loggingBucket.bucketName,
    });
  }
}

Because the AWS CDK builds a virtual tree of resources, you can take advantage of the findChild() method to traverse the tree of your higherLevelBucket to get the CloudFormation resource (in this case a CfnBucket). You can then use the addPropertyOverride method to set the specific property you wish to use. In this case, you add a LoggingConfiguration that points to your loggingBucket. The AWS CDK will take care of referencing for you – use the cdk synth subcommand to look at this in CloudFormation YAML format:

In the output you can see lines containing aws:cdk:path, which quickly tell you where to find the Cfn* type. This is how you find out that the CfnBucket is part of MyBucket and called Resource, giving you the parameter to pass into findChild().

Conclusions

Today you learned about the differences between AWS CloudFormation Constructs and higher-level AWS Constructs. You also learned that AWS CloudFormation Constructs are automatically generated and updated from the CloudFormation reference, while AWS Constructs are more curated, with opinionated patterns.

You have also learned how to tell which type a construct is by its prefix, with AWS CloudFormation Constructs being prefixed with ‘Cfn’. You then used one of these constructs, CfnVPC, to deploy a VPC and discovered that, because of this exact mapping, you would need to use multiple constructs to build a complete VPC.

You then looked at the higher-level ‘Vpc’ AWS Construct and how it builds out a full VPC that contains both private and public subnets with NAT Gateways, a common deployment among customers looking to deploy Amazon EC2 applications.

 

 

from AWS Developer Blog https://aws.amazon.com/blogs/developer/working-with-the-aws-cloud-development-kit-and-aws-construct-library/

Running end-to-end Cypress tests for your fullstack CI/CD deployment with Amplify Console

This article was written by Nikhil Swaminathan, Sr. Technical Product Manager, AWS.

Amplify Console now officially supports end-to-end (E2E) testing as part of your continuous deployment pipeline. E2E tests allow you to test your whole application from start to finish. Writing unit tests for the separate components of your app (e.g. a product search flow or a checkout flow), while useful, does not verify that the different components work together properly. E2E tests simulate real user scenarios and ensure all the pieces work together (e.g. searching for a product and then purchasing it).

In this blog post, we are going to build a React app using Amplify’s Authentication UI, and add Cypress tests to verify the login flow.

Set up project locally

To get set up, create a new React app, then install the Amplify CLI and the Amplify JS libraries.

create-react-app amplify-cypress 
cd amplify-cypress
npm i @aws-amplify/cli
yarn add aws-amplify aws-amplify-react

Initialize your Amplify project and accept all defaults.

amplify init
? Enter a name for the environment: dev

Add authentication to your backend.

amplify add auth
amplify push

Copy and paste the following code to your App.js

  
import React, { Component } from 'react';
import logo from './logo.svg';
import './App.css';
import { withAuthenticator } from 'aws-amplify-react'
import Amplify, { Auth } from 'aws-amplify';
import aws_exports from './aws-exports';
Amplify.configure(aws_exports);

class App extends Component {
  render() {
    return (
      <div className="App">
        <header className="App-header">
          <img src={logo} className="App-logo" alt="logo" />
          <p>
            Edit <code>src/App.js</code> and save to reload.
          </p>
          <a
            className="App-link"
            href="https://reactjs.org"
            target="_blank"
            rel="noopener noreferrer"
          >
            Learn React
          </a>
        </header>
      </div>
    );
  }
}
export default withAuthenticator(App, true);

Run your project locally by running yarn start. You should see the Amplify Auth UI. Go ahead and create a dummy user (record the login information as we will use it later!).

Add E2E tests to your app with Cypress

Cypress is a popular JavaScript-based testing framework for running E2E tests in the browser. Install Cypress locally, and then run yarn run cypress open. This command launches the Cypress app which bootstraps a cypress folder in your repository with all the test spec files.

cd myapp && yarn add cypress --dev 
yarn run cypress open

Let’s add a test spec called authenticator_spec.js to test our sign-in flow.

touch cypress/integration/authenticator_spec.js

All tests cover three phases. Here is what we do in each phase:

  1. Set up the application state: Visit the sign-in page.
  2. Take an action: Enter the information for the dummy user created above and click on Sign-in.
  3. Make an assertion: If the resulting page contains the Sign-out button, then the user successfully signed in.

Copy and paste the following code to the authenticator_spec.js. Please replace DUMMY_USERNAME and DUMMY_PASSWORD with the login details for the user you created above.

describe('Authenticator:', function() {
  // Step 1: setup the application state
  beforeEach(function() {
    cy.visit('/');
  });
  
  describe('Sign In:', () => {
    it('allows a user to signin', () => {
      // Step 2: Take an action (Sign in)
      cy.get(selectors.usernameInput).type("DUMMY_USERNAME");
      cy.get(selectors.signInPasswordInput).type("DUMMY_PASSWORD");
      cy.get(selectors.signInSignInButton).contains('Sign In').click();

      // Step 3: Make an assertion (Check for sign-out text)
      cy.get(selectors.signOutButton).contains('Sign Out');
    });
  });

});
export const selectors = {
  // Auth component classes
  usernameInput: '[data-test="username-input"]',
  signInPasswordInput: '[data-test="sign-in-password-input"]',
  signInSignInButton: '[data-test="sign-in-sign-in-button"]',
  signOutButton: '[data-test="sign-out-button"]'
}

Update the cypress.json file with { "baseUrl": "http://localhost:3000/" } to ensure the tests are run against your localhost port. Launch the Cypress app (yarn run cypress open) and run the test. This will launch a Chrome browser which executes the authenticator_spec.js. If you wrote the test correctly, you should see a screen with 1 success and 0 errors.

Set up continuous delivery with the Amplify Console

The Amplify Console provides a continuous deployment and hosting service for your fullstack app. Amplify Console hosting is fully managed, offering features such as instant cache invalidation, atomic deployments, easy custom domain setup, and password protection. To get started, first push your code to a Git provider of your choice.

git commit -am 'Added auth ui'
git push origin master

Log in to the Amplify Console, pick your Git provider, repository and branch, and choose Next. The Amplify Console automatically detects your build settings by inspecting your repository and finding React, Cypress, and Amplify.

Amplify Console detects the existing backend you created with the Amplify CLI – connect the dev backend you created in the CLI to automatically deploy fullstack changes on every git push. Create a service role so Amplify Console can access your backend resources.

Review the build settings – you should notice a test phase with Cypress commands that are automatically detected by Amplify Console to execute tests. You can now save and deploy your app.

 

The Amplify Console deploys your app to an amplifyapp.com domain. When the build completes, you should see a successful pipeline, and be able to access your deployed site.

Choosing the Test icon will take you to the test report dashboard. This report contains a list of all the executed specs, with links to a recording for each test spec. These recordings are particularly useful when tests fail and you want to investigate the failure. The screenshot below shows a more comprehensive set of tests we run on our Amplify UI component.

Your app is now set up with CI/CD and E2E tests! You can build new features with E2E tests and deploy updates with a git push. Every commit is regression tested; if your E2E tests fail for any reason, the Amplify Console will fail the deployment, preventing regressions from being released to production.

Conclusion

This post showed you how to add backend functionality to your app with the Amplify CLI, write E2E tests with Cypress to test fullstack functionality, and finally set up continuous deployment and hosting for your app with the Amplify Console.

from AWS Mobile Blog

Automated Performance Regression Detection in the AWS SDK for Java 2.0

We are happy to share that we've added automated performance regression tests to the AWS SDK for Java 2.0. With this benchmark harness, every change to the SDK will be tested for performance before release, to avoid potential performance regressions. We understand that performance is critical to our customers, and we've prioritized improving various performance aspects of the AWS SDK.

In the past, we relied on pull request reviews to catch code changes that looked like they would cause performance issues. With this approach, although rare, a simple line of code could be overlooked and end up causing performance issues. At times, performance regressions were also caused by newer versions of SDK dependencies, and there was no easy way to monitor and quantify these performance impacts on the SDK. With the benchmark tests, we are now able to detect performance regressions in changes before they are merged into master. The benchmark harness code is open source, resides in the same repository as the AWS SDK for Java 2.0, and is implemented using the Java Microbenchmark Harness (JMH).

How to Run Benchmarks

To run the benchmarks, you first need to build them using mvn clean install -P quick -pl :sdk-benchmarks -am. Then, trigger the benchmarks using one of the following options:

Option 1:  Use the executable JAR

cd test/sdk-benchmarks
# Run a specific benchmark
java -jar target/benchmarks.jar ApacheHttpClientBenchmark
 
# Run all benchmarks: 3 warm up iterations, 3 benchmark iterations, 1 fork
java -jar target/benchmarks.jar -wi 3 -i 3 -f 1

Option 2: Use maven command to invoke BenchmarkRunner main method to run all benchmarks

mvn install -pl :bom-internal
cd test/sdk-benchmarks
mvn exec:exec

Option 3: Run the main method within each Benchmark class from your IDE

You can also run the main method within each Benchmark class from your IDE. If you are using Eclipse, you might need to set up build configurations for JMH annotations (check out the JMH page to learn how). Note that, per JMH recommendations, using Maven or the executable JAR (options 1 and 2 above) is preferred over running from within an IDE: IDE setup is a bit more complex and could yield less reliable results.

How the Benchmark Harness Works

When the benchmark tool is triggered, it first runs a set of predefined scenarios with different HTTP clients sending requests to local mock servers. It then measures the throughput of the current revision and compares the results with the existing baseline results computed from the previously released version. The performance tests fail if the throughput of the new change decreases by more than a certain threshold. Running the benchmark tests for every pull request allows us to block pull requests with problematic changes before they are merged.

As an added bonus, the benchmark tool makes it easier to monitor SDK performance over time, and when a performance improvement is made, we can inform customers with quantified performance gains so that they can benefit immediately from upgrading the SDK. The baseline data generated from the benchmark harness also provides useful information, such as which HTTP client has the best performance, or how much throughput can be gained by tuning SDK configurations. For example, switching to the OpenSSL provider for NettyAsyncHttpClient yields about 10% higher throughput according to our benchmarks.

With automated performance checks, we expect to limit unanticipated performance degradation in new version releases. If you’re contributing to the SDK, we also encourage you to run the benchmark harness locally to check whether your changes have a performance impact.
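To make the comparison concrete, here is a minimal sketch of the underlying idea – compare each benchmark’s measured throughput against a stored baseline and flag it when the drop exceeds an allowed threshold. It is written in TypeScript purely for illustration (the SDK’s actual harness is implemented in Java with JMH), and the names and the 5% threshold are assumptions.

interface BenchmarkResult {
  name: string
  throughput: number // operations per second; higher is better
}

// Flag any benchmark whose throughput dropped below the baseline by more than
// the allowed fraction (5% here, purely illustrative).
function findRegressions(
  baseline: BenchmarkResult[],
  current: BenchmarkResult[],
  allowedDrop = 0.05
): string[] {
  const baselineByName = new Map<string, number>()
  for (const result of baseline) {
    baselineByName.set(result.name, result.throughput)
  }
  return current
    .filter(result => {
      const base = baselineByName.get(result.name)
      return base !== undefined && result.throughput < base * (1 - allowedDrop)
    })
    .map(result => `${result.name}: throughput regressed below the recorded baseline`)
}

// Example: fail the build if any regression is found
const regressions = findRegressions(
  [{ name: 'ApacheHttpClientBenchmark', throughput: 1000 }],
  [{ name: 'ApacheHttpClientBenchmark', throughput: 900 }]
)
if (regressions.length > 0) {
  throw new Error(`Performance regression detected:\n${regressions.join('\n')}`)
}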

from AWS Developer Blog https://aws.amazon.com/blogs/developer/automated-performance-regression-detection-in-the-aws-sdk-for-java-2-0/

Testing infrastructure with the AWS Cloud Development Kit (CDK)

Testing infrastructure with the AWS Cloud Development Kit (CDK)

The AWS Cloud Development Kit (CDK) allows you to describe your application’s infrastructure using a general-purpose programming language, such as TypeScript, JavaScript or Python. This opens up familiar avenues for working with your infrastructure, such as using your favorite IDE, getting the benefit of autocomplete, creating abstractions in a familiar way, distributing them using your ecosystem’s standard package manager, and of course: writing tests for your infrastructure like you would write tests for your application.

In this blog post you will learn how to write tests for your infrastructure code in TypeScript using Jest. The code for JavaScript will be the same (sans the types), while the code for Python would follow the same testing patterns. Unfortunately, there are no ready-made Python libraries for you to use yet.

Approach

The pattern for writing tests for infrastructure is very similar to how you would write them for application code: you define a test case as you would normally do in the test framework of your choice. Inside that test case you instantiate constructs as you would do in your CDK app, and then you make assertions about the AWS CloudFormation template that the code you wrote would generate.

The one thing that’s different from normal tests are the assertions that you write on your code. The TypeScript CDK ships with an assertion library (@aws-cdk/assert) that makes it easy to make assertions on your infrastructure. In fact, all of the constructs in the AWS Construct Library that ship with the CDK are tested in this way, so we can make sure they do—and keep on doing—what they are supposed to do. Our assertions library is currently only available to TypeScript and JavaScript users, but will be made available to users of other languages eventually.

Broadly, there are a couple of classes of tests you will be writing:

  • Snapshot tests (also known as “golden master” tests). Using Jest, these are very convenient to write. They assert that the CloudFormation template the code generates is the same as it was when the test was written. If anything changes, the test framework will show you the changes in a diff. If the changes were accidental, you’ll go and update the code until the test passes again, and if the changes were intentional, you’ll have the option to accept the new template as the new “golden master”.
    • In the CDK itself, we also use snapshot tests as “integration tests”. Rather than individual unit tests that only look at the CloudFormation template output, we write a larger application using CDK constructs, deploy it and verify that it works as intended. We then make a snapshot of the CloudFormation template, that will force us to re-deploy and re-test the deployment if the generated template starts to deviate from the snapshot.
  • Fine-grained assertions about the template. Snapshot tests are convenient and fast to write, and provide a baseline level of security that your code changes did not change the generated template. The trouble starts when you purposely introduce changes. Let’s say you have a snapshot test to verify output for feature A, and you now add a feature B to your construct. This changes the generated template, and your snapshot test will break, even though feature A still works as intended. The snapshot can’t tell which part of the template is relevant to feature A and which part is relevant to feature B. To combat this, you can also write more fine-grained assertions, such as “this resource has this property” (and I don’t care about any of the others).
  • Validation tests. One of the advantages of general-purpose programming languages is that we can add additional validation checks and error out early, saving the construct user some trial-and-error time. You would test those by using the construct in an invalid way and asserting that an error is raised.

An example: a dead letter queue

Let’s say you want to write a DeadLetterQueue construct. A dead letter queue is used to hold another queue’s messages if they fail delivery too many times. It’s generally bad news if messages end up in the dead letter queue, because it indicates something is wrong with the queue processor. To that end, your DeadLetterQueue will come with an alarm that fires if there are any items in the queue. It is up to the user of the construct to attach any actions to the alarm firing, such as notifying an SNS topic.

Start by creating an empty construct library project using the CDK CLI and install some of the construct libraries we’ll need:

$ cdk init --language=typescript lib
$ npm install @aws-cdk/aws-sqs @aws-cdk/aws-cloudwatch

The CDK code might look like this (put this in a file called lib/dead-letter-queue.ts):

import cloudwatch = require('@aws-cdk/aws-cloudwatch');
import sqs = require('@aws-cdk/aws-sqs');
import { Construct, Duration } from '@aws-cdk/core';

export class DeadLetterQueue extends sqs.Queue {
  public readonly messagesInQueueAlarm: cloudwatch.IAlarm;

  constructor(scope: Construct, id: string) {
    super(scope, id);

    // Add the alarm
    this.messagesInQueueAlarm = new cloudwatch.Alarm(this, 'Alarm', {
      alarmDescription: 'There are messages in the Dead Letter Queue',
      evaluationPeriods: 1,
      threshold: 1,
      metric: this.metricApproximateNumberOfMessagesVisible(),
    });
  }
}

Writing a test

You’re going to write a test for this construct. First, start off by installing Jest and the CDK assertion library:

$ npm install --save-dev jest @types/jest @aws-cdk/assert

You also have to edit the package.json file in your project to tell NPM to run Jest, and tell Jest what kind of files to collect:

{
  ...
  "scripts": {
    ...
    "test": "jest"
  },
  "devDependencies": {
    ...
    "@types/jest": "^24.0.18",
    "jest": "^24.9.0"
  },
  "jest": {
    "moduleFileExtensions": ["js"]
  }
}

You can now write a test. A good place to start is checking that the construct creates the queue with its alarm. The simplest kind of test you can write is a snapshot test, so start with that. Put the following in a file named test/dead-letter-queue.test.ts:

import { SynthUtils } from '@aws-cdk/assert';
import { Stack } from '@aws-cdk/core';

import dlq = require('../lib/dead-letter-queue');

test('dlq creates an alarm', () => {
  const stack = new Stack();
  new dlq.DeadLetterQueue(stack, 'DLQ');
  expect(SynthUtils.toCloudFormation(stack)).toMatchSnapshot();
});

You can now compile and run the test:

$ npm run build
$ npm test

Jest will run your test and tell you that it has recorded a snapshot from your test.

PASS  test/dead-letter-queue.test.js
 ✓ dlq creates an alarm (55ms)
 › 1 snapshot written.
Snapshot Summary
› 1 snapshot written

The snapshots are stored in a directory called __snapshots__. If you look at the snapshot, you’ll see it just contains a copy of the CloudFormation template that our stack would generate:

exports[`dlq creates an alarm 1`] = `
Object {
  "Resources": Object {
    "DLQ581697C4": Object {
      "Type": "AWS::SQS::Queue",
    },
    "DLQAlarm008FBE3A": Object {
     "Properties": Object {
        "AlarmDescription": "There are messages in the Dead Letter Queue",
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
...

Congratulations! You’ve written and run your first test. Don’t forget to commit the snapshots directory to version control so that the snapshot gets stored and versioned with your code.

Using the snapshot

To make sure the test is working, you’re going to break it to make sure the breakage is detected. To do this, in your dead-letter-queue.ts file, change the cloudwatch.Alarm period to 1 minute (instead of the default of 5 minutes), by adding a period argument:

this.messagesInQueueAlarm = new cloudwatch.Alarm(this, 'Alarm', {
  // ...
  period: Duration.minutes(1),
});

If you now build and run the test again, Jest will tell you that the template changed:

$ npm run build && npm test

FAIL test/dead-letter-queue.test.js
✕ dlq creates an alarm (58ms)

● dlq creates an alarm

expect(received).toMatchSnapshot()

Snapshot name: `dlq creates an alarm 1`

- Snapshot
+ Received

@@ -19,11 +19,11 @@
               },
             ],
             "EvaluationPeriods": 1,
             "MetricName": "ApproximateNumberOfMessagesVisible",
             "Namespace": "AWS/SQS",
     -       "Period": 300,
     +       "Period": 60,
             "Statistic": "Maximum",
             "Threshold": 1,
           },
           "Type": "AWS::CloudWatch::Alarm",
         },

 › 1 snapshot failed.
Snapshot Summary
 › 1 snapshot failed from 1 test suite. Inspect your code changes or run `npm test -- -u` to update them.

Jest is telling you that the change you just made changed the emitted Period attribute from 300 to 60. You now have the choice of undoing your code change if this result was accidental, or committing to the new snapshot if you intended to make this change. To commit to the new snapshot, run:

npm test -- -u

Jest will tell you that it updated the snapshot. You’ve now locked in the new alarm period:

PASS  test/dead-letter-queue.test.js
 ✓ dlq creates an alarm (51ms)

 › 1 snapshot updated.
Snapshot Summary
 › 1 snapshot updated

Dealing with change

Let’s return to the DeadLetterQueue construct. Messages go to the dead letter queue when something is wrong with the primary queue processor, and you are notified via an alarm. After you fix the problem with the queue processor, you’ll usually want to redrive the messages from the dead letter queue, back to the primary queue, to have them processed as usual.

Messages only exist in a queue for a limited time though. To give yourself the greatest chance of recovering the messages from the dead letter queue, set the lifetime of messages in the dead letter queue (called the retention period) to the maximum time of 2 weeks. You make the following changes to your DeadLetterQueue construct:

export class DeadLetterQueue extends sqs.Queue {
  constructor(parent: Construct, id: string) {
    super(parent, id, {
      // Maximum retention period
      retentionPeriod: Duration.days(14)
    });
    // ...
  }
}

Now run the tests again:

$ npm run build && npm test
FAIL test/dead-letter-queue.test.js
✕ dlq creates an alarm (79ms)

    ● dlq creates an alarm

    expect(received).toMatchSnapshot()

    Snapshot name: `dlq creates an alarm 1`

    - Snapshot
    + Received

    @@ -1,8 +1,11 @@
      Object {
        "Resources": Object 
          "DLQ581697C4": Object {
    +       "Properties": Object {
    +         "MessageRetentionPeriod": 1209600,
    +       },
            "Type": "AWS::SQS::Queue",
         },
         "DLQAlarm008FBE3A": Object {
           "Properties": Object {
             "AlarmDescription": "There are messages in the Dead Letter Queue",

  › 1 snapshot failed.
Snapshot Summary
  › 1 snapshot failed from 1 test suite. Inspect your code changes or run `npm test -- -u` to update them.

The snapshot test broke again, because you added a retention period property. Even though the test was only intended to make sure that the DeadLetterQueue construct created an alarm, it was inadvertently also testing that the queue was created with default options.

Writing fine-grained assertions on resources

Snapshot tests are convenient to write and have their place for detecting accidental change. We use them in the CDK for our integration tests when validating larger bits of functionality all together. If a change causes an integration test’s template to deviate from its snapshot, we use that as a trigger to tell us we need to do extra validation, for example actually deploying the template through AWS CloudFormation and verifying our infrastructure still works.

In the CDK’s extensive suite of unit tests, we don’t want to revisit all the tests any time we make a change. To avoid this, we use the custom assertions in the @aws-cdk/assert/jest module to write fine-grained tests that verify only part of the construct’s behavior at a time, i.e. only the part we’re interested in for that particular test. For example, the test called “dlq creates an alarm” should assert that an alarm gets created with the appropriate metric, and it should not make any assertions on the properties of the queue that gets created as part of that test.

To write this test, have a look at the AWS::CloudWatch::Alarm resource specification in CloudFormation and decide which properties and values you want the assertion library to guarantee. In this case, you’re interested in the properties Namespace, MetricName and Dimensions. You can use the expect(stack).toHaveResource(...) assertion to make sure those have the values you want. To get access to that assertion, you’ll first need to import @aws-cdk/assert/jest, which extends the assertions that are available when you type expect(…). Putting this all together, your test should look like this:

import '@aws-cdk/assert/jest';

// ...
test('dlq creates an alarm', () => {
  const stack = new Stack();

  new dlq.DeadLetterQueue(stack, 'DLQ');

  expect(stack).toHaveResource('AWS::CloudWatch::Alarm', {
    MetricName: "ApproximateNumberOfMessagesVisible",
    Namespace: "AWS/SQS",
    Dimensions: [
      {
        Name: "QueueName",
        Value: { "Fn::GetAtt": [ "DLQ581697C4", "QueueName" ] }
      }
    ],
  });
});

This test asserts that an Alarm is created on the ApproximateNumberOfMessagesVisible metric of the dead letter queue (by means of the { Fn::GetAtt } intrinsic). If you run Jest now, it will warn you about an existing snapshot that your test no longer uses, so get rid of it by running npm test -- -u.

You can now add a second test for the retention period:

test('dlq has maximum retention period', () => {
  const stack = new Stack();

  new dlq.DeadLetterQueue(stack, 'DLQ');

  expect(stack).toHaveResource('AWS::SQS::Queue', {
    MessageRetentionPeriod: 1209600
  });
});

Run the tests to make sure everything passes:

$ npm run build && npm test
 
PASS  test/dead-letter-queue.test.js
  ✓ dlq creates an alarm (48ms)
  ✓ dlq has maximum retention period (15ms)

Test Suites: 1 passed, 1 total
Tests:       2 passed, 2 total

It does!

Validating construct configuration

Maybe you want to make the retention period configurable, while validating that the user-provided value falls into an acceptable range. You’d create a Props interface for the construct and add a check on the allowed values that your construct will accept:

export interface DeadLetterQueueProps {
    /**
     * The amount of days messages will live in the dead letter queue
     *
     * Cannot exceed 14 days.
     *
     * @default 14
     */
    retentionDays?: number;
}

export class DeadLetterQueue extends sqs.Queue {
  public readonly messagesInQueueAlarm: cloudwatch.IAlarm;

  constructor(scope: Construct, id: string, props: DeadLetterQueueProps = {}) {
    if (props.retentionDays !== undefined && props.retentionDays > 14) {
      throw new Error('retentionDays may not exceed 14 days');
    }

    super(scope, id, {
        // Given retention period or maximum
        retentionPeriod: Duration.days(props.retentionDays || 14)
    });
    // ...
  }
}

To test that your new feature actually does what you expect, you’ll write two tests:

  • One that checks that a configured value ends up in the template; and
  • One that supplies an incorrect value to the construct and checks that you get the error you’re expecting.

test('retention period can be configured', () => {
  const stack = new Stack();

  new dlq.DeadLetterQueue(stack, 'DLQ', {
    retentionDays: 7
  });

  expect(stack).toHaveResource('AWS::SQS::Queue', {
    MessageRetentionPeriod: 604800
  });
});

test('configurable retention period cannot exceed 14 days', () => {
  const stack = new Stack();

  expect(() => {
    new dlq.DeadLetterQueue(stack, 'DLQ', {
      retentionDays: 15
    });
  }).toThrowError(/retentionDays may not exceed 14 days/);
});

Run the tests to confirm:

$ npm run build && npm test

PASS  test/dead-letter-queue.test.js
  ✓ dlq creates an alarm (62ms)
  ✓ dlq has maximum retention period (14ms)
  ✓ retention period can be configured (18ms)
  ✓ configurable retention period cannot exceed 14 days (1ms)

Test Suites: 1 passed, 1 total
Tests:       4 passed, 4 total

You’ve confirmed that your feature works, and that you’re correctly validating the user’s input.

As a bonus: you know from your previous tests still passing that you didn’t change any of the behavior when the user does not specify any arguments, which is great news!

Conclusion

You’ve written a reusable construct, and covered its features with resource assertion and validation tests. Regardless of whether you’re planning on writing tests on your own infrastructure application, on your own reusable constructs, or whether you’re planning to contribute to the CDK on GitHub, I hope this blog post has given you some mental tools for thinking about testing your infrastructure code.

Finally, two values I’d like to instill in you when you are writing tests:

  • Treat test code like you would treat application code. Test code is going to have an equally long lifetime in your code base as regular code, and is equally subject to change. Don’t copy/paste setup lines or common assertions all over the place; take some extra time to factor out commonalities into helper functions (see the sketch after this list). Your future self will thank you.
  • Don’t assert too much in one test. Preferably, a test should test one and only one behavior. If you accidentally break that behavior, you would prefer exactly one test to fail, and the test name will tell you exactly what you broke. There’s nothing worse than changing something trivial and having dozens of tests fail and need to be updated because they were accidentally asserting some behavior other than what the test was for. This does mean that—regardless of how convenient they are—you should be using snapshot tests sparingly, as all snapshot tests are going to fail if literally anything about the construct behavior changes, and you’re going to have to go back and scrutinize all failures to make sure nothing accidentally slipped by.
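For example, here is a minimal sketch (reusing the DeadLetterQueue construct and the Jest assertions from this post) of what factoring the shared setup into a helper might look like:

import '@aws-cdk/assert/jest';
import { Stack } from '@aws-cdk/core';

import dlq = require('../lib/dead-letter-queue');

// Shared setup: synthesize a fresh stack containing a DeadLetterQueue
function stackWithDlq(props?: dlq.DeadLetterQueueProps): Stack {
  const stack = new Stack();
  new dlq.DeadLetterQueue(stack, 'DLQ', props);
  return stack;
}

test('dlq has maximum retention period', () => {
  expect(stackWithDlq()).toHaveResource('AWS::SQS::Queue', {
    MessageRetentionPeriod: 1209600
  });
});

Each test now states only the behavior it cares about, and any change to how the construct is instantiated lives in a single place.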

Happy testing!

from AWS Developer Blog https://aws.amazon.com/blogs/developer/testing-infrastructure-with-the-aws-cloud-development-kit-cdk/

Visualizing big data with AWS AppSync, Amazon Athena, and AWS Amplify

Visualizing big data with AWS AppSync, Amazon Athena, and AWS Amplify

This article was written by Brice Pelle, Principal Technical Account Manager, AWS

 

Organizations use big data and analytics to extract actionable information from untapped datasets. It can be difficult for you to build an application with access to this trove of data. You want to build great applications quickly and need access to tools that allow you to interact with the data easily.

Presenting data is just as challenging. Tables of numbers and keywords can fail to convey the intended message and make it difficult to communicate insightful observations. Charts, graphs, and images tend to be better at conveying complex ideas and patterns.

This post demonstrates how to use Amazon Athena, AWS AppSync, and AWS Amplify to build an application that interacts with big data. The application is built using React, the AWS Amplify JavaScript library, and the D3.js JavaScript library to render custom visualizations.

The application code can be found in this GitHub repository. It uses Athena to query data hosted in a public Amazon S3 bucket by the Registry of Open Data on AWS. Specifically, it uses the High Resolution Population Density Maps + Demographic Estimates by CIESIN and Facebook.

This public dataset provides “population data for a selection of countries, allocated to 1 arcsecond blocks and provided in a combination of CSV and Cloud-optimized GeoTIFF files,” and is hosted in the S3 bucket s3://dataforgood-fb-data.

Architecture overview

The Amplify CLI sets up sign-in/sign-up with Amazon Cognito, stands up a GraphQL API for GraphQL operations, and provisions content storage on S3 (the result bucket).

The Amplify CLI Storage Trigger feature provisions an AWS Lambda function (the announcer function) to respond to events in the result bucket. With the CLI, the announcer Lambda function’s permissions are set to allow GraphQL operations on the GraphQL API.

The Amplify CLI supports defining custom resources associated with the GraphQL API using the CustomResources.json AWS CloudFormation template located in the folder amplify/backend/api/YOUR-API-NAME/stacks/ of an Amplify Project. You can use this capability to define via CloudFormation an HTTP data source and AppSync resolvers to interface with Athena, and a None data source and local resolvers to trigger subscriptions in response to mutations from the announcer Lambda function.

Setting up multi-auth on the GraphQL API

AWS AppSync supports multiple modes of authorization that can be used simultaneously to interact with the API. This application’s GraphQL API is configured with the Amazon Cognito User Pool as its default authorization mode.

Users must authenticate with the User Pool before sending GraphQL operations. Upon sign-in, the user receives a JSON Web Token (JWT) that is attached to requests in an authorization header when sending GraphQL operations.

IAM Authorization is another available mode of authorization. The GraphQL API is configured with IAM as an additional authorization mode to recognize and authorize SigV4-signed requests from the announcer Lambda function. The configuration is done using a custom resource backed by a Lambda function. The custom resource is defined in the CloudFormation template with the AppSyncApiId as a property. When deployed, it uses the UpdateGraphqlApi action to add the additional authorization mode to the API:

"MultiAuthGraphQLAPI": {
  "Type": "Custom::MultiAuthGraphQLAPIResource",
  "Properties": {
    "ServiceToken": { "Fn::GetAtt": ["MultiAuthGraphQLAPILambda", "Arn"] },
    "AppSyncApiId": { "Ref": "AppSyncApiId" }
  },
  "DependsOn": "MultiAuthGraphQLAPILambda"
}

The GraphQL schema must specify which types and fields are supported by the authorization modes (with Amazon Cognito User Pool being the default). The schema is configured with the needed authorization directives:

  • @aws_iam to specify if a field or type is IAM authorized.
  • @aws_cognito_user_pools to specify if a field or type is Amazon Cognito User Pool authorized.

The announcer Lambda function needs access to the announceQueryResult mutation and the types included in the response. The AthenaQueryResult type is returned by the startQuery query (called from the app), and by announceQueryResult. The type must support both authorization modes.

type AthenaQueryResult @aws_cognito_user_pools @aws_iam {
    QueryExecutionId: ID!
    file: S3Object
}
type S3Object @aws_iam {
    bucket: String!
    region: String!
    key: String!
}
type Query {
    startQuery(input: QueryInput): AthenaQueryResult
}
type Mutation {
    announceQueryResult(input: AnnounceInput!):
      AthenaQueryResult @aws_iam
}

Setting up a NONE data source (Local Resolver) to enable subscriptions

The announcer Lambda function is triggered in response to S3 events and sends a GraphQL mutation to the GraphQL API. The mutation in turn triggers a subscription and sends the mutation selection set to the subscribed app.

The mutation data does not need to be saved. AWS AppSync only needs to forward the results to the application using the triggered subscription. To enable this, a NONE data source is configured and associated with the local resolver announceQueryResult. NONE data sources and local resolvers are very useful to allow publishing real-time subscriptions without triggering a data source call to modify or update data.

"DataSourceNone": {
  "Type": "AWS::AppSync::DataSource",
  "Properties": {
    "ApiId": { "Ref": "AppSyncApiId" },
    "Name": "None",
    "Description": "None",
    "Type": "NONE"
  }
},
"AnnounceQueryResultResolver": {
  "Type": "AWS::AppSync::Resolver",
  "Properties": {
    "ApiId": {"Ref": "AppSyncApiId"},
    "DataSourceName": { "Fn::GetAtt": ["DataSourceNone", "Name"] },
    "TypeName": "Mutation",
    "FieldName": "announceQueryResult",
  }
}

In the schema, the onAnnouncement subscription is associated with the mutation.

type Mutation {
    announceQueryResult(input: AnnounceInput!):
      AthenaQueryResult @aws_iam
}
type Subscription {
    onAnnouncement(QueryExecutionId: ID!): 
      AthenaQueryResult
        @aws_subscribe(mutations: ["announceQueryResult"])
}

Setting up Athena as a data source

AWS AppSync supports HTTP data sources and can be configured to interact securely with AWS service endpoints.

To configure Athena as a data source, the CustomResources.json template defines the role that AWS AppSync assumes to interact with the API: AppSyncAthenaRole.

The role is assigned the managed policy AmazonAthenaFullAccess. The policy provides read and write permissions to S3 buckets with names starting with aws-athena-query-results-. The application uses this format to name the S3 bucket in which the Athena query results are stored. The role also includes the AmazonS3ReadOnlyAccess policy so that Athena can read from the source data bucket.

The resource DataSourceAthenaAPI defines the data source and specifies IAM as the authorization type along with the service role to be used.

"AppSyncAthenaRole": {
  "Type": "AWS::IAM::Role",
  "Properties": {
    "RoleName": {
      "Fn::Join": [
        "-",
        ["appSyncAthenaRole", { "Ref": "AppSyncApiId" }, { "Ref": "env" }]
      ]
    },
    "ManagedPolicyArns": [
      "arn:aws:iam::aws:policy/AmazonAthenaFullAccess",
      "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
    ],
    "AssumeRolePolicyDocument": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": { "Service": ["appsync.amazonaws.com"] },
          "Action": ["sts:AssumeRole"]
        }
      ]
    }
  }
},
"DataSourceAthenaAPI": {
  "Type": "AWS::AppSync::DataSource",
  "Properties": {
    "ApiId": { "Ref": "AppSyncApiId" },
    "Name": "AthenaAPI",
    "Description": "Athena API",
    "Type": "HTTP",
    "ServiceRoleArn": { "Fn::GetAtt": ["AppSyncAthenaRole", "Arn"] },
    "HttpConfig": {
      "Endpoint": {
        "Fn::Join": [
          ".",
          ["https://athena", { "Ref": "AWS::Region" }, "amazonaws.com/"]
        ]
      },
      "AuthorizationConfig": {
        "AuthorizationType": "AWS_IAM",
        "AwsIamConfig": {
          "SigningRegion": { "Ref": "AWS::Region" },
          "SigningServiceName": "athena"
        }
      }
    }
  }
}

Application Overview

The following walkthrough shows how the application works, and then covers setting up the application, configuring the subscription, and rendering the visualization.

Walk-through

Here is how the application works:

  1. Users sign in to the app using Amazon Cognito User Pools. The JWT access token returned at sign-in is sent in an authorization header to AWS AppSync with every GraphQL operation.
  2. A user selects a country from the drop-down list and chooses Query. This triggers a GraphQL query. When the app receives the QueryExecutionId in the response, it subscribes to mutations on that ID.
  3. AWS AppSync makes a SigV4-signed request to the Athena API with the specified query.
  4. Athena runs the query against the specified table. The query returns the sum of the population at recorded longitudes for the selected country along with a count of latitudes at each longitude.
    SELECT longitude, count(latitude) as count, sum(population) as tot_pop
      FROM "default"."hrsl"
      WHERE country='${countryCode.trim()}'
      group by longitude
      order by longitude
  5. The results of the query are stored in the result S3 bucket, under the /protected/athena/ prefix. Signed-in app users can access these results using their IAM credentials.
  6. Putting the query result file in the bucket generates an S3 event and triggers the announcer Lambda function.
  7. The announcer Lambda function sends an announceQueryResult mutation with the S3 bucket and object information (a sketch of such a function follows this list).
  8. The mutation triggers a subscription with the mutation’s selection set.
  9. The client retrieves the result file from the S3 bucket and displays the custom visualization.
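To make step 7 more concrete, here is a hypothetical sketch of what the announcer Lambda function could look like. It assumes the aws4 and node-fetch packages for SigV4 signing and the HTTP call, an APPSYNC_URL environment variable pointing at the GraphQL endpoint, and an AnnounceInput shape that mirrors the mutation’s selection set – the actual function is included in the sample GitHub repository.

import aws4 from 'aws4' // assumed dependency for SigV4 request signing
import fetch from 'node-fetch' // assumed dependency for the HTTP call

const APPSYNC_URL = process.env.APPSYNC_URL // assumed: https://<id>.appsync-api.<region>.amazonaws.com/graphql

const announceQueryResult = /* GraphQL */ `
  mutation AnnounceQueryResult($input: AnnounceInput!) {
    announceQueryResult(input: $input) {
      QueryExecutionId
      file { bucket region key }
    }
  }
`

export const handler = async event => {
  // S3 event record for the query result file written by Athena
  const record = event.Records[0].s3
  const key = decodeURIComponent(record.object.key.replace(/\+/g, ' '))
  // Athena names the result file <QueryExecutionId>.csv
  const QueryExecutionId = key.split('/').pop().replace('.csv', '')

  const { host, pathname } = new URL(APPSYNC_URL)
  const body = JSON.stringify({
    query: announceQueryResult,
    variables: {
      input: {
        QueryExecutionId,
        file: { bucket: record.bucket.name, region: process.env.AWS_REGION, key }
      }
    }
  })

  // Sign the request with the function's IAM execution-role credentials (SigV4)
  const signed = aws4.sign(
    {
      host,
      path: pathname,
      method: 'POST',
      service: 'appsync',
      region: process.env.AWS_REGION,
      headers: { 'Content-Type': 'application/json' },
      body
    },
    {
      accessKeyId: process.env.AWS_ACCESS_KEY_ID,
      secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
      sessionToken: process.env.AWS_SESSION_TOKEN
    }
  )

  await fetch(APPSYNC_URL, { method: 'POST', headers: signed.headers, body })
}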

Setting up the application

The application is a React app that uses the Amplify JavaScript library to interact with the Amplify-configured backend services. To get started, install the required libraries.

npm install aws-amplify aws-amplify-react

Then, in the main app file, import the necessary dependencies, including the ./aws-exports.js file containing the backend configuration information.

import React, { useEffect, useState, useCallback } from 'react'
import Amplify, { API, graphqlOperation, Storage } from 'aws-amplify'
import { withAuthenticator } from 'aws-amplify-react'
...
import awsconfig from './aws-exports'
...
Amplify.configure(awsconfig)

To get automatic sign-in, sign-up, and confirm functionality in the app, wrap the main component in the withAuthenticator higher-order component (HOC).

export default withAuthenticator(App, true)

Configuring the subscription

When a user chooses Query, the app calls the startQuery callback, which sends a GraphQL query. The response contains a QueryExecutionId, which is stored in the QueryExecutionId state variable.

const [isSending, setIsSending] = useState(false)
const [QueryExecutionId, setQueryExecutionId] = useState(null)

const startQuery = useCallback(async () => {
  if (isSending) return
  setIsSending(true)
  setFileKey(null)
  try {
    const result = await API.graphql(
      graphqlOperation(queries.startQuery, {
        input: { QueryString: sqlQuery(countryCode) }
      })
    )
    console.log(`Setting sub ID: ${result.data.startQuery.QueryExecutionId}`)
    setIsSending(false)
    setQueryExecutionId(result.data.startQuery.QueryExecutionId)
  } catch (error) {
    setIsSending(false)
    console.log('query failed ->', error)
  }
}, [countryCode, isSending])

Setting the state triggers the following useEffect hook, which creates the subscription. Any time QueryExecutionId changes (for example, when it is set back to null), the cleanup function returned by useEffect runs and unsubscribes the existing subscription.

const [countryCode, setCountryCode] = useState('')
const [fileKey, setFileKey] = useState(null)

useEffect(() => {
  if (!QueryExecutionId) return

  console.log(`Starting subscription with sub ID ${QueryExecutionId}`)
  const subscription = API.graphql(
    graphqlOperation(subscriptions.onAnnouncement, { QueryExecutionId })
  ).subscribe({
    next: result => {
      console.log('subscription:', result)
      const data = result.value.data.onAnnouncement
      console.log('subscription data:', data)
      setFileKey(data.file.key)
      setQueryExecutionId(null)
    }
  })

  return () => {
    console.log(`Unsubscribe with sub ID ${QueryExecutionId}`, subscription)
    subscription.unsubscribe()
  }
}, [QueryExecutionId])

Visualization

When triggered, the onAnnouncement subscription returns the following data specified in the mutation selection set. This tells the application where to fetch the result file. Signed-in users can read objects in the result bucket starting with the /protected/ prefix. Because Athena saves the results under the /protected/athena/ prefix, authenticated users can retrieve the result files.

QueryExecutionId
file {
    bucket
    region
    key
}

The key value is passed to the fileKey prop of a Visuals component. The application splits the key to extract the level (protected), the identity (athena), and the object key (*.csv). The Storage.get function generates a presigned URL with the current IAM credentials, which is then used to retrieve the file with the d3.csv function.

The file is a CSV file with rows of longitude, count, and population. A callback maps the values to x and y (the graph coordinates), and a count property. The application uses the D3.js library along with the d3-hexbin plugin to create the visualization. The d3-hexbin plugin groups the data points in hexagonal-shaped bins based on a defined radius.

const [link, setLink] = useState(null)
useEffect(() => {
  const go = async () => {
    const [level, identityId, _key] = fileKey.split('/')
    const link = await Storage.get(_key, { level, identityId })
    setLink(link)

    const data = Object.assign(
      await d3.csv(link, ({ longitude, tot_pop, count }) => ({
        x: parseFloat(longitude),
        y: parseFloat(tot_pop),
        count: parseInt(count)
      })),
      { x: 'Longitude', y: 'Population', title: 'Pop bins by Longitude' }
    )
    drawChart(data)
  }
  go()
}, [fileKey])
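The drawChart function itself ships with the sample application. As a rough idea of how the D3.js library and the d3-hexbin plugin could be used to render the bins, here is a simplified, hypothetical sketch – the #chart element, dimensions, radius, and color scale are illustrative assumptions, not the repository’s exact implementation.

import * as d3 from 'd3'
import { hexbin as d3Hexbin } from 'd3-hexbin'

function drawChart(data) {
  const width = 800
  const height = 400
  const margin = { top: 20, right: 20, bottom: 40, left: 50 }

  // Scales mapping longitude (x) and population (y) to pixel coordinates
  const x = d3.scaleLinear()
    .domain(d3.extent(data, d => d.x))
    .range([margin.left, width - margin.right])
  const y = d3.scaleLinear()
    .domain([0, d3.max(data, d => d.y)])
    .range([height - margin.bottom, margin.top])

  // Group the data points into hexagonal bins of a fixed pixel radius
  const hexbin = d3Hexbin()
    .x(d => x(d.x))
    .y(d => y(d.y))
    .radius(8)
    .extent([[margin.left, margin.top], [width - margin.right, height - margin.bottom]])
  const bins = hexbin(data)

  // Color each bin by the number of data points it contains
  const color = d3.scaleSequential(d3.interpolateViridis)
    .domain([0, d3.max(bins, bin => bin.length)])

  d3.select('#chart') // assumes an <svg id="chart"> element on the page
    .attr('viewBox', `0 0 ${width} ${height}`)
    .selectAll('path')
    .data(bins)
    .join('path')
    .attr('d', hexbin.hexagon())
    .attr('transform', bin => `translate(${bin.x},${bin.y})`)
    .attr('fill', bin => color(bin.length))
}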

Launching the application

Follow these steps to launch the application.

One-click launch


You can deploy the application directly to the Amplify Console from the public GitHub repository. Both the backend infrastructure and the frontend application are built and deployed. After the application is deployed, follow the remaining steps to configure your Athena database.

Clone and launch

Alternatively, you can clone the repository, deploy the backend with Amplify CLI, and build and serve the frontend locally.

First, install the Amplify CLI and step through the configuration.

$ npm install -g @aws-amplify/cli
$ amplify configure

Next, clone the repository and install the dependencies.

$ git clone https://github.com/aws-samples/aws-appsync-visualization-with-athena-app
$ cd aws-appsync-visualization-with-athena-app
$ yarn

Update the name of the storage bucket (bucketName) in the file ./amplify/backend/storage/sQueryResults/parameters.json, then initialize a new Amplify project and push the changes.

$ amplify init
$ amplify push

Finally, launch the application.

$ yarn start

Setting up Athena

The application uses data hosted in S3 by the Registry of Open Data on AWS. Specifically, you use the High Resolution Population Density Maps + Demographic Estimates by CIESIN and Facebook. You can find information on how to set up Athena to query this dataset in the Readme file.

Create a database named `default`.

create database IF NOT EXISTS default;

Create the table in the default database.

CREATE EXTERNAL TABLE IF NOT EXISTS default.hrsl (
  `latitude` double,
  `longitude` double,
  `population` double 
) PARTITIONED BY (
  month string,
  country string,
  type string 
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'serialization.format' = '\t',
  'field.delim' = '\t'
) LOCATION 's3://dataforgood-fb-data/csv/'
TBLPROPERTIES ('has_encrypted_data'='false', 'skip.header.line.count'='1');

Recover the partitions.

MSCK REPAIR TABLE hrsl;

When that completes, you should be able to preview the table and see the type of information shown in the following screenshot.

Next, create a new workgroup. First, look up the name of your S3 content storage bucket. If you deployed using the one-click launch, search for aws_user_files_s3_bucket in the backend build activity log. If you deployed using the “Clone and launch” steps, find aws_user_files_s3_bucket in your aws-exports.js file in the src directory. From the Athena console, choose Workgroup in the upper bar, then choose Create workgroup. Provide the workgroup name: appsync. Set Query result location to s3://YOUR-CONTENT-BUCKET/protected/athena/ (using the bucket name you looked up). Choose Create workgroup.
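If you prefer to script this step, you should also be able to create the workgroup with the AWS CLI, for example (replace YOUR-CONTENT-BUCKET with the bucket name you looked up above):

$ aws athena create-work-group --name appsync \
    --configuration 'ResultConfiguration={OutputLocation=s3://YOUR-CONTENT-BUCKET/protected/athena/}'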

Conclusion

This post demonstrated how to use AWS AppSync to interact with the Amazon Athena API and securely render custom visualizations in your front-end application. By combining these services, you can easily create applications that interact directly with big data stored on S3, and render the data in different ways with graphs and charts.

Along with libraries from D3.js, you can develop new innovative ways to interact with data and display information to users. In addition, you can get started quickly, implement core functionality, and deploy instantly using the AWS Amplify Framework.

from AWS Mobile Blog