Category: What’s new

Say Hello to 41 New AWS Competency, MSP, and Service Delivery Partners Added in September

The AWS Partner Network (APN) is the global partner program for Amazon Web Services (AWS). We enable APN Partners to build, market, and sell their AWS-based offerings, and we help customers identify top APN Partners that can deliver on core business objectives.

To receive APN program designations such as AWS Competency, AWS Managed Services Provider (MSP), and AWS Service Delivery, organizations must undergo rigorous technical validation and assessment of their AWS solutions and practices.

These designations help customers identify and choose specialized APN Partners that can provide value-added services and solutions. Guidance from these skilled professionals can lead to better business and bigger results.

Team Up with AWS Competency Partners

If you want to be successful in today’s complex IT environment, and remain that way tomorrow and into the future, teaming up with an AWS Competency Partner is The Next Smart.

The AWS Competency Program verifies, validates, and vets top APN Partners that have demonstrated customer success and deep specialization in specific solution areas or segments.

These APN Partners were recently awarded AWS Competency designations:

AWS Data & Analytics Competency

AWS DevOps Competency

AWS Government Competency

AWS Life Sciences Competency

AWS Migration Competency

AWS Oracle Competency

AWS Security Competency

AWS Storage Competency

Team Up with AWS Managed Service Providers

The AWS Managed Service Provider (MSP) Partner Program recognizes leading APN Consulting Partners that are highly skilled at providing full lifecycle solutions to customers.

Next-generation AWS MSPs can help enterprises invent tomorrow, solve business problems, and support initiatives by driving key outcomes. AWS MSPs provide the expertise, guidance, and services to help you through each stage of the Cloud Adoption Journey: Plan & Design > Build & Migrate > Run & Operate > Optimize.

Explore 7 reasons why AWS MSPs are fundamental to your cloud journey >>

Meet our newest AWS Managed Service Providers (MSPs):

Team Up with AWS Service Delivery Partners

The AWS Service Delivery Program identifies and endorses top APN Partners with a deep understanding of specific AWS services, such as AWS CloudFormation and Amazon Kinesis.

AWS Service Delivery Partners have proven success delivering AWS services to end customers. To receive this designation, APN Partners must undergo service-specific technical validation by AWS Partner Solutions Architects, and complete a customer case business review.

Introducing our newest AWS Service Delivery Partners:

Amazon API Gateway Partners

Amazon Aurora MySQL-Compatible Edition Partners

Amazon CloudFront Partners

AWS Database Migration Service Partners

AWS Direct Connect Partners

Amazon EC2 for Microsoft Windows Server Partners

AWS Lambda Partners

Amazon RDS Partners

AWS WAF Partners

Want to Differentiate Your Partner Business? APN Navigate Can Help.

If you’re already an APN Partner, enhance your Cloud Adoption Journey by leveraging APN Navigate for a prescriptive path to building a specialized practice on AWS.

APN Navigate tracks offer APN Partners the guidance to become AWS experts and deploy innovative solutions on behalf of end customers. Each track includes foundational and specialized e-learnings, advanced tools and resources, and clear calls to action for both business and technical tracks.

Learn how APN Navigate is a partner’s path to specialization >>

Learn More About the AWS Partner Network (APN)

APN Partners receive business, technical, sales, and marketing resources to help them grow their businesses and better support their customers.

See all the benefits of being an APN Partner >>

Find an APN Partner to Team Up With

APN Partners are focused on your success, helping customers take full advantage of the business benefits AWS has to offer. With their deep expertise on AWS, APN Partners are uniquely positioned to help your company.

Find an APN Partner that meets your needs >>

from AWS Partner Network (APN) Blog

Your Guide to APN Partner Sessions, Workshops, and Chalk Talks at AWS re:Invent 2019

AWS re:Invent 2019 is almost here, and reserved seating is now live!

To reserve seating for re:Invent activities throughout the week, including Global Partner Summit, log into the event catalog using your re:Invent registration credentials. Build out your event schedule by reserving a seat in available sessions.

Reserved seating is for breakout sessions, workshops, chalk talks, builder sessions, hacks, spotlight labs, and other activities. Keynotes, builders fairs, demo theater sessions, and hands-on labs are first come, first served and not included in reserved seating.

Reserve your seat today at AWS re:Invent activities >>

Global Partner Summit Seating

At re:Invent, members of the AWS Partner Network (APN) can learn how to leverage AWS technologies to better serve their customers, and discover how the APN can help them build, market, and sell their AWS offerings. This year, we have 76 sessions dedicated to existing and prospective APN Partners.

You can find all of the partner-related sessions in the re:Invent catalog by selecting “Partner” under the Topics filter on the left side of the page.

There are different types of sessions to fit your company’s needs. Here are some GPS sessions to keep an eye on!

Breakout Sessions

Breakouts are one-hour, lecture-style sessions delivered by AWS experts.

Business Breakouts

  • GPSBUS207 – Build Success with New APN Offerings
    Learn about new APN program launches and announcements made at the Global Partner Summit keynote at re:Invent. These new APN programs are designed to help you demonstrate deep AWS expertise to customers and achieve long-term success as an APN Partner.
  • GPSBUS203 – APN Technology Partner Journey: Winning with AWS for ISVs
    Hear from AWS experts and APN Partners about the steps of the APN Technology Partner journey, from onboarding to building, marketing, and selling. We share with you the markers for success along each path, programs to take advantage of, and how to accelerate your growth.

Technical Breakouts

  • GPSTEC337 – Architecting Multi-Tenant PaaS Offerings with Amazon EKS
    Learn the value proposition of architecting a multi-tenant platform-as-a-service (PaaS) offering on AWS, and the technical considerations for securing, scaling, and automating the provisioning of customer instances within Amazon Elastic Kubernetes Service (Amazon EKS).
  • GPSTEC338 – Building Data Lakes for Your Customers with SAP on AWS
    In this demo-driven session, we show the best practices and reference architectures for extracting data from SAP applications at scale. Get prescriptive guidance on how to design high-performance data extractors using services like AWS Glue and AWS Lambda.


Workshops

Workshops are two-hour, hands-on sessions where you work in teams to solve problems using AWS services. Workshops organize attendees into small groups and provide scenarios to encourage interaction, giving you the opportunity to learn from and teach each other.

  • GPSTEC340 – How to Pass a Technical Baseline Review
    A Technical Baseline Review (TBR) is a prerequisite for achieving APN Advanced Tier status. In this workshop, learn why the review is important for the success of your product, how Partner Solutions Architects evaluate your architecture, and how to get prepared.
  • GPSTEC404 – Build an AI to Play Blackjack
    In this workshop, use computer vision and machine learning to build an AI to play blackjack. Build and train a neural network using Amazon SageMaker, and then train a reinforcement learning agent to make a decision that gives you the best chance to win.

Chalk Talks

Chalk Talks are one-hour, highly interactive sessions with a small audience.

  • GPSTEC204 – Technical Power-Ups for AWS Consulting Partners
    Learn about AWS technical assets that can help you deliver successful cloud projects. Dive deep into AWS Immersion Days and Well-Architected Reviews, and leverage GameDays and Hackathons to propel customers along their cloud adoption journey.
  • GPSTEC303 – Overcoming the Challenges of Being a Next-Generation MSP
    Discuss how the AWS Managed Service (MSP) Partner Program guides and assists organizations in various stages of maturity to overcome the challenges of transitioning from being a traditional MSP to being a next-generation MSP on AWS.

Builder Sessions

Builder Sessions are one-hour, small group sessions with up to six customers and one AWS expert, who is there to help, answer questions, and provide guidance.

  • GPSTEC417-R – [REPEAT] Build a Custom Container with Amazon SageMaker
    Build a custom container that contains a train-completed PyTorch model, and deploy it as an Amazon SageMaker endpoint. A PyTorch/fast-ai model is provided for learning purposes.
  • GPSTEC418-R – [REPEAT] Securing Your .NET Container Secrets
    Many customers moving .NET workloads to the cloud containerize applications for agility and cost savings. In this session, learn how to safely containerize an ASP.NET Core application while leveraging services like AWS Secrets Manager and AWS Fargate.

Learn More About Global Partner Summit

Join us for the Global Partner Summit at re:Invent 2019, which provides APN Partners with opportunities to connect, collaborate, and discover.

Learn how to leverage AWS technologies to serve your customers, and discover how the AWS Partner Network (APN) can help you build, market, and sell your AWS-based business. You’ll have plenty of opportunities to connect with AWS field teams and other APN Partners.

This year, Global Partner Summit sessions will take place throughout the week across the entire re:Invent campus.

Learn more about Global Partner Summit >>

Why Your Company Should Sponsor AWS re:Invent 2019

Is your company joining the 65,000 attendees expected at AWS re:Invent 2019? Enhance your conference experience and drive lead generation through sponsorship—an exclusive opportunity for APN Partners and select AWS enterprise customers.

With plenty of turnkey options still available, it’s not too late to participate in the leading global customer and partner conference for the cloud computing community.

Learn more about sponsorship and get started today >>

from AWS Partner Network (APN) Blog

AWS IoT Things Graph now provides workflow monitoring with AWS CloudWatch

You can now monitor your AWS IoT Things Graph workflows using AWS CloudWatch metrics. You can collect metrics for workflow steps that are executed by AWS IoT Things Graph, including success count, failure count, and total count, and then set alarm thresholds for each of these metrics within AWS CloudWatch. For example, you can set alarms that watch the number of failed flows and send notifications to a downstream application or to an operator.
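As an illustrative sketch only: an alarm on failed flow executions could be configured with boto3 roughly as below. The metric namespace, metric name, and dimension are assumptions (they are not stated in the announcement), so check the Things Graph CloudWatch documentation for the exact values.

```python
def build_flow_failure_alarm(flow_name, threshold=1):
    """Return put_metric_alarm parameters for a hypothetical flow-failure alarm."""
    return {
        "AlarmName": f"{flow_name}-failed-flows",
        "Namespace": "AWS/ThingsGraph",        # assumed namespace
        "MetricName": "FlowExecutionsFailed",  # assumed metric name
        "Dimensions": [{"Name": "FlowTemplateId", "Value": flow_name}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Hypothetical SNS topic for operator notifications
        "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
    }

params = build_flow_failure_alarm("my-flow")
# boto3.client("cloudwatch").put_metric_alarm(**params)  # uncomment to create the alarm
```

The `put_metric_alarm` call is left commented out so the sketch can be read and run without AWS credentials.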

from Recent Announcements

Amazon SNS Now Supports Additional Mobile Push Notification Headers as Message Attributes

Amazon Simple Notification Service (SNS) now supports additional mobile push notification headers from Amazon Device Messaging (ADM), Apple Push Notification service (APNs), Baidu Cloud Push, Firebase Cloud Messaging (FCM), Microsoft Push Notification Service (MPNS), and Windows Push Notification Services (WNS). The additional reserved message attributes provide you with more configuration options when structuring your push notification messages. 
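As a hedged sketch of what this enables: when publishing to a platform endpoint, the reserved push headers are passed as ordinary SNS message attributes. The attribute name `AWS.SNS.MOBILE.APNS.TTL` below follows the reserved-attribute naming convention described in the announcement; verify the exact names and payload shape in the SNS documentation.

```python
import json

def build_push_publish(endpoint_arn, title, body, ttl_seconds="3600"):
    """Return sns.publish parameters carrying an APNs TTL header attribute."""
    message = {
        "default": body,  # SNS requires a default message with MessageStructure=json
        "APNS": json.dumps({"aps": {"alert": {"title": title, "body": body}}}),
    }
    return {
        "TargetArn": endpoint_arn,
        "Message": json.dumps(message),
        "MessageStructure": "json",
        "MessageAttributes": {
            # Reserved attribute name (assumed; check the SNS docs)
            "AWS.SNS.MOBILE.APNS.TTL": {
                "DataType": "String",
                "StringValue": ttl_seconds,
            }
        },
    }

params = build_push_publish(
    "arn:aws:sns:us-east-1:123456789012:endpoint/APNS/my-app/example",  # hypothetical ARN
    "Hello", "Your order shipped")
# boto3.client("sns").publish(**params)  # uncomment to send
```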

from Recent Announcements

Authenticate applications through facial recognition with Amazon Cognito and Amazon Rekognition

With the increased use of applications, social networks, financial platforms, email, and cloud storage solutions, managing different passwords and credentials can become a burden. In many cases, sharing one password across all these applications and platforms is simply not possible, because each may impose different security standards, such as passwords composed of only numeric characters, password renewal policies, and security questions.

But what if you could let users authenticate to your application in a more convenient, simpler, and above all, more secure way? In this post, I show how to leverage Amazon Cognito user pools to customize your authentication flows and allow logging into your applications with facial recognition through Amazon Rekognition, using a sample application.

Solution Overview

We will build a mobile or web application that allows users to sign in with an email address and requires them to upload a document containing their photo. We will use the AWS Amplify Framework to integrate our front-end application with Amazon S3 and store this image in a secure and encrypted bucket. Our solution will trigger a Lambda function for each new image uploaded to this bucket so that we can index the images inside Amazon Rekognition and save the metadata in a DynamoDB table for later queries.

For authentication, this solution uses Amazon Cognito User Pools combined with Lambda functions to customize the authentication flows together with the Amazon Rekognition CompareFaces API to identify the confidence level between user photos provided during Sign Up and Sign In. Here is the architecture of the solution:

Here’s a step-wise description of the above data-flow architecture diagram:

  1. User signs up into the Cognito User Pool.
  2. User uploads – during Sign Up – a document image containing his/her photo and name (e.g., a passport) to an S3 bucket.
  3. A Lambda function is triggered containing the uploaded image as payload.
  4. The function first indexes the image in a specific Amazon Rekognition Collection to store these user documents.
  5. The same function then persists the indexed image metadata in a DynamoDB table, together with the email registered in the Amazon Cognito User Pool, for later queries.
  6. User enters an email in the custom Sign In page, which makes a request to Cognito User Pool.
  7. Amazon Cognito User Pool triggers the “Define Auth Challenge” trigger that determines which custom challenges are to be created at this moment.
  8. The User Pool then invokes the “Create Auth Challenge” trigger. This trigger queries the DynamoDB table for the record matching the given email to retrieve the user’s indexed photo from the Amazon Rekognition Collection.
  9. The User Pool invokes the “Verify Auth Challenge” trigger. This verifies whether the challenge was successfully completed; if it finds an image, it compares it with the photo taken during Sign In and measures the confidence between both images.
  10. The User Pool, once again, invokes the “Define Auth Challenge” trigger, which verifies whether the challenge was answered. If it can verify the user-supplied answer, no further challenges are created. The trigger response back to the User Pool includes an “issueTokens: true” attribute, and the User Pool finally issues the user a JSON Web Token (JWT) (see step 6).

Serverless Application and the different Lambdas invoked

The following solution is available as a Serverless application. You can deploy it directly from AWS Serverless Application Repository. Core parts of this implementation are:

  • Users are required to use a valid email as user name.
  • The solution includes a Cognito App Client configured to “Only allow custom authentication,” since Amazon Cognito requires a password for user sign up. We create a random password for these users, because we never want them to sign in with that password later.
  • We use two Amazon S3 Buckets: one to store document images uploaded during Sign Up and one to store user photos taken when Signing In for face comparisons.
  • We use two different Lambda runtimes (Python and Node.js) to demonstrate how the AWS Serverless Application Model (SAM) handles multiple runtimes in the same project and development environment from the developer’s perspective.

The following Lambda functions are triggered to implement the images indexing in Amazon Rekognition and customize Amazon Cognito User Pools custom authentication challenges:

  1. Create Rekognition Collection (Python 3.6) – This Lambda function gets triggered only once, at the beginning of deployment, to create a Custom Collection in Amazon Rekognition to index documents for user Sign Ups.
  2. Index Images (Python 3.6) – This Lambda function gets triggered for each new document upload to Amazon S3 during Sign Up and indexes the uploaded document in the Amazon Rekognition Collection (mentioned in the previous step) and then persists its metadata into DynamoDB.
  3. Define Auth Challenge (Node.js 8.10) – This Lambda function tracks the custom authentication flow, which is comparable to a decider function in a state machine. It determines which challenges are presented, in what order, to the user. At the end, it reports back to the user pool if the user succeeded or failed authentication. The Lambda function is invoked at the start of the custom authentication flow and also after each completion of the “Verify Auth Challenge Response” trigger.
  4. Create Auth Challenge (Node.js 8.10) – This Lambda function gets invoked, based on the instruction of the “Define Auth Challenge” trigger, to create a unique challenge for the user. We will use this function to query DynamoDB for existing user records and if their given metadata are valid.
  5. Verify Auth Challenge Response (Node.js 8.10) – This Lambda function gets invoked by the user pool when the user provides the answer to the challenge. Its only job is to determine whether that answer is correct. In this case, it compares the images provided during Sign Up and Sign In using the Amazon Rekognition CompareFaces API, and considers an API response with a confidence level of 90% or greater a valid challenge response.

In the sections below, let’s step through the code for the different Lambda functions we described above.

1. Create an Amazon Rekognition Collection

As described above, this function creates a Collection in Amazon Rekognition that will later receive user photos uploaded during Sign Up.

import boto3
import os

rekognition = boto3.client('rekognition')

def handler(event, context):

    # Collection name is provided through an environment variable
    collectionId = os.environ['COLLECTION_NAME']

    # Create a collection
    print('Creating collection: ' + collectionId)
    response = rekognition.create_collection(CollectionId=collectionId)
    print('Collection ARN: ' + response['CollectionArn'])
    print('Status code: ' + str(response['StatusCode']))
    return response

2. Index Images into Amazon Rekognition

This function receives the images uploaded by users during sign up, indexes them in the Amazon Rekognition Collection created by the Lambda function described above, and persists their metadata in an Amazon DynamoDB table.

from __future__ import print_function
import boto3
from decimal import Decimal
import json
import urllib.parse
import os

dynamodb = boto3.client('dynamodb')
s3 = boto3.client('s3')
rekognition = boto3.client('rekognition')

# --------------- Helper Functions ------------------

def index_faces(bucket, key):
    # Index the uploaded photo in the Amazon Rekognition Collection
    response = rekognition.index_faces(
        Image={"S3Object":
               {"Bucket": bucket,
                "Name": key}},
        CollectionId=os.environ['COLLECTION_NAME'])
    return response

def update_index(tableName, faceId, fullName):
    # Persist the indexed face id and the user identifier for later queries
    response = dynamodb.put_item(
        TableName=tableName,
        Item={
            'RekognitionId': {'S': faceId},
            'FullName': {'S': fullName}
        })
    return response

# --------------- Main handler ------------------

def handler(event, context):

    # Get the object from the event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(
        event['Records'][0]['s3']['object']['key'])

    try:
        # Calls Amazon Rekognition IndexFaces API to detect faces in S3 object
        # to index faces into specified collection
        response = index_faces(bucket, key)

        # Commit faceId and full name object metadata to DynamoDB
        if response['ResponseMetadata']['HTTPStatusCode'] == 200:
            faceId = response['FaceRecords'][0]['Face']['FaceId']
            ret = s3.head_object(Bucket=bucket, Key=key)
            email = ret['Metadata']['email']
            update_index(os.environ['COLLECTION_NAME'], faceId, email)
        return response
    except Exception as e:
        print("Error processing object {} from bucket {}.".format(key, bucket))
        raise e

3. Define Auth Challenge Function

This is the decider function that manages the authentication flow. In the session array that’s provided to this Lambda function (event.request.session), the entire state of the authentication flow is present. If it’s empty, it means the custom authentication flow just started. If it has items, the custom authentication flow is underway, i.e. a challenge was presented to the user, the user provided an answer, and it was verified to be right or wrong. In either case, the decider function has to decide what to do next:

exports.handler = async (event, context) => {

    console.log("Define Auth Challenge: " + JSON.stringify(event));

    if (event.request.session &&
        event.request.session.length >= 3 &&
        event.request.session.slice(-1)[0].challengeResult === false) {
        // The user provided a wrong answer 3 times; fail auth
        event.response.issueTokens = false;
        event.response.failAuthentication = true;
    } else if (event.request.session &&
        event.request.session.length &&
        event.request.session.slice(-1)[0].challengeResult === true) {
        // The user provided the right answer; succeed auth
        event.response.issueTokens = true;
        event.response.failAuthentication = false;
    } else {
        // The user did not provide a correct answer yet; present challenge
        event.response.issueTokens = false;
        event.response.failAuthentication = false;
        event.response.challengeName = 'CUSTOM_CHALLENGE';
    }

    return event;
};

4. Create Auth Challenge Function

This function queries DynamoDB for the record containing the given e-mail, retrieves the corresponding image ID from the Amazon Rekognition Collection, and defines the challenge: the user must provide a photo of the same person.

const aws = require('aws-sdk');
const dynamodb = new aws.DynamoDB.DocumentClient();

exports.handler = async (event, context) => {

    console.log("Create auth challenge: " + JSON.stringify(event));

    if (event.request.challengeName == 'CUSTOM_CHALLENGE') {
        event.response.publicChallengeParameters = {};

        let answer = '';
        // Querying for Rekognition ids for the e-mail provided
        // (the user name is the e-mail address in this solution)
        const params = {
            TableName: process.env.COLLECTION_NAME,
            IndexName: "FullName-index",
            ProjectionExpression: "RekognitionId",
            KeyConditionExpression: "FullName = :userId",
            ExpressionAttributeValues: {
                ":userId": event.userName
            }
        };
        try {
            const data = await dynamodb.query(params).promise();
            data.Items.forEach(function (item) {
                answer = item.RekognitionId;
            });

            event.response.publicChallengeParameters.captchaUrl = answer;
            event.response.privateChallengeParameters = {};
            event.response.privateChallengeParameters.answer = answer;
            event.response.challengeMetadata = 'REKOGNITION_CHALLENGE';
            console.log("Create Challenge Output: " + JSON.stringify(event));
            return event;
        } catch (err) {
            console.error("Unable to query. Error:", JSON.stringify(err, null, 2));
            throw err;
        }
    }
    return event;
};

5. Verify Auth Challenge Response Function

This function checks whether Amazon Rekognition can find a face matching the image uploaded during Sign In with a confidence level of 90% or higher, and whether that face belongs to the user identified by the given e-mail address.

var aws = require('aws-sdk');
var rekognition = new aws.Rekognition();

exports.handler = async (event, context) => {

    console.log("Verify Auth Challenge: " + JSON.stringify(event));
    let userPhoto = '';
    event.response.answerCorrect = false;

    // Searching existing faces indexed on Rekognition using the provided photo on S3
    const objectName = event.request.challengeAnswer;
    const params = {
        "CollectionId": process.env.COLLECTION_NAME,
        "Image": {
            "S3Object": {
                "Bucket": process.env.BUCKET_SIGN_UP,
                "Name": objectName
            }
        },
        "MaxFaces": 1,
        "FaceMatchThreshold": 90
    };
    try {
        const data = await rekognition.searchFacesByImage(params).promise();

        // Evaluates if Rekognition was able to find a match with the required
        // confidence threshold
        if (data.FaceMatches[0]) {
            console.log('Face Id: ' + data.FaceMatches[0].Face.FaceId);
            console.log('Similarity: ' + data.FaceMatches[0].Similarity);
            userPhoto = data.FaceMatches[0].Face.FaceId;
            if (userPhoto &&
                event.request.privateChallengeParameters.answer == userPhoto) {
                event.response.answerCorrect = true;
            }
        }
    } catch (err) {
        console.error("Unable to query. Error:", JSON.stringify(err, null, 2));
        throw err;
    }
    return event;
};

The Front End Application

Now that we’ve stepped through all the Lambda functions, let’s create a custom Sign In page to orchestrate and test our scenario. You can use the AWS Amplify Framework to integrate your Sign In page with Amazon Cognito, and the photo uploads with Amazon S3.

The AWS Amplify Framework allows you to implement your application using your favorite framework (React, Angular, Vue, HTML/JavaScript, etc.). You can customize the snippets below as per your requirements. This snippet demonstrates how to import and initialize the AWS Amplify Framework in React:

import Amplify from 'aws-amplify';

Amplify.configure({
  Auth: {
    region: 'your region',
    userPoolId: 'your userPoolId',
    userPoolWebClientId: 'your clientId',
  },
  Storage: {
    region: 'your region',
    bucket: 'your sign up bucket'
  }
});

Signing Up

For users to be able to sign themselves up, as mentioned above, we “generate” a random password on their behalf, since Amazon Cognito requires one for user sign up. However, because our Cognito User Pool Client only allows custom authentication, users can never sign in with a username and password.

import { Auth } from 'aws-amplify';

signUp = async event => {
  const params = {
    username: this.state.email,
    password: getRandomString(30),
    attributes: {
      name: this.state.fullName
    }
  };
  await Auth.signUp(params);
};

function getRandomString(bytes) {
  const randomValues = new Uint8Array(bytes);
  window.crypto.getRandomValues(randomValues);
  return Array.from(randomValues).map(intToHex).join('');
}

function intToHex(nr) {
  return nr.toString(16).padStart(2, '0');
}

Signing in

The following starts the custom authentication flow for the user.

import { Auth } from "aws-amplify";

signIn = async () => {
    try {
        // Sign in with the e-mail only; the custom challenge replaces the password
        const user = await Auth.signIn(this.state.email);
        this.setState({ user });
    } catch (e) {
        console.log('Sign in error: ', e);
    }
};

Answering the Custom Challenge

In this step, we open the camera through the browser to take a user photo and then upload it to Amazon S3, so we can start the face comparison.

import Webcam from "react-webcam";

// Instantiate and set webcam to open and take a screenshot
// when user is presented with a custom challenge

/* Webcam implementation goes here */

// Converts the Data URL captured by the webcam into a File that can be
// uploaded to S3 and sent to Rekognition as the answer for the custom challenge
dataURLtoFile = (dataurl, filename) => {
  var arr = dataurl.split(','), mime = arr[0].match(/:(.*?);/)[1],
      bstr = atob(arr[1]), n = bstr.length, u8arr = new Uint8Array(n);
  while (n--) {
      u8arr[n] = bstr.charCodeAt(n);
  }
  return new File([u8arr], filename, {type: mime});
};

sendChallengeAnswer = async () => {

    // Capture image from user camera (via a ref to the Webcam component)
    // and send it to S3
    const imageSrc = this.webcam.getScreenshot();
    const attachment = await s3UploadPub(dataURLtoFile(imageSrc, "id.png"));

    // Send the answer to the User Pool
    const answer = `public/${attachment}`;
    const user = await Auth.sendCustomChallengeAnswer(cognitoUser, answer);
    this.setState({ user });

    try {
        // This will throw an error if the user is not yet authenticated:
        await Auth.currentSession();
    } catch {
        console.log('Apparently the user did not enter the right code');
    }
};

Conclusion

In this blog post, we implemented a facial recognition authentication mechanism using the custom authentication flows provided by Amazon Cognito combined with Amazon Rekognition. Depending on your organization’s and workload’s security criteria and requirements, this scenario can work well from both security and user experience points of view. You can further strengthen the mechanism by chaining multiple auth challenges based not only on the user photo, but also on liveness detection, the document numbers used during sign up, and additional MFA factors.

Since this is an entirely Serverless-based solution, you can customize it as your requirements arise using AWS Lambda functions. You can read more on custom authentication in our developer guide.


  • All the resources from the implementation mentioned above are available at GitHub. You can clone, change, deploy and run it yourself.
  • You can deploy this solution directly from the AWS Serverless Application Repository.

About the author

Enrico is a Solutions Architect at Amazon Web Services. He works in the Enterprise segment, helping customers from different industries along their cloud journeys. With more than 10 years of experience in solutions architecture, engineering, and DevOps, Enrico has worked directly with many customers designing, implementing, and deploying enterprise solutions.


from AWS Developer Blog

How ironSource built a multi-purpose data lake with Upsolver, Amazon S3, and Amazon Athena

ironSource, in their own words, is the leading in-app monetization and video advertising platform, making free-to-play and free-to-use possible for over 1.5B people around the world. ironSource helps app developers take their apps to the next level, including the industry’s largest in-app video network. Over 80,000 apps use ironSource technologies to grow their businesses.

The massive scale in which ironSource operates across its various monetization platforms—including apps, video, and mediation—leads to millions of end-devices generating massive amounts of streaming data. They need to collect, store, and prepare data to support multiple use cases while minimizing infrastructure and engineering overheads.

This post discusses the following:

  • Why ironSource opted for a data lake architecture based on Amazon S3.
  • How ironSource built the data lake using Upsolver.
  • How to create outputs to analytic services such as Amazon Athena, Amazon ES, and Tableau.
  • The benefits of this solution.

Advantages of a data lake architecture

After working for several years in a database-focused approach, the rapid growth in ironSource’s data made their previous system unviable from a cost and maintenance perspective. Instead, they adopted a data lake architecture, storing raw event data on object storage, and creating customized output streams that power multiple applications and analytic flows.

Why ironSource chose an AWS data lake

A data lake was the right solution for ironSource for the following reasons:

  • Scale – ironSource processes 500K events per second and over 20 billion events daily. The ability to store near-infinite amounts of data in S3 without preprocessing the data is crucial.
  • Flexibility – ironSource uses data to support multiple business processes. Because they need to feed the same data into multiple services to provide for different use cases, the company needed to bypass the rigidity and schema limitations entailed by a database approach. Instead, they store all the original data on S3 and create ad-hoc outputs and transformations as needed.
  • Resilience – Because all historical data is on S3, recovery from failure is easier, and errors further down the pipeline are less likely to affect production environments.

Why ironSource chose Upsolver

Upsolver’s streaming data platform automates the coding-intensive processes associated with building and managing a cloud data lake. Upsolver enables ironSource to support a broad range of data consumers and minimize the time DevOps engineers spend on data plumbing by providing a GUI-based, self-service tool for ingesting data, preparing it for analysis, and outputting structured tables to various query services.

Key benefits include the following:

  • Self-sufficiency for data consumers – As a self-service platform, Upsolver allows BI developers, Ops, and software teams to transform data streams into tabular data without writing code.
  • Improved performance – Because Upsolver stores files in optimized Parquet storage on S3, ironSource benefits from high query performance without manual performance tuning.
  • Elastic scaling – ironSource is in hyper-growth and needs elastic scaling to handle increases in inbound data volume and peaks throughout the week, reprocessing of events from S3, and isolation between different groups that use the data.
  • Data privacy – Because Upsolver is deployed within ironSource’s VPC with no access from outside, there is no risk to sensitive data.

This post shows how ironSource uses Upsolver to build, manage, and orchestrate its data lake with minimal coding and maintenance.

Solution Architecture

The following diagram shows the architecture ironSource uses:

Architecture showing Apache Kafka with an arrow pointing left to Upsolver. Upsolver contains stream ingestion, schemaless data management, and stateful data processing; it has two arrows coming out the bottom, each going to S3, one for raw data, the other for Parquet files. The Upsolver box has an arrow pointing right to a Query Engines box, which contains Athena, Redshift, and Elastic. This box has an arrow pointing right to Use cases, which contains product analytics, campaign performance, and customer dashboards.

Streaming data from Kafka to Upsolver and storing on S3

Apache Kafka streams data from ironSource’s mobile SDK at a rate of up to 500K events per second. Upsolver pulls the data from Kafka and stores it in S3 within a data lake architecture. It keeps a copy of the raw event data, making sure to write each event exactly once, and also stores the same data as Parquet files that are optimized for consumption.

Building the input stream in Upsolver:

Using the Upsolver GUI, ironSource connects directly to the relevant Kafka topics and writes them to S3 exactly once. See the following screenshot.

Image of the Upsolver UI showing the "Data Sources" tab is open to the "Create a Kafka Data Source" page with "Mobile SDK Cluster" highlighted under the "Compute Cluster" section.

After the data is stored in S3, ironSource can proceed to operationalize the data using a wide variety of databases and analytic tools. The next steps cover the most prominent tools.

Output to Athena

To understand production issues, developers and product teams need access to data. These teams can work with the data directly and answer their own questions by using Upsolver and Athena.

Upsolver simplifies and automates the process of preparing data for consumption in Athena, including compaction, compression, partitioning, and creating and managing tables in the AWS Glue Data Catalog. ironSource’s DevOps teams save hundreds of hours on pipeline engineering. Upsolver’s GUI creates each table one time, and from that point onwards, data consumers are entirely self-sufficient. To ensure queries in Athena run fast and at minimal cost, Upsolver also enforces performance-tuning best practices as data is ingested and stored on S3. For more information, see Top 10 Performance Tuning Tips for Amazon Athena.
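Partitioning is one of the biggest levers among those best practices. As a rough illustration (the prefix, file layout, and field names here are hypothetical placeholders, not ironSource's actual layout), a pipeline might derive the S3 key for each Parquet file from the event timestamp so that Athena can prune partitions at query time:

```python
from datetime import datetime, timezone

def partitioned_key(event_time_ms: int, prefix: str = "events") -> str:
    """Build a Hive-style partitioned S3 key from an epoch-millisecond timestamp.

    Athena and the AWS Glue Data Catalog recognize key=value path segments
    as partition columns, so queries filtered on year/month/day only scan
    the matching prefixes instead of the whole bucket.
    """
    ts = datetime.fromtimestamp(event_time_ms / 1000, tz=timezone.utc)
    return (f"{prefix}/year={ts.year:04d}/month={ts.month:02d}/"
            f"day={ts.day:02d}/part-0000.snappy.parquet")

print(partitioned_key(1577923200000))
# → events/year=2020/month=01/day=02/part-0000.snappy.parquet
```

A query with a `WHERE year = 2020 AND month = 1` predicate then reads only the files under those prefixes, which is where much of the cost and latency reduction comes from.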

Athena’s serverless architecture further complements this independence: there is no infrastructure to manage, and analysts don’t need DevOps to provision Amazon Redshift or query clusters for each new question. Instead, analysts can indicate the data they need and get answers.

Sending tables to Athena in Upsolver

In Upsolver, you can declare tables with associated schema using SQL or the built-in GUI. You can expose these tables to Athena through the AWS Glue Data Catalog. Upsolver stores Parquet files in S3 and creates the appropriate table and partition information in the AWS Glue Data Catalog by using Create and Alter DDL statements. You can also edit these tables with Upsolver Output to add, remove, or change columns. Upsolver automates the process of recreating table data on S3 and altering the metadata in the AWS Glue Data Catalog.
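For reference, the table definitions managed this way are ordinary Athena/Glue DDL. A hand-written equivalent might look like the following sketch, which builds such a statement as a string (the schema, table name, and S3 location are hypothetical placeholders, shown only to illustrate what gets registered in the Data Catalog):

```python
def create_table_ddl(table: str, location: str) -> str:
    """Return a CREATE TABLE statement that registers a partitioned
    Parquet dataset with the AWS Glue Data Catalog for use in Athena."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        "  event_id string,\n"
        "  event_type string,\n"
        "  event_time timestamp\n"
        ")\n"
        "PARTITIONED BY (`year` int, `month` int, `day` int)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )

ddl = create_table_ddl("analytics.mobile_sdk_events",
                       "s3://example-datalake/events/")
print(ddl)
```

Altering a column then corresponds to an `ALTER TABLE` statement plus rewriting the affected Parquet data on S3, which is the part that is tedious to do by hand and that Upsolver automates.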

Creating the table

Image of the Upsolver UI showing the "Outputs" tab is open to the "Mobile SDK Data" page.

Sending the table to Amazon Athena

Image of the Upsolver UI showing the "Run Parameters" dialog box is open, having arrived there from the "Mobile SDK Data" page noted in the previous image.

Editing the table option for Outputs

Image on the "Mobile SDK Data" page showing the drop down menu from the 3 dots in the upper left with "Edit" highlighted.

Modifying an existing table in the Upsolver Output

Image showing "Alter Existing Table" with a radio button selected, along with a blurb that states "The changes will affect the existing table from the time specified. Any data already written after that time will be deleted. The previous output will stop once it finishes processing all the data up to the specified time." Below that is a box showing an example date and time. The other option, with its radio button not selected, is "Create New Table" with the blurb "A new table will be created. The existing table and output will not be affected in any way by this operation." The buttons at the bottom are "Next" and "Cancel," with "Next" selected.

Output to BI platforms

ironSource’s BI analysts use Tableau to query and visualize data using SQL. However, performing this type of analysis on streaming data may require extensive ETL and data preparation, which can limit the scope of analysis and create reporting bottlenecks.

ironSource’s cloud data lake architecture enables BI teams to work with big data in Tableau. They use Upsolver to enrich and filter data and write it to Redshift to build reporting dashboards, or send tables to Athena for ad-hoc analytic queries. Tableau connects natively to both Redshift and Athena, so analysts can query the data using regular SQL and familiar tools, rather than relying on manual ETL processes.

Creating a reduced stream for Amazon ES

Engineering teams at ironSource use Amazon ES to monitor and analyze application logs. However, as with any database, storing raw data in Amazon ES is expensive and can lead to production issues.

Because a large part of these logs are duplicates, Upsolver deduplicates the data, which reduces Amazon ES costs and improves performance. By aggregating identical records, Upsolver cuts the size of the data stored in Amazon ES by 70%, making it viable and cost-effective despite the high volume of logs generated.

To do this, Upsolver adds a calculated field to the event stream, which indicates whether a particular log is a duplicate. If so, it filters the log out of the stream that it sends to Amazon ES.
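Conceptually, the calculated field plus filter behaves like the following sketch (simplified and with hypothetical field names; a real streaming deduplicator keeps its state over a bounded time window rather than forever):

```python
def mark_duplicates(events):
    """Yield each event with an added is_duplicate flag, keyed on event id.

    A set is enough for a sketch; a production deduplicator would expire
    keys after a time window to bound memory use.
    """
    seen = set()
    for event in events:
        key = event["event_id"]
        yield {**event, "is_duplicate": key in seen}
        seen.add(key)

def to_elasticsearch(events):
    """Filter the stream sent to Amazon ES down to first occurrences only."""
    return [e for e in mark_duplicates(events) if not e["is_duplicate"]]

logs = [{"event_id": "a", "msg": "boot"},
        {"event_id": "a", "msg": "boot"},
        {"event_id": "b", "msg": "login"}]
print(to_elasticsearch(logs))  # the duplicate 'a' record is dropped
```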

Creating the calculated field

Image showing the Upsolver UI with the "Outputs" tab selected, showing the "Create Calculated Field" page.

Filtering using the calculated field

Upsolver UI showing the "Outputs" tab selected, on the "Create Filter" page.


Self-sufficiency is a big part of ironSource’s development ethos. In revamping its data infrastructure, the company sought to create a self-service environment for dev and BI teams to work with data, without becoming overly reliant on DevOps and data engineering. Data engineers can now focus on features rather than building and maintaining code-driven ETL flows.

ironSource successfully built an agile and versatile architecture with Upsolver and AWS data lake tools. This solution enables data consumers to work independently with data, while significantly improving data freshness, which helps power both the company’s internal decision-making and external reporting.

Some of the results in numbers include:

  • Thousands of engineering hours saved – ironSource’s DevOps and data engineers save thousands of hours that they would otherwise spend on infrastructure by replacing manual, coding-intensive processes with self-service tools and managed infrastructure.
  • Fees reduction – Factoring infrastructure, workforce, and licensing costs, Upsolver significantly reduces ironSource’s total infrastructure costs.
  • 15-minute latency from Kafka to end-user – Data consumers can respond and take action with near real-time data.
  • 9X increase in scale – Currently at 0.5M incoming events/sec and 3.5M outgoing events/sec.

“It’s important for every engineering project to generate tangible value for the business,” says Seva Feldman, Vice President of Research and Development at ironSource Mobile. “We want to minimize the time our engineering teams, including DevOps, spend on infrastructure and maximize the time spent developing features. Upsolver has saved thousands of engineering hours and significantly reduced total cost of ownership, which enables us to invest these resources in continuing our hypergrowth rather than data pipelines.”

The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.


About the Authors

Seva Feldman is Vice President of R&D at ironSource Mobile.
With over two decades of experience in senior architecture, DevOps, and engineering roles, Seva is an expert in turning operational challenges into opportunities for improvement.


Eran Levy is the Director of Marketing at Upsolver.





Roy Hasson is the Global Business Development Lead of Analytics and Data Lakes at AWS. He works with customers around the globe to design solutions to meet their data processing, analytics, and business intelligence needs. Roy is a big Manchester United fan, cheering his team on and hanging out with his family.




from AWS Big Data Blog

PostgreSQL Connection Pooling: Part 1 – Pros & Cons

PostgreSQL Connection Pooling: Part 1 – Pros & Cons


A long time ago, in a galaxy far far away, ‘threads’ were a programming novelty rarely used and seldom trusted. In that environment, the first PostgreSQL developers decided that forking a process for each connection to the database was the safest choice. It would be a shame if your database crashed, after all.

Since then, a lot of water has flowed under that bridge, but the PostgreSQL community has stuck by its original decision. It is difficult to fault their argument, as it’s absolutely true that:

  • Each client having its own process prevents a poorly behaving client from crashing the entire database.
  • On modern Linux systems, the difference in overhead between forking a process and creating a thread is much smaller than it used to be.
  • Moving to a multithreaded architecture would require extensive rewrites.

However, in modern web applications, clients tend to open a lot of connections. Developers are often strongly discouraged from holding a database connection while other operations take place: “open a connection as late as possible, close a connection as soon as possible.” But that causes a problem for PostgreSQL’s architecture – forking a process becomes expensive when transactions are very short, as the common wisdom dictates they should be. In this post, we cover the pros and cons of PostgreSQL connection pooling.

PostgreSQL Architecture DiagramThe PostgreSQL Architecture | Source

The Connection Pool Architecture

Using a modern language library does reduce the problem somewhat – connection pooling is an essential feature of most popular database-access libraries. It ensures ‘closed’ connections are not really closed, but returned to a pool, and ‘opening’ a new connection returns the same ‘physical connection’ back, reducing the actual forking on the PostgreSQL side.
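The idea can be sketched in a few lines. This is a toy pool with a pluggable connection factory, not any particular library's implementation; production code would use something like psycopg2's built-in pool instead:

```python
import queue

class ConnectionPool:
    """A minimal client-side pool: 'closing' a connection returns it to the
    pool, and 'opening' one hands back an existing physical connection, so
    the database only forks a backend process at most max_size times."""

    def __init__(self, factory, max_size=5):
        self._factory = factory            # callable that opens a real connection
        self._pool = queue.Queue(max_size)
        self._created = 0
        self._max = max_size

    def get(self):
        try:
            return self._pool.get_nowait() # reuse an idle connection
        except queue.Empty:
            if self._created < self._max:
                self._created += 1
                return self._factory()     # only 'fork' when the pool is empty
            return self._pool.get()        # otherwise block until one is free

    def put(self, conn):
        self._pool.put(conn)               # 'close' = return to the pool

# With a counting factory you can see reuse in action:
opened = 0
def fake_connect():
    global opened
    opened += 1
    return object()

pool = ConnectionPool(fake_connect, max_size=2)
c1 = pool.get(); pool.put(c1)
c2 = pool.get(); pool.put(c2)  # same physical connection reused
print(opened)  # → 1
```

Two logical open/close cycles cost only one physical connection, which is exactly the saving a pool provides against PostgreSQL's process-per-connection model.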

Visual Representation of a Connection PoolThe architecture of a generic connection-pool

However, modern web applications are rarely monolithic, and often use multiple languages and technologies. Using a connection pool in each module is hardly efficient:

  • Even with a relatively small number of modules, and a small pool size in each, you end up with a lot of server processes. Context-switching between them is costly.
  • The pooling support varies widely between libraries and languages – one badly behaving pool can consume all resources and leave the database inaccessible by other modules.
  • There is no centralized control – you cannot use measures like client-specific access limits.

As a result, popular middlewares have been developed for PostgreSQL. These sit between the database and the clients, sometimes on a separate server (physical or virtual) and sometimes on the same box, and create a pool that clients can connect to. These middlewares:

  • Are optimized for PostgreSQL and its rather unique architecture amongst modern DBMSes.
  • Provide centralized access control for diverse clients.
  • Allow you to reap the same rewards as client-side pools, and then some more (we will discuss these in more detail in our next posts)!

PostgreSQL Connection Pooler Cons

A connection pooler is an almost indispensable part of a production-ready PostgreSQL setup. While there are plenty of well-documented benefits to using a connection pooler, there are some arguments to be made against using one:

  • Introducing middleware into the communication path inevitably adds some latency. However, when the pooler is located on the same host, and factoring in the overhead of forking a connection, this is negligible in practice, as we will see in the next section.
  • A middleware becomes a single point of failure. Using a cluster at this level can resolve this issue, but that introduces added complexity to the architecture.

    Redundant pgBouncer instances to prevent single point of failureRedundancy in middleware to avoid Single-Point-of-Failure | Source

  • A middleware implies extra costs. You either need an extra server (or 3), or your database server(s) must have enough resources to support a connection pooler, in addition to PostgreSQL.
  • Sharing connections between different modules can become a security vulnerability. It is very important that we configure pgPool or PgBouncer to clean connections before they are returned to the pool.
  • The authentication shifts from the DBMS to the connection pooler. This may not always be acceptable.

    PgBouncer Authentication ModelPgBouncer Authentication Model | Source

  • It increases the surface area for attack, unless access to the underlying database is locked down to allow access only via the connection pooler.
  • It creates yet another component that must be maintained, fine-tuned for your workload, security-patched often, and upgraded as required.

Should You Use a PostgreSQL Connection Pooler?

All of these problems are well-discussed in the PostgreSQL community, however, and mitigation strategies ensure that the pros of a connection pooler far exceed the cons. Our tests show that even a small number of clients can significantly benefit from using a connection pooler. They are well worth the added configuration and maintenance effort.

In the next post, we will discuss one of the most popular connection poolers in the PostgreSQL world – PgBouncer, followed by Pgpool-II, and lastly a performance test comparison of these two PostgreSQL connection poolers in our final post of the series.

from High Scalability

How AWS is helping to open source the future of robotics

How AWS is helping to open source the future of robotics

Our robot overlords may not take over anytime soon, but when they do, they’ll likely be running ROS. ROS, or Robot Operating System, was launched over a decade ago to unite developers in building “a collection of tools, libraries, and conventions that aim to simplify the task of creating complex and robust robot behavior across a wide variety of robotic platforms,” as the ROS project page describes. To succeed, ROS has required deep collaboration among a community of roboticist developers, including more recent but significant involvement from AWS engineers like Thomas Moulard and Miaofei Mei.

At AWS, our interest and involvement in the ROS community have evolved and deepened over time, in concert with our customers’ need for a robust, robot-oriented operating system.

Dōmo arigatō, Misutā Robotto

Over the past 10 years, ROS has become the industry’s most popular robot software development framework. According to ABI Research, by 2024 roughly 55% of the world’s robots will include a ROS package. However, getting there has not been and will not be easy.

The ideas behind ROS germinated at Stanford University in the mid 2000s, the brainchild of student researchers Keenan Wyrobek and Eric Berger. Their goal? Help robotics engineers stop reinventing the robot, as it were, given that there was “too much time dedicated to re-implementing the software infrastructure required to build complex robotics algorithms…[and] too little time dedicated to actually building intelligent robotics programs.“ By late 2007/early 2008, they joined efforts with Scott Hassan’s Willow Garage, a robotics incubator, which began funding their robotics work under the auspices of its Personal Robotics Program.

The Robot Operating System (ROS) was born.

Given its academic origins, it’s perhaps not surprising that the original ROS (ROS 1) was mostly inspired by an academic and hobbyist community that grew to tens of thousands of developers. Though there were large industrial automation systems running ROS, ROS 1 didn’t yet yield the industrial-grade foundation that many customers required. Despite thousands of add-on modules that extended ROS 1 in creative ways, it lacked basic security measures and real-time communication, and only ran on Linux. Lacking alternatives, enterprises kept turning to ROS 1 to build large robotics businesses, though evolving ROS 1 to suit their purposes required considerable effort.

By the late 2010s, these industrial-grade demands were starting to strain ROS 1 as the world took ROS well beyond its initial intended scope, as ROS developer Brian Gerkey wrote in Why ROS 2? To push ROS forward into more commercial-grade applications, Open Robotics, the foundation that shepherds ROS development, kicked off ROS 2, a significant upgrade on ROS 1 with multi-platform support, real-time communications, multi-robot communication, small-embedded device capabilities, and more.

Open Robotics opted not to build these capabilities into ROS 1 because, Gerkey notes, “[G]iven the intrusive nature of the changes that would be required to achieve the benefits that we are seeking, there is too much risk associated with changing the current ROS system that is relied upon by so many people.” The move to ROS 2 therefore broke the API upon which the ROS developer community depended, so the ROS community had to start afresh to build out an ecosystem of modules to extend ROS 2, while also improving its stability, bug by bug.

That was when AWS engineers, along with a number of other collaborators, dug in.

While a variety of teams at Amazon had been using ROS for some time, customer interest in ROS motivated a new phase of our ROS engagement. Given the importance of ROS 2 to our customers’ success, the AWS Robotics team went to work on improving ROS. According to AWS engineer Miaofei Mei, an active contributor to ROS 2, this meant that we began in earnest to contribute features, bug fixes, and usability improvements to ROS 2.

Today, the AWS Robotics team actively contributes to ROS 2 and participates on the ROS 2 Technical Steering Committee. With over 250 merged pull requests to the rosdistro and ros2 organization repos, the AWS Robotics team’s contributions run wide and deep, and are not limited to the most recent ROS 2 release (Dashing). In the Dashing release, we have been fortunate to be able to contribute two major features: new concepts for Quality of Service (QoS) callbacks in rclcpp and rclpy, as well as new QoS features for Deadline, Lifespan, and Liveliness policies. In addition, we have made logging improvements (e.g., implementation of rosout and the ability to integrate a third-party logging solution like log4cxx or spdlog), added runtime analysis tools (ROS2 Sanitizer Report and Analysis), and contributed Secure-ROS2 (SROS 2) improvements (like policy generation for securing nodes), among others.

Helping customers, one robot at a time

While AWS actively contributes to ROS 2, we also benefit from others’ contributions. Indeed, the initial impetus to embrace ROS had everything to do with community, recalls AWS RoboMaker product manager Ray Zhu: “ROS has been around for over 10 years, with tens of thousands of developers building packages on it. For us to catch up and build something similar from the ground up would have been years of development work.” Furthermore, AWS customers kept asking AWS to help them with their robotics applications, and would prefer to build on an open industry standard.

To answer this customer request, we launched AWS RoboMaker at re:Invent 2018, with the goal of making it easy to “develop…code inside of a cloud-based development environment, test it in a Gazebo simulation, and then deploy…finished code to a fleet of one or more robots,” as AWS chief evangelist Jeff Barr wrote at RoboMaker’s launch. AWS RoboMaker extends ROS 2 with cloud services, making it easier for developers to extend their robot applications with AWS services such as Amazon Lex or Amazon Kinesis.

For example, AWS RoboMaker customer Robot Care Systems (RCS), which “use[s] robotics to help seniors, people with Parkinson’s disease, and people with disabilities move about more independently,” turned to AWS RoboMaker to extend its Lea robots. The company had ideas it wanted to implement (e.g., the ability to collect and share movement and behavioral data with patients’ physicians), but didn’t know how to connect to the cloud. According to Gabriel Lopes, a control and robotics scientist at RCS, “It was a revelation seeing how easily cloud connectivity could be accomplished with RoboMaker, just by configuring some scripts.”

Additionally, to help accelerate the development of analysis and validation tools in ROS 2 on behalf of AWS customers, the AWS RoboMaker team contracted established ROS community developers like PickNik to assist in porting RQT with enhancements to ROS 2, as detailed by PickNik co-founder Michael Lautman.

Contributing to our robot future

Such open source collaboration is by no means new to AWS, but it’s becoming more pronounced as we seek to improve how we serve customers. Working through Open Robotics means we cannot always move as fast as we (or our customers) would like. By working within the ROS 2 community we ensure that our customers remain firmly rooted in the open source software commons, which enables them to benefit from the combined innovations of many contributors. In the words of AWS engineer (and ROS contributor) Thomas Moulard, “It would be a hard sell for our customers to tell them to trust us and go proprietary with us. Instead, we’re telling them to embrace the biggest robotics middleware open source community. It helps them to trust us now and have confidence in the future.”

With AWS initiatives in the ROS community like the newly-announced ROS 2 Tooling Working Group (WG) led by Moulard and other AWS Robotics team members, AWS hopes to attract more partners, customers, and even competitors to join us on our open source journey. While there remain pockets of Amazon teams using ROS purely as customers, the bulk of our usage now drives a rising number of significant contributions upstream to ROS 2.

Why? Well, as mentioned, our contributing back offers significant benefits for customers. But it’s more than that: it’s also great for AWS employees. As Mei said, “It feels good to be part of a community, and not just a company.”

from AWS Open Source Blog