Maximizing features and functionality in AWS CloudTrail

Thanks to the following AWS CloudTrail experts for their work on this post:

  • Avneesh Singh, Senior Product Manager, AWS CloudTrail
  • Jeff McRae, Software Development Manager, AWS CloudTrail
  • Keith Robertson, Software Development Manager, AWS CloudTrail
  • Susan Ferrell, Senior Technical Writer, AWS

Are you taking advantage of all the features and functionality that AWS CloudTrail offers? Here are some best practices, tips, and tricks for working with CloudTrail to help you get the most out of using it.

CloudTrail is enabled for you when you create your AWS account, and it’s easy to set up a trail for continuous logging and history. This post answers some frequently asked questions about CloudTrail.

What is CloudTrail?

CloudTrail is an AWS service that enables governance, compliance, and operational and risk auditing of your AWS account. Use the information recorded in CloudTrail logs and in the CloudTrail console to review information about actions taken by a user, role, or AWS service. Each action is recorded as an event in CloudTrail, including actions taken in the AWS Management Console and with the AWS CLI, AWS SDKs, and APIs.

How does CloudTrail work across Regions?

Keep AWS Regions in mind when working with CloudTrail. CloudTrail always logs events in the AWS Region where they occur, unless they are global service events.

If you sign in to the console to perform an action, the sign-in event is a global service event. It is logged to any multi-region trail in the US East (N. Virginia) Region, or to a single-region trail in any Region that includes global service events. But if you create a trail that only logs events in US East (Ohio), without global service events, a sign-in event is not logged.

How do I start using CloudTrail?

Create a trail! Although CloudTrail is enabled for you by default, Event history in the CloudTrail console only covers the most recent 90 days of management event activity. Anything that happened before then is no longer available, unless you create a trail to keep an ongoing record of events.

When creating your first trail, we recommend creating one that logs management events for all Regions. Here’s why:

  • Simplicity. A single trail that logs management events in all Regions is easier to maintain over time. For example, if you create a trail named LogsAllManagementEventsInAllRegions, it’s obvious what events that trail logs, isn’t it? No matter how your usage changes or how AWS changes, the scope remains the same. Over time, as new AWS Regions are added, and you work in more than one AWS Region, that trail still does what it says: logs all management events in every AWS Region. You have a complete record of all management events that CloudTrail logs.
  • No surprises. Global service events are included in your logs, along with all other management events. If you create a trail in a single AWS Region, you only log events in that Region—and global service events may not necessarily be logged in that Region.
  • You know what you’re paying. If this is your first trail, and you log all management events in all AWS Regions, it’s free. Then, create additional trails to meet your business needs. For example, you can add a second trail for management events that copies all management events to a separate S3 bucket for your security team to analyze, and you are charged for the second trail. If you add a trail to log data events for Amazon S3 buckets or AWS Lambda functions, even if it’s the first trail capturing data events, you are charged for it, because a trail that captures data events always incurs charges. For more information about CloudTrail costs, see AWS CloudTrail Pricing.
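As a rough illustration of the recommendation above, the following Python sketch builds the request for a single trail that logs events in all Regions and includes global service events. It assumes the boto3 SDK and an S3 bucket that already exists with a bucket policy that lets CloudTrail write to it. The helper names and the bucket name are hypothetical.

```python
def build_trail_request(name, bucket_name):
    """Return keyword arguments for the CloudTrail CreateTrail API."""
    return {
        "Name": name,
        "S3BucketName": bucket_name,         # assumed to exist already
        "IsMultiRegionTrail": True,          # log events from every Region
        "IncludeGlobalServiceEvents": True,  # include sign-in and similar events
    }


def create_and_start_trail(cloudtrail_client, request):
    """Create the trail, then turn on logging for it."""
    cloudtrail_client.create_trail(**request)
    # A trail created through the API does not log until logging is started.
    cloudtrail_client.start_logging(Name=request["Name"])


# Example use (requires boto3 and AWS credentials, so it is left as a comment):
#   import boto3
#   request = build_trail_request(
#       "LogsAllManagementEventsInAllRegions", "example-cloudtrail-bucket")
#   create_and_start_trail(boto3.client("cloudtrail"), request)
```

Keeping the request in a plain dictionary makes it easy to review the trail's scope before anything is created.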

How do I manage costs for CloudTrail?

That’s a common request. Here are some ways to get started:

I created a trail. What should I do next?

Consider two important things: who has access to your log files, and how to get the most out of those log files. Then do the following:

Understanding log files and what’s in them helps you become familiar with your AWS account activity and spot unusual patterns.

Over time, you’ll find there are many log files with a lot of data. CloudTrail makes a significant amount of data available to you. To get the most out of the data collected by CloudTrail, and to make that data actionable, you might want to leverage the query power of Amazon Athena, an interactive, serverless query service that makes it easy for anyone with SQL skills to quickly analyze large-scale datasets. You could also set up Amazon CloudWatch to monitor your logs and notify you when specific activities occur. For more information, see AWS Service Integrations with CloudTrail Logs.
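Before setting up Athena or CloudWatch, it can help to look at a single log file by hand. CloudTrail delivers log files as gzip-compressed JSON with a top-level Records array. The sketch below counts events by source and name. The sample records are made up for illustration, and the function names are hypothetical.

```python
import gzip
import json
from collections import Counter


def load_records(path):
    """Read one CloudTrail log file (gzip-compressed JSON) and return its records."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle).get("Records", [])


def count_events(records):
    """Count records by (eventSource, eventName) to show which calls dominate."""
    return Counter((r.get("eventSource"), r.get("eventName")) for r in records)


# Made-up records for illustration; real records carry many more fields.
SAMPLE_RECORDS = [
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances", "awsRegion": "us-east-2"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances", "awsRegion": "us-east-1"},
    {"eventSource": "signin.amazonaws.com", "eventName": "ConsoleLogin", "awsRegion": "us-east-1"},
]

for (source, name), total in count_events(SAMPLE_RECORDS).most_common():
    print(f"{total:3d}  {source}  {name}")
```

For anything beyond a handful of files, a query service such as Athena is the more practical choice.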

Is there a better way to log events for several AWS accounts instead of creating a trail for each one?

Yes, there is! To manage multiple AWS accounts, you can create an organization in AWS Organizations. Then create an organization trail, which is a single trail configuration that is replicated to all member accounts automatically. It logs events for all accounts in an organization, so you can log and analyze activity for all organization accounts.

Only the master account for an organization can create or modify an organization trail. This makes sure that the organization trail captures all log information as configured for that organization. An organization trail’s configuration cannot be modified, enabled, or disabled by member accounts. For more information, see Creating a Trail for an Organization.

Why can’t I find a specific event that I’m looking for?

While the log files from multi-region trails contain events from all Regions, the events in Event history are specific to the AWS Region where they’re logged.

If you don’t see events that you expect to find, double-check which AWS Region is selected in the console’s Region selector. If necessary, change the setting to the AWS Region where the event occurred.

Also, keep in mind that the console only shows you events that occurred up to 90 days ago in your AWS account. If you’re looking for an older event, you won’t see it. That’s one reason it’s so important to have a trail that logs events to an S3 bucket; that data stays there until you decide not to keep it.
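The same two limits apply if you search Event history programmatically: the lookup is per Region, and it only reaches back 90 days. The following sketch builds a request for the CloudTrail LookupEvents API and rejects time ranges that Event history cannot serve. The function name is hypothetical, and the boto3 usage in the comment is an assumption about your setup.

```python
from datetime import datetime, timedelta, timezone

EVENT_HISTORY_DAYS = 90  # Event history covers only the most recent 90 days


def build_lookup_request(event_name, days_back, now=None):
    """Return keyword arguments for LookupEvents, limited to the 90-day window."""
    if days_back > EVENT_HISTORY_DAYS:
        raise ValueError("Older events are only available in your trail's S3 bucket")
    now = now or datetime.now(timezone.utc)
    return {
        "LookupAttributes": [
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ],
        "StartTime": now - timedelta(days=days_back),
        "EndTime": now,
    }


# Example use (requires boto3 and credentials). Create the client in the Region
# where the event occurred, because Event history is specific to each Region:
#   import boto3
#   client = boto3.client("cloudtrail", region_name="us-east-2")
#   events = client.lookup_events(**build_lookup_request("RunInstances", 7))
```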

What are some best practices for working with CloudTrail?

Be familiar with your CloudTrail logs. A general understanding of your CloudTrail log file data and structure helps you spot and troubleshoot any issues that might arise.

Here are some things to avoid doing under most circumstances:

Avoid creating trails that log events for a single AWS Region

Although CloudTrail supports this, we recommend against creating this kind of trail for several reasons.

Some AWS services appear as “global” (the action can be called locally, but is run in another AWS Region), but they do not log global service events to CloudTrail. A trail that logs events in all AWS Regions shows data about all events logged for your AWS account, regardless of the AWS Region in which they occur.

For example, Organizations is a global service, but it only logs events in the US East (N. Virginia) Region. If you create a trail that only logs events in US East (Ohio), you do not see events for this service in the log files delivered to your S3 bucket.

Also, a trail that logs events in a single AWS Region can be confusing when it comes to cost management. Only the first copy of each management event is free. If you already have a trail that logs events in a single AWS Region and you then create a multi-region trail, you incur costs for the second and any subsequent trails. For more information, see AWS CloudTrail Pricing.

Avoid using the create-subscription and update-subscription commands to create and manage your trails

We recommend that you do not use the create-subscription or update-subscription commands, because these commands are on a deprecation path, and might be removed in a future release of the CLI. Instead, use the create-trail and update-trail commands. If you’re programmatically creating trails, use an AWS CloudFormation template.

What else should I know?

Talk to us! We always want to hear from you! Tell us what you think about CloudTrail, and let us know features that you want to see or content that you’d like to have. You can reach us through the following resources:

from AWS Management Tools Blog

Auto-populate instance details by integrating AWS Config with your ServiceNow CMDB


Many AWS customers either integrate ServiceNow into their existing AWS services or set up both ServiceNow and AWS services for simultaneous use. One challenge in this use case is the need to update your configuration management database (CMDB) whenever a new instance is spun up in AWS.

This post demonstrates how to integrate AWS Config and ServiceNow so that when a new Amazon EC2 instance is created, Amazon SNS sends a notification that creates a server record in the CMDB. You then test your setup by creating an EC2 instance from a sample AWS CloudFormation stack.


Use AWS CloudFormation to provision infrastructure resources from a template automatically, and use AWS Config to monitor these resources. SNS provides topics for pushing messages for these resources. Use AWS Config to provide the information to ServiceNow, enabling it to create a CMDB record automatically.

This is done in five stages:

  1. Configure ServiceNow.
  2. Create an SNS topic and subscription.
  3. Confirm the SNS subscription in ServiceNow.
  4. Create a handler for the subscription in ServiceNow.
  5. Configure AWS Config.

Configure ServiceNow

Use a free ServiceNow developer instance to do the work. If you already have one, feel free to use your own.

  1. Log in to the ServiceNow Developer page, and request a developer instance.
  2. Log in to the developer instance as an administrator. Make sure to remember your login credentials. These are used later when configuring SNS topic subscription URLs.
  3. Navigate to System Applications. Choose Studio, then Import From Source Control.
  4. On the Import Application screen, enter the following URL:
  5. Leave both the User name and Password fields empty, and then choose Import.
  6. Close the Studio browser tab.
  7. Refresh your ServiceNow browser tab and navigate to SNS. Notice in the left pane that there are now three new navigation links.

Note: In the ServiceNow navigation pane, “AWS SNS” refers to the app name, not to Amazon SNS.

Create an SNS topic and subscription

Perform the following procedures to create an SNS topic and subscription:

  1. Log in to the SNS console, and select the US East (N. Virginia) Region.
  2. In the left pane, choose Topics, Create New Topic.
  3. Give the topic a name, make the display name ServiceNow, and choose Create Topic.
  4. Select the Amazon Resource Name (ARN) link for the topic that you just created.
  5. Choose Create Subscription.
  6. Choose HTTPS protocol.
  7. For Endpoint, enter the link to your developer instance, including the administrator user name and the password that you received when you acquired the free ServiceNow developer instance. The endpoint has the following form:
    • https://admin:<ServiceNow admin password>@<your developer instance>
  8. Choose Create Subscription.
    Your new subscription is pending confirmation.
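Because the administrator password is embedded in the endpoint URL, characters such as @ or / in the password can break the URL. A minimal Python sketch for building the endpoint is shown below. It assumes that percent-encoded credentials are accepted by the receiving side; the host name is a placeholder for your developer instance, and the function name is hypothetical.

```python
from urllib.parse import quote


def build_endpoint(instance_host, password, user="admin"):
    """Build an HTTPS endpoint URL with percent-encoded basic-auth credentials."""
    # safe="" makes quote() encode every reserved character, including "/" and "@".
    return f"https://{quote(user, safe='')}:{quote(password, safe='')}@{instance_host}"


# Placeholder host and password, for illustration only.
print(build_endpoint("dev00000.example.com", "p@ss/word"))
```

Treat the resulting URL as a secret, because it contains the administrator password.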

Confirm the SNS subscription in ServiceNow

Before allowing SNS to send messages to ServiceNow, confirm the subscription on ServiceNow. At this point, AWS already sent a handshake request, which is awaiting confirmation inside your ServiceNow instance.

  1. On your ServiceNow browser tab, navigate to SNS, then choose Subscriptions. Notice that AWS created a new record.
  2. Open the subscription by choosing ServiceNow, then choose Confirm Subscription. Stay on this page to create a handler in the next section.

Create a handler for the subscription in ServiceNow

Now, set up ServiceNow to be able to absorb received messages from AWS. Create a handler that’s able to create a new record in the CMDB Server table (cmdb_ci_server) whenever a new EC2 instance is created from a sample AWS CloudFormation stack.

To set up the handler, follow these steps:

  1. At the bottom of the Subscriptions form, for Handler Related, choose New and then provide a name for the handler, such as Create CMDB Server from EC2.
  2. Enter the following code inside the function:
    var webserver = new GlideRecord("cmdb_ci_server");
    webserver.initialize();
    // Assumption: the target of this assignment is the record's "name" field.
    webserver.name = "AWS WebServer " + message.configurationItem.configuration.launchTime;
    webserver.short_description = "Monitoring is " + message.configurationItem.configuration.monitoring.state + " and Instance Type is " + message.configurationItem.configuration.instanceType;
    webserver.asset_tag = message.configurationItem.configuration.instanceId;
    // Save the new record; without insert(), nothing is written to the table.
    webserver.insert();
  3. Choose Submit.

Configure AWS Config

  1. In the AWS Config console, select the US East (N. Virginia) Region.
  2. In the left navigation pane, choose Settings. For Recording, make sure that the value is On.
  3. Under Resources Type to Record, for All Resources, select both check boxes:
    • Record all resources supported in this region
    • Include global resources (including IAM resources)
  4. Choose Choose a topic from your account.
  5. Select the Amazon Resource Name (ARN) link for the topic that you just created.
  6. Choose Save.

Testing the integration

You can test this integration by creating a stack from the AWS CloudFormation sample templates, which triggers recording in AWS Config. This process then creates SNS notifications, which create a configuration item in the ServiceNow CMDB.

  1. In the AWS CloudFormation console, choose Create stack.
  2. Select a sample template.
  3. Under Specify Details, enter the following information:

    Note: The above image shows sample information.

  4. Choose Next.
  5. In the left navigation pane, choose Options, provide tags if needed, and then choose Next.
  6. At the bottom of the review page, choose Create. Wait for the stack creation to complete.
  7. In ServiceNow, navigate to Server to check whether a server record was created.

If you see a new server entry, you successfully integrated AWS Config with the ServiceNow CMDB.
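If no record appears, it can help to check the field mapping offline. The Python sketch below mirrors the handler from earlier in this post: it unwraps the SNS notification body and reads the same AWS Config fields that the handler uses. The sample payload is made up and heavily trimmed, the function names are hypothetical, and the "name" key is an assumption about which CMDB field receives the first value.

```python
import json


def config_message(sns_body):
    """Unwrap an SNS notification: its Message field holds the AWS Config JSON."""
    return json.loads(json.loads(sns_body)["Message"])


def server_fields(message):
    """Map an AWS Config EC2 configuration item to the CMDB fields the handler sets."""
    config = message["configurationItem"]["configuration"]
    return {
        "name": "AWS WebServer " + config["launchTime"],  # assumed field name
        "short_description": (
            "Monitoring is " + config["monitoring"]["state"]
            + " and Instance Type is " + config["instanceType"]
        ),
        "asset_tag": config["instanceId"],
    }


# Made-up, trimmed notification body for illustration.
SAMPLE_BODY = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "configurationItem": {
            "resourceType": "AWS::EC2::Instance",
            "configuration": {
                "instanceId": "i-0123456789abcdef0",
                "instanceType": "t2.micro",
                "launchTime": "2019-06-01T12:00:00.000Z",
                "monitoring": {"state": "disabled"},
            },
        },
    }),
})

print(server_fields(config_message(SAMPLE_BODY)))
```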


This post shows one way to integrate AWS Config with your ServiceNow CMDB. When an instance is created in AWS using AWS CloudFormation, the details are captured as configuration items in the CMDB Server table.

With this process, you can use Handlers in ServiceNow to update the record with instance details. This handler can be customized to provide you with the option to scale this integration. You can get updated instance details as well as additional details that you may want.

You can use this mechanism as a trigger to send notifications and perform actions including discovery, workflow, and more. By making a small change (for example, adding a tag) across a list of resource types, you can use this solution to bypass discovery needs and discover existing resources. This triggers change recording in AWS Config and then creates those resources in the CMDB.

Additionally, see the following posts about the AWS Service Catalog Connector for ServiceNow:

How to install and configure the AWS Service Catalog Connector for ServiceNow

How to enable self-service Amazon WorkSpaces by using AWS Service Catalog Connector for ServiceNow

About the Author

Rahul Goyal is a New York-based Senior Consultant for AWS Professional Services in Global Specialty Practice. He has been working in cloud technologies for more than a decade. Rahul has been leading Operations Integration engagements to help various AWS customers be production ready with their cloud operations. When he is not with a customer, he takes his Panigale to track days for racing in the summer and enjoys skiing in the winter.


Enhancing configuration management at Verizon using AWS Systems Manager

In large enterprise organizations, it’s challenging to maintain standardization across environments. This is especially true if these environments are provisioned in a self-service manner—and even more so when new users access these provisioning services.

In this post, I describe how we at Verizon found a balance between agility, governance, and standardization for our AWS resources. I walk you through one of the solutions that we use to enable new users to provision AWS resources and configure application software. The solution uses ServiceNow and the following AWS services:

  • Systems Manager
  • AWS Service Catalog
  • AWS CloudFormation

Verizon seeks to provide a standardized AWS resource-provisioning service to new users. We needed a solution that incorporates auditing best practices and post-deployment configuration management to any newly provisioned environment. These best practices must work within a fully auditable self-service model and require that:

  • All appropriate resource-provisioning service requests are life-cycle appropriate.
  • The configuration management is defined and automatically applied as needed.

We wanted to provide a better user experience for our new users and help them provision resources in compliance with Verizon’s Governance and Security practices.

Shopping cart experience using AWS Service Catalog and ServiceNow

To meet these requirements, we use AWS Service Catalog to manage all our blueprint AWS CloudFormation templates (after they are cleared through CFN-Nag). We then publish them as products in ServiceNow using the AWS Service Catalog Connector for ServiceNow (for example, EC2 CloudFormation as a product).

End users get a shopping cart-like experience whenever they provision resources in their account. This process helps us keep provisioned resources consistent across all accounts and meet our compliance requirements.

The products or AWS CloudFormation templates are published to AWS Service Catalog using an automated Jenkins pipeline triggered from a Git repository, as shown in the following diagram.



All the products or AWS CloudFormation templates are retrieved from AWS Service Catalog using the AWS Service Catalog Connector for ServiceNow and displayed as products. Users see the following list of compliant products from the Service Portal UI on ServiceNow.



When the user selects a product and provisions it in their account, ServiceNow makes backend calls to Verizon applications to do compliance checks. Then, it makes a call to AWS Service Catalog to provision the product. After the provisioning is successful, the user sees the list of provisioned products. The user can also use the API to provision the product.
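For the API path mentioned above, a provisioning request might be assembled as in the following Python sketch. It targets the AWS Service Catalog ProvisionProduct API through boto3. The product ID, artifact ID, and parameter names are placeholders, the helper name is hypothetical, and this is not a description of Verizon's actual integration.

```python
def build_provision_request(product_id, artifact_id, name, parameters):
    """Return keyword arguments for the Service Catalog ProvisionProduct API."""
    return {
        "ProductId": product_id,
        "ProvisioningArtifactId": artifact_id,  # the product version to launch
        "ProvisionedProductName": name,
        # The API expects template parameters as a list of Key/Value string pairs.
        "ProvisioningParameters": [
            {"Key": key, "Value": str(value)}
            for key, value in sorted(parameters.items())
        ],
    }


# Example use (requires boto3 and credentials; all identifiers are placeholders):
#   import boto3
#   request = build_provision_request(
#       "prod-EXAMPLE", "pa-EXAMPLE", "my-ec2", {"InstanceType": "t2.micro"})
#   boto3.client("servicecatalog").provision_product(**request)
```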



Configuration management using Systems Manager

After the product is provisioned, users need the ability to configure their instances in a secure way using native AWS services. As shown earlier, a user selects the EC2 product and provisions it using AWS Service Catalog. The user now has an EC2 instance on which to configure their application.

At Verizon, we use Ansible for post-provisioning configuration management of EC2 instances. After evaluating several options, we decided that Systems Manager was a perfect fit to use as an AWS native configuration-management solution. We leveraged Systems Manager agents already baked into our AMIs. For example, we use the Systems Manager Run Command with a run-Ansible document to execute Ansible playbooks and a run-shell-script document to run bash commands. For more information, see Running Ansible Playbooks using EC2 Systems Manager Run Command and State Manager.

The flow

In the previous provisioning section, you saw how users provision resources using AWS CloudFormation. ServiceNow maintains information on what types of resources users try to provision. For example, if there’s a product with an EC2 resource, you can enable the Systems Manager Run Command to deploy the EC2 product from the ServiceNow UI, as shown in the following screenshot.



When users select the Systems Manager Run Command, they can include inline shell scripts or an Ansible playbook. They can then submit the script as part of the configuration management, as shown in the following sample script:


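- hosts: local
  tasks:
   - name: Install Nginx
     # "present" replaces the older "installed" alias, which newer Ansible releases reject.
     apt: pkg=nginx state=present update_cache=true
     notify:
      - Start Nginx

  handlers:
   - name: Start Nginx
     service: name=nginx state=started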


ServiceNow stores the information in its database for audit before it makes a Systems Manager API call to run the command on the selected EC2 instance. ServiceNow then fetches the output using the command ID returned by that call and shows it on the UI, as shown in the following screenshot.
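The two Systems Manager calls described above, sending the command and then fetching its output by command ID, might look like the following Python sketch. It uses the AWS-RunShellScript document with inline shell commands and assumes boto3. The helper names and the instance ID are hypothetical, and this is not Verizon's ServiceNow implementation.

```python
def build_run_command(instance_id, script_lines):
    """Return keyword arguments for the Systems Manager SendCommand API."""
    return {
        "InstanceIds": [instance_id],
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": list(script_lines)},
    }


def run_and_fetch(ssm_client, request):
    """Send the command, then fetch its result by command ID."""
    command_id = ssm_client.send_command(**request)["Command"]["CommandId"]
    # Real code should poll here until the invocation reaches a final status;
    # asking immediately can fail because the invocation may not exist yet.
    return ssm_client.get_command_invocation(
        CommandId=command_id, InstanceId=request["InstanceIds"][0]
    )


# Example use (requires boto3 and credentials; the instance ID is a placeholder):
#   import boto3
#   request = build_run_command("i-0123456789abcdef0", ["sudo service nginx status"])
#   result = run_and_fetch(boto3.client("ssm"), request)
#   print(result["Status"], result["StandardOutputContent"])
```

Storing the request dictionary before sending it gives you the same kind of audit record that the post describes.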



We call this a post-provisioning workflow in ServiceNow, because it lets users do configuration actions after the provisioning is successful.

This solution is just one of many ways that Verizon helps users provision Verizon-compliant resources and deploy their applications in the AWS Cloud. We want to empower new cloud users to provision resources faster, with fewer clicks, but also in a secure manner that follows audit and compliance requirements.

About the Author

Krishna Gadiraju (GK) is an architect for the Cloud Governance and Cloud User Experience product teams at Verizon. He actively assists development teams with the migration of on-premises applications to the cloud while ensuring that the Verizon AWS accounts meet all security and other compliance requirements. GK has AWS DevOps Professional and GCP Associate certifications. He is an active presenter at cloud conferences and can be reached at

Creating and hydrating self-service data lakes with AWS Service Catalog

Organizations are evolving IT processes to include data lakes and supporting services. Your organization might start by looking to extend the self-service portals you built using AWS Service Catalog to create data lakes as well. A self-service portal lets users vend required AWS resources within the guardrails defined by your cloud center of excellence (CCOE) team. This removes the heavy lifting from the CCOE team and lets users build their own environments. With AWS Service Catalog, you can also define the constraints on which AWS resources your users can and can’t deploy.

For example, with an appropriately configured self-service portal that supports creation and hydration of data lakes for structured relational data, your users could do the following:

  • Vend an Amazon RDS database that they can launch only in private subnets.
  • Create an Amazon S3 bucket with versioning and encryption enabled.
  • Create an AWS DMS task that can hydrate only the chosen S3 bucket.
  • Launch an AWS Glue crawler that populates an AWS Glue Data Catalog for data from that chosen S3 bucket.

With adequately configured constraints and templates, you can be confident that your users follow the best practices around these services (that is, private subnets, encrypted buckets, and specific security groups).

In this post, I show you how to use AWS Service Catalog to create an IT self-service portal that lets you create a data lake and populate your Data Catalog.

Data lake basics

A data lake is a central repository of your structured and unstructured data that you use to store information in a separate environment from your active or compute space. A data lake enables diverse query capabilities, data science use cases, and discovery of new information models. For more information on data lakes, see Data Lakes and Analytics on AWS. Amazon S3 is an excellent choice for building your data lake on AWS because it offers multiple integrations. For more information, see Amazon S3 as the Data Lake Storage Platform.

Before you use your data lake for analytics and machine learning, you must first hydrate it—fill it with data—and create a Data Catalog containing metadata. AWS DMS works well for hydrating your data lake with structured data from your database. With that in place, you can use AWS Glue to automatically discover and categorize data, making it immediately searchable and queryable across data sources such as Glue ETL, Amazon Athena, Amazon Redshift Spectrum, and Amazon EMR.

In addition to AWS DMS, you can use other AWS services, such as AWS Data Pipeline, to hydrate a data lake from a structured or unstructured database. This post demonstrates a specific solution that uses AWS DMS.

The manual data lake hydration process

The following diagram shows the typical data lake hydration and cataloging process for databases.

  1. Create a database, which various applications populate with data.
  2. Create an S3 bucket to which you can export a copy of the data.
  3. Create a DMS replication task that migrates the data from your database to your S3 bucket. You can also create an ongoing replication task that captures ongoing changes after you complete your initial migration. This process is called ongoing replication or change data capture (CDC).
  4. Run the DMS replication task.
  5. Create an AWS Glue crawler to crawl your S3 bucket and populate your AWS Glue Data Catalog. AWS Glue can crawl RDS too, for populating your Data Catalog; in this example, I focus on a data lake that uses S3 as its primary data source.
  6. Run the crawler.

For more information, see the following resources:

The automated, self-service data lake hydration process

Using AWS Service Catalog, you can set up a self-service portal that lets your end users request components from your data lake, along with tools to hydrate it with data and create a Data Catalog.

The diagram below shows the data lake hydration process using a self-service portal.

The automated hydration process using AWS Service Catalog consists of the following:

  1. An Amazon RDS database product. Because the CCOE team controls the CloudFormation template that enables resource vending, your organization can maintain appropriate security measures. You can also tag subnets for specific teams and configure the catalog so that self-service portal users can only choose from an allowed list of subnets. You can codify RDS database best practices (such as Multi-AZ) in your CloudFormation template and still leave decision points such as the size, engine, and number of read replicas for the database up to the user. With self-service actions, you can further extend the RDS product to enable your users to start, stop, restart, and otherwise manage their own RDS database.
  2. An S3 bucket. By controlling the CloudFormation template that creates the S3 bucket, you can enable encryption at the source, as well as versioning, replication, and tags. Along with the S3 bucket, you can also allow your users to vend service-specific IAM roles configured to grant access to the S3 bucket. Your users can then use these roles for tasks such as:
    1. AWS Glue crawler task
    2. DMS replication task
    3. Amazon SageMaker execution role with access only to this bucket
    4. Amazon Elastic Compute Cloud (Amazon EC2) role for Amazon EMR
  3. A DMS replication task, which copies the data from your database into your S3 bucket. Users can then go to the console and start the replication task to hydrate the data lake at will.
  4. An AWS Glue crawler to populate the AWS Glue Data Catalog with metadata of files read from the S3 bucket.

To allow users from specific teams to vend these resources, you must attach the AWSServiceCatalogEndUserFullAccess managed policy to them. Your users also need IAM permissions to stop or start a crawler and a DMS task.

You can also configure the catalog to use launch constraints. With a launch constraint, AWS Service Catalog assumes the pre-configured IAM role that you specify and executes your CloudFormation template whenever your users launch a specific resource. This lets your users perform specific tasks, such as creating a DMS task or S3 bucket, within the guardrails that you define.

After creating these resources, users can run the DMS task and AWS Glue crawler using the AWS console, finally hydrating and populating the Data Catalog.
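Users who prefer a script to the console could run the same two steps through the SDK. The following Python sketch starts the DMS replication task, waits for it, and then starts the AWS Glue crawler. It assumes boto3-style clients are passed in. The function name, the wait_for_task callback, and the identifiers in the usage comment are hypothetical.

```python
def hydrate_and_catalog(dms_client, glue_client, task_arn, crawler_name, wait_for_task):
    """Start the DMS task, wait for it to finish, then start the Glue crawler."""
    dms_client.start_replication_task(
        ReplicationTaskArn=task_arn,
        StartReplicationTaskType="start-replication",
    )
    # The caller decides how to wait, for example by polling the task status.
    wait_for_task(task_arn)
    # Crawl only after the data has landed in the S3 bucket.
    glue_client.start_crawler(Name=crawler_name)


# Example use (requires boto3 and credentials; names and ARNs are placeholders):
#   import boto3
#   hydrate_and_catalog(
#       boto3.client("dms"), boto3.client("glue"),
#       "arn:aws:dms:us-east-1:111122223333:task:EXAMPLE", "my-crawler",
#       wait_for_task=my_polling_function)
```

Passing the clients in as arguments keeps the function easy to exercise with stand-in objects before it touches real resources.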

You can try the above solution by deploying a sample catalog. The sample catalog solution creates a VPC, subnets, and IAM roles. It sets up a sample catalog with service products such as AWS Glue crawlers, DMS tasks, RDS, S3, and corresponding IAM roles for the AWS Glue crawler and DMS target. It also creates an end user and demonstrates how to allow that user to deploy RDS and DMS tasks using only the subnets created for them. The sample catalog also teaches you to configure launch constraints, so you don’t have to grant additional permissions to users. It contains an S3 product that vends service-specific IAM roles with access restricted to a specific S3 bucket.

Best practices for data lake hydration at scale

Configuring the catalog in this manner standardizes and automates the following best practices, which makes them easier to implement at scale:

  • Grant the least amount of privilege possible. IAM users should have an appropriate level of permissions to only do the task they must do.
  • Create resources such as S3 buckets with appropriate read/write permissions, with encryption and versioning enabled.
  • Use a team-specific key for database and DMS replication tasks, and do not spin up either in public subnets.
  • Give team-specific DMS and AWS Glue roles access to only the S3 bucket created for their individual team.
  • Do not enable users to spin up RDS and DMS resources in VPCs or subnets that do not belong to their teams.
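To make the S3 practice above concrete, the following Python sketch generates a minimal CloudFormation template for a bucket with versioning and default encryption enabled. It is not the template from the sample catalog. The logical ID, the Team tag, and the function name are illustrative assumptions.

```python
import json


def bucket_template(team):
    """Return a CloudFormation template for a versioned, encrypted S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "DataLakeBucket": {  # hypothetical logical ID
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "VersioningConfiguration": {"Status": "Enabled"},
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
                        ]
                    },
                    "Tags": [{"Key": "Team", "Value": team}],
                },
            }
        },
    }


# Print the template as JSON, ready to save as a file for CloudFormation.
print(json.dumps(bucket_template("analytics"), indent=2))
```

Because users never edit the template, every bucket vended through the portal gets the same settings.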

With AWS CloudFormation, you can automate the manual work by writing a CloudFormation template. With AWS Service Catalog, you can make templates available to end users like data curators, who might not know all the AWS services in detail. With the self-service portal, your users would vend AWS resources only using the CloudFormation template that you standardize and into which you implement your security best practices.

With a self-service portal created using AWS Service Catalog, you can automate the process and leave decision points like RDS engine type, size of the database, VPC, and other configurations to your users. This helps you maintain an appropriate level of security, keeping the nuts and bolts of that security automated behind the scenes.

Make sure that you understand how to control the AWS resources you deploy using AWS Service Catalog, as well as the general vocabulary, before you begin. In this post, you populate a self-service portal using a sample RDS database blueprint from AWS Service Catalog reference blueprints.

How to deploy a sample catalog

To deploy the sample catalog solution discussed earlier, follow these steps.


To deploy this solution, you need administrator access to the AWS account.

Step 1: Deploy the AWS CloudFormation template

A CloudFormation template handles most of the heavy lifting of sample catalog setup:

  1. Download the sample CloudFormation template to your computer.
  2. Log in to your AWS account using a user account or role that has administrator access.
  3. In the AWS CloudFormation console, create a new stack in the us-east-1 Region.
  4. In the Choose a template section, choose Choose File, and select the YAML file that you downloaded earlier. Choose Next.
  5. Complete the wizard and choose Create.
  6. When the stack status changes to CREATE_COMPLETE, select the stack and choose Outputs. Note the link in the output (SwitchRoleSCEndUser) for switching to the AWS Service Catalog end-user role.

The templates this post provides are samples and not intended for production use. However, you can review the CloudFormation template to understand the infrastructure it creates.

Step 2: View the catalog

Next, you can view the catalog:

  1. In the console, in the left navigation pane, under Admin, choose Portfolio List.
  2. Choose Analytics Team portfolio.
  3. The sample catalog automatically populates the following for you:
    • An RDS database (MySQL) for vending a database instance
    • An S3 bucket and appropriate roles for vending the S3 bucket and IAM roles for the AWS Glue crawler and the DMS task
    • AWS Glue crawler
    • A DMS task

Step 3: Create an RDS database using AWS Service Catalog

For this post, set up an RDS database. To do so:

  1. Switch to the my_service_catalog_end_user role by opening the link you noted in the output section during Step 1.
  2. Open the AWS Service Catalog console to see the products available for you to launch as an end user.
  3. Choose RDS Database (Mysql).
  4. Choose Launch Product. Specify the name as my-db.
  5. Choose v1.0, and choose Next.
  6. On the Parameters page, specify the following parameters and choose Next.
    • DBVPC: Choose the one with SC-Data-lake-portfolio in its name.
    • DBSecurityGroupName: Specify dbsecgrp.
    • DBSubnets: Choose private subnet 1 and private subnet 2.
    • DBSubnetGroupName: Specify dbsubgrp.
    • DBInputCIDR: Specify CIDR of the VPC. If you did not modify defaults in step 1, then this value is
    • DBMasterUsername: master.
    • DBMasterUserPassword: Specify a password that is at least 12 characters long. The password must include an uppercase letter, a lowercase letter, a special character, and a number. For example, dAtaLakeWorkshop123_.
    • Leave the remaining parameters as they are.
  7. On the Tag options page, choose Next (AWS Service Catalog automatically generates a mandatory tag here).
  8. On the Notifications page, choose Next.
  9. On the Review page, choose Launch.
  10. The status changes to Under Change/In progress. After AWS Service Catalog provisions the database, the status changes to Available/Succeeded. You can see the RDS connection string available in the output.
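Before you launch the product, it can help to check a candidate DBMasterUserPassword against the rules listed in the Parameters step. The following is a minimal sketch, not part of the sample catalog: the function name is hypothetical, and treating any ASCII punctuation character as a "special character" is an assumption (Amazon RDS itself rejects a few characters, which this sketch does not check).

```python
import string


def meets_password_rules(password: str, min_length: int = 12) -> bool:
    """Check the rules stated for DBMasterUserPassword in the steps above."""
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    has_digit = any(c.isdigit() for c in password)
    # Assumption: "special character" means any ASCII punctuation character.
    has_special = any(c in string.punctuation for c in password)
    return (
        len(password) >= min_length
        and has_upper
        and has_lower
        and has_digit
        and has_special
    )


# The example password from the steps above satisfies every rule.
print(meets_password_rules("dAtaLakeWorkshop123_"))
```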

The output contains MasterJDBCConnectionString connection string, which includes the RDS endpoint (the underlined portion in the following example). You can use the same endpoint to connect to the database and create sample data.

Sample output:


This example vends an RDS database, but you can also automate the creation of an Amazon DynamoDB, Amazon Redshift cluster, Amazon Kinesis Data Firehose delivery stream, and other necessary AWS resources.

Step 4: Load sample data into your database (optional)

For security reasons, I provisioned the database in a private subnet. Unless you have VPN or AWS Direct Connect access, you must set up a bastion host to connect to your database. You can provision an Amazon Linux 2-based bastion host in the public subnet and then log in to it. For more information, see Launch an Amazon EC2 instance.

The my_service_catalog_end_user role does not have access to the Amazon EC2 console, so do this step as a different user that has permissions to launch an EC2 instance. After you launch the EC2 instance, connect to it.

Next, execute the following commands to create a simple database and a table with two rows:

  1. Install the MySQL client:

sudo yum install mysql

  2. Connect to the RDS that you provisioned:

mysql -h <RDS_endpoint_name> -P 3306 -u master -p

  3. Create a database:

create database mydb;

  4. Use the newly created database:

use mydb;

  5. Create a table called client_balance and populate it with two rows:

CREATE TABLE `client_balance` ( `client_id` varchar(36) NOT NULL, `balance_amount` float NOT NULL DEFAULT '0', PRIMARY KEY (`client_id`) );
INSERT INTO `client_balance` VALUES ('123',0),('124',1.0);
SELECT * FROM client_balance;
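If you want to preview what the table looks like before connecting to RDS, you can try the same schema locally. This sketch swaps in Python's built-in sqlite3 module in place of MySQL, so it only illustrates the table shape and the two sample rows; it does not touch the RDS database.

```python
import sqlite3

# In-memory SQLite database (a stand-in for MySQL); nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE client_balance (
        client_id VARCHAR(36) NOT NULL,
        balance_amount FLOAT NOT NULL DEFAULT 0,
        PRIMARY KEY (client_id)
    )
    """
)
# The same two rows that the MySQL INSERT statement adds.
conn.executemany(
    "INSERT INTO client_balance VALUES (?, ?)",
    [("123", 0), ("124", 1.0)],
)
rows = conn.execute(
    "SELECT client_id, balance_amount FROM client_balance ORDER BY client_id"
).fetchall()
for row in rows:
    print(row)
conn.close()
```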

Step 5: Create an S3 bucket using AWS Service Catalog

Switch to the my_service_catalog_end_user role and follow the process outlined in Step 3 to provision a product from the S3 bucket and appropriate roles. If you are using a new AWS account that does not have dms-vpc-role and dms-cloudwatch-logs-role IAM roles, you can select N as parameters; otherwise, you can leave default values for parameters.

After you provision the product, you can see the output and find details of the S3 bucket and DMS/AWS Glue roles that can access the newly created bucket. Make a note of the output, as you need the S3 bucket information and IAM roles in subsequent steps.

Step 6: Launch a DMS task

Follow the same process as in Step 3 to provision the DMS Task product. When you provision the DMS task, on the Parameters page:

  • Specify the S3 bucket from the output that you noted earlier.
  • Specify servername as the server endpoint of your RDS database (for example,
  • Specify data as bucketfolder.
  • Specify mydb as the database.
  • Specify S3TargetDMSRole from the output you noted earlier.
  • Specify Private subnet 1 as DBSubnet1.
  • Specify Private subnet 2 as DBSubnet2.
  • Specify the dbUsername and dbPassword you noted after creating the RDS database.

Next, open the Tasks section of the DMS console and locate the newly created task; its status should read Ready. Select the task and choose Restart/Resume. This starts your DMS replication task, which hydrates the S3 bucket you specified earlier with an extract of the chosen database.

I granted the my_service_catalog_end_user IAM role two additional permissions, dms:StartReplicationTask and dms:StopReplicationTask, to allow users to start and stop DMS tasks. This shows how you can grant minimal permissions outside AWS Service Catalog to enable your users to perform tasks.
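As an illustration of what such a minimal grant might look like, the following sketch builds an IAM policy document containing only the two DMS actions named above. The function name is hypothetical, and the "*" resource is a placeholder assumption; the policy in the sample template may be scoped differently.

```python
import json


def build_dms_task_control_policy(task_arn: str = "*") -> dict:
    """Build an IAM policy document that allows starting and stopping DMS tasks."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "dms:StartReplicationTask",
                    "dms:StopReplicationTask",
                ],
                # "*" is a placeholder; scope this to specific task ARNs where possible.
                "Resource": task_arn,
            }
        ],
    }


policy = build_dms_task_control_policy()
print(json.dumps(policy, indent=2))
```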

After the task completes, its status changes to Load Complete and the S3 bucket you created earlier now contains files filled with data from your database.

Step 7: Launch an AWS Glue crawler task

Now that you have hydrated your data lake with sample data, you can run AWS Glue crawler to populate your AWS Glue Data Catalog. To do so, follow the process outlined in Step 3 and provision a product from AWS Glue crawler. On the Parameters page, specify the following parameters:

  • For S3Path, specify the complete path to your S3 bucket, for example: s3://<your_bucket_name>/data
  • Specify IAMRoleARN as the value of GlueCrawlerIAMRoleARN from the output you noted at the end of Step 5.
  • Specify the DatabaseName as mydb.

Next, open the Crawlers section of the AWS Glue console to locate the crawler created by AWS Service Catalog. The crawler’s status should read Ready. Select the crawler and then choose Run crawler. When the crawler finishes, you can review the data that populated your database from the database console.

As shown in the following diagram, you can select the mydb database and view its tables to explore the AWS Glue Data Catalog populated by the AWS Glue crawler.

The principles discussed in this post can be extended to logs, streams, and files. You can use various BI tools to extract useful knowledge from your data lake. For information about how to visualize your data, see Harmonize, Query, and Visualize Data from Various Providers using AWS Glue, Amazon Athena, and Amazon QuickSight. You can query your data lake from Amazon SageMaker notebooks. For more information, see Access Amazon S3 data managed by AWS Glue Data Catalog from Amazon SageMaker notebooks.


AWS Service Catalog enables you to build and distribute catalogs of IT services to your organization. In this post, I demonstrated how you can set up a catalog that lets your users vend tools to support creation and hydration of data lakes and maintain tight security standards. You can extend this idea of self-service by supporting resources such as DynamoDB databases, Kinesis Data Firehose delivery streams, Amazon Redshift clusters, and Amazon SageMaker notebooks, granting your users more flexibility and utility in their data lakes within guardrails you define.

If you have questions about implementing the solution described in this post, you can start a new thread on the AWS Service Catalog Forum or contact AWS Support.

About the Author

Kanchan Waikar is a Senior Solutions Architect at Amazon Web Services. She enjoys helping customers build architectures using AWS Marketplace for machine learning, AWS Service Catalog, and other AWS services.

from AWS Management Tools Blog

Analyzing Amazon VPC Flow Log data with support for Amazon S3 as a destination


In a world of highly distributed applications and increasingly bespoke architectures, data monitoring tools help DevOps engineers stay abreast of ongoing system problems. This post focuses on one such feature: Amazon VPC Flow Logs.
In this post, I explain how you can deliver flow log data to Amazon S3 and then use Amazon Athena to execute SQL queries on the data. This post also shows you how to visualize the logs in near real-time using Amazon QuickSight. All these steps together create useful metrics to help synthesize and analyze the terabytes of flow log data in a single, approachable view.
Before I start explaining the solution in detail, I review some basic concepts about flow logs and Amazon CloudWatch Logs.

What are flow logs, and why are they important?
Flow logs enable you to track and analyze the IP traffic going to and from network interfaces in your VPC. For example, if you have a content delivery platform, you can use flow logs to profile, analyze, and predict customer content-access patterns, and to track down top talkers and malicious calls.

Some of the benefits of flow logs include:

  • You can publish flow log data to CloudWatch Logs and S3, and query or analyze it from either platform.
  • You can troubleshoot why specific traffic is not reaching an instance, which helps you diagnose overly restrictive security group rules.
  • You can use flow logs as an input to security tools to monitor the traffic reaching your instance.
  • For applications that run in multiple AWS Regions or use multi-account architecture, you can analyze and identify the account and Region where you receive more traffic.
  • You can predict seasonal peaks based on historical data of incoming traffic.
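To make the idea of "top talkers" concrete, here is a minimal sketch that totals bytes per source address from a few flow log records. The sample records are made-up values in the default space-separated record format, and the function name is hypothetical.

```python
from collections import Counter

# Made-up sample records. Field order in the default format:
# version account-id interface-id srcaddr dstaddr srcport dstport
# protocol packets bytes start end action log-status
SAMPLE_RECORDS = [
    "2 123456789010 eni-0a1b2c3d 10.0.0.5 10.0.1.9 49152 443 6 10 8400 1560000000 1560000060 ACCEPT OK",
    "2 123456789010 eni-0a1b2c3d 10.0.0.7 10.0.1.9 49153 443 6 4 1200 1560000000 1560000060 ACCEPT OK",
    "2 123456789010 eni-0a1b2c3d 10.0.0.5 10.0.1.9 49154 22 6 2 300 1560000060 1560000120 REJECT OK",
]


def top_talkers(records, n=3):
    """Return the n source addresses that sent the most bytes, largest first."""
    bytes_by_source = Counter()
    for line in records:
        fields = line.split()
        source_address, num_bytes = fields[3], fields[9]
        if num_bytes == "-":  # records without traffic data carry "-" placeholders
            continue
        bytes_by_source[source_address] += int(num_bytes)
    return bytes_by_source.most_common(n)


for address, total in top_talkers(SAMPLE_RECORDS):
    print(address, total)
```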

Using CloudWatch to analyze flow logs
AWS originally introduced VPC Flow Logs to publish data to CloudWatch Logs, a monitoring and observability service for developers, system operators, site reliability engineers, and IT managers. CloudWatch integrates with more than 70 log-generating AWS services, such as Amazon VPC, AWS Lambda, and Amazon Route 53, providing you with a single place to monitor all your AWS resources, applications, and services that run on AWS and on-premises servers.

Flow log data is published to a log group in CloudWatch Logs, with each network interface generating a unique log stream in the log group. Log streams contain flow log records. You can create multiple flow logs that publish data to the same log group. For example, you can use cross-account log data sharing with subscriptions to send multiple flow logs from different accounts in your organization to the same log group. This lets you audit accounts for real-time intrusion detection.

You can also use CloudWatch to get access to a real-time feed of flow logs events from CloudWatch Logs. You can then deliver the feed to other services such as Amazon Kinesis, Kinesis Data Firehose, or AWS Lambda for custom processing, transformations, analysis, or loading to other systems.

Publishing to S3 as a new destination
With the recent launch of a new feature, flow logs can now be directly delivered to S3 using the AWS CLI or through the Amazon EC2 or VPC consoles. You can now deliver flow logs to both S3 and CloudWatch Logs.

CloudWatch is a good tool for system operators and SREs to capture and monitor the flow log data. But you might want to store copies of your flow logs for compliance and audit purposes, which requires less frequent access and viewing. By storing your flow log data directly into S3, you can build a data lake for all your logs.

From this data lake, you can integrate the flow log data with other stored data, for example, joining flow logs with Apache web logs for analytics. You can also take advantage of the different storage classes of S3, such as Amazon S3 Standard-Infrequent Access, or write custom data processing applications.

Solution overview
The following diagram shows a simple architecture to send the flow log data directly to an S3 bucket. It also creates tables in Athena for an ad hoc query, and finally connects the Athena tables with Amazon QuickSight to create an interactive dashboard for easy visualization.

The following steps show how you can deploy this architecture in minutes using AWS services, moving flow log data to S3 and analyzing it with Amazon QuickSight.

1. Create IAM policies to generate and store flow logs in an S3 bucket.
2. Enable the new flow log feature to send the data to S3.
3. Create an Athena table and add a date-based partition.
4. Create an interactive dashboard with Amazon QuickSight.

Step 1: Create IAM policies to generate and store flow logs in an S3 bucket
Create and attach the appropriate IAM policies. The IAM role associated with your flow log must have permissions to publish flow logs to the S3 bucket. For more information about implementing the required IAM policies, see the documentation on Publishing Flow Logs to Amazon S3.

Step 2: Enable the new flow log feature to send the data to S3
You can create the flow log from the AWS Management Console, or using the AWS CLI.

To create the flow log from the Console:

1. In the VPC console, select the specific VPC for which to generate flow logs.

2. Choose Flow Logs, Create flow log.

3. For Filter, choose the option based on your needs. For Destination, select Send to an S3 bucket. For S3 bucket ARN, provide the ARN of your destination bucket.

To create the flow log from the CLI:

1. Use the following example command to create the flow log. The command returns the result in JSON format:

aws ec2 create-flow-logs --resource-type VPC --resource-ids <your VPC id> --traffic-type <ACCEPT/REJECT/ALL> --log-destination-type s3 --log-destination <Your S3 ARN> --deliver-logs-permission-arn <ARN of the IAM Role>
{
    "ClientToken": "gUk0TEGdf2tFF4ddadVjWoOozDzxxxxxxxxxxxxxxxxx=",
    "FlowLogIds": [
        "fl-xxxxxxx"
    ],
    "Unsuccessful": []
}

2. Check the status and description of the flow log by running the following command with a filter and providing the flow log ID that you received during creation:

aws ec2 describe-flow-logs --filter "Name=flow-log-id,Values=fl-xxxxxxx"
{
    "FlowLogs": [
        {
            "CreationTime": "2018-08-15T05:30:15.922Z",
            "DeliverLogsPermissionArn": "arn:aws:iam::acctid:role/rolename",
            "DeliverLogsStatus": "SUCCESS",
            "FlowLogId": "fl-xxxxxxx",
            "FlowLogStatus": "ACTIVE",
            "ResourceId": "vpc-xxxxxxxx",
            "TrafficType": "REJECT",
            "LogDestinationType": "s3",
            "LogDestination": "arn:aws:s3:::aws-flowlog-s3"
        }
    ]
}

3. You can check the S3 bucket and ensure that your flow logs output correctly with the following specific structure:


Step 3: Create an Athena table and add a date-based partition
In the Athena console, create a table on your flow log data.

Use the following DDL to create a table in Athena:

CREATE EXTERNAL TABLE IF NOT EXISTS vpc_flow_logs (
  version int,
  account string,
  interfaceid string,
  sourceaddress string,
  destinationaddress string,
  sourceport int,
  destinationport int,
  protocol int,
  numpackets int,
  numbytes bigint,
  starttime int,
  endtime int,
  action string,
  logstatus string
)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '
LOCATION 's3://<your bucket location with object keys>/'
TBLPROPERTIES ("skip.header.line.count"="1");
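The columns in this DDL follow the order of the space-separated fields in a default-format flow log record. As a rough illustration of that mapping, the following sketch parses one record into the same column names. The sample line is illustrative, the class and function names are hypothetical, and the sketch assumes a record that carries traffic data (records without data use "-" placeholders, which this sketch rejects with a ValueError).

```python
from typing import NamedTuple


class FlowLogRecord(NamedTuple):
    """One flow log record, using the column names from the Athena DDL."""
    version: int
    account: str
    interfaceid: str
    sourceaddress: str
    destinationaddress: str
    sourceport: int
    destinationport: int
    protocol: int
    numpackets: int
    numbytes: int
    starttime: int
    endtime: int
    action: str
    logstatus: str


# Columns declared as int or bigint in the DDL.
INT_FIELDS = {
    "version", "sourceport", "destinationport", "protocol",
    "numpackets", "numbytes", "starttime", "endtime",
}


def parse_flow_log_line(line: str) -> FlowLogRecord:
    """Split a space-delimited record and convert the numeric columns."""
    fields = line.strip().split(" ")
    if len(fields) != len(FlowLogRecord._fields):
        raise ValueError(f"expected {len(FlowLogRecord._fields)} fields, got {len(fields)}")
    values = [
        int(raw) if name in INT_FIELDS else raw
        for name, raw in zip(FlowLogRecord._fields, fields)
    ]
    return FlowLogRecord(*values)


record = parse_flow_log_line(
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
print(record.sourceaddress, record.destinationport, record.numbytes, record.action)
```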

After creating the table, you can partition it based on the ingestion date. Doing this helps speed queries of the flow log data for specific dates.

Be aware that the folder structure created by a flow log is different from the Hive partitioning format. You can manually add partitions and map them to portions of the keyspace using ALTER TABLE ADD PARTITION. Create multiple partitions based on the ingestion date.

Here is an example with a partition for ingestion date 2019-05-01:

ALTER TABLE vpc_flow_logs ADD PARTITION (dt = '2019-05-01') location 's3://aws-flowlog-s3/AWSLogs/<account id>/vpcflowlogs/<aws region>/2019/05/01';
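Because each ingestion date needs its own partition, writing these statements by hand gets tedious for longer date ranges. The following sketch generates one ALTER TABLE statement per day in the same shape as the example above. The function name is hypothetical, and the account ID and Region in the usage lines are placeholder values.

```python
from datetime import date, timedelta


def add_partition_statements(bucket, account_id, region, start, end,
                             table="vpc_flow_logs"):
    """Return one ALTER TABLE ... ADD PARTITION statement per day, inclusive."""
    statements = []
    day = start
    while day <= end:
        # Flow logs are delivered under .../<region>/YYYY/MM/DD.
        location = (
            f"s3://{bucket}/AWSLogs/{account_id}/vpcflowlogs/{region}/"
            f"{day:%Y/%m/%d}"
        )
        statements.append(
            f"ALTER TABLE {table} ADD PARTITION (dt = '{day.isoformat()}') "
            f"location '{location}';"
        )
        day += timedelta(days=1)
    return statements


# Placeholder account ID and Region, for illustration only.
statements = add_partition_statements(
    "aws-flowlog-s3", "123456789010", "us-east-1",
    date(2019, 5, 1), date(2019, 5, 3),
)
for statement in statements:
    print(statement)
```

You can then run the generated statements in the Athena console one at a time.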

Step 4: Create an interactive dashboard with Amazon QuickSight
Now that your data is available in Athena, you can quickly create an Amazon QuickSight dashboard to visualize the log in near real time.

First, go to Amazon QuickSight and choose New Analysis, New datasets, Athena. For Data Source Name, enter a name for your new data source.

Next, for Database: contain sets of tables, choose the database that contains your new table. Under Tables: contain the data you can visualize, select the table to monitor.

You can start creating dashboards based on the metrics to monitor.

In the past, to store flow log data cost-effectively, you had to use a solution involving Lambda, Kinesis Data Firehose, or other sophisticated processes to deliver the logs to S3. In this post, I demonstrated the speed and ease of importing flow logs to S3 using recent VPC updates and Athena to satisfy your analytics needs. For more information about controlling and monitoring your flow logs, see the documentation on working with flow logs.

If you have comments or feedback, please leave them below, or reach out on Twitter!


Amazon CloudWatch
Documentation on CloudWatch Logs
Amazon VPC Flow Logs can now be delivered to S3

About the Author

Avijit Goswami is a Sr. Solutions Architect helping AWS customers build their infrastructure and applications on the cloud, conforming to AWS Well-Architected methodologies including Operational Excellence, Security, Reliability, Performance, and Cost Optimization. When not at work, Avijit likes to travel, watch sports, and listen to music.




Automating life-cycle management for ephemeral resources using AWS Service Catalog


Enterprises deploy AWS resources and services daily to support different business objectives.

For example:

  • A data scientist might like to create an EMR cluster for a job that should not take longer than one week.
  • A sales engineer needs a demo environment for two days.
  • A marketing application owner wants a marketing application to run for nine weeks.
  • A QA engineer would like to run a QA task for five days.

All of these tasks are time-bound, and end users would like to manage these resources and remove them when they are no longer needed. To do this, they need a process that automatically terminates unneeded resources.

In this post, you learn how to enable end users to pick expiration times for their deployed resources on AWS. Additionally, you learn how to set a hard time limit of 30 days on these ephemeral resources.

This solution uses the following AWS services. Most of the resources are set up for you with an AWS CloudFormation stack:


Here are some of the AWS Service Catalog concepts referenced in this post. For more information, see Overview of AWS Service Catalog.

  • A product is a blueprint for building the AWS resources to make available for deployment on AWS, along with the configuration information. Create a product by importing an AWS CloudFormation template, or, in case of AWS Marketplace-based products, by copying the product to AWS Service Catalog. A product can belong to multiple portfolios.
  • A portfolio is a collection of products, together with the configuration information. Use portfolios to manage user access to specific products. You can grant portfolio access at the AWS Identity and Access Management (IAM) user, IAM group, or IAM role level.
  • A provisioned product is an AWS CloudFormation stack; that is, the AWS resources that are created. When an end-user launches a product, AWS Service Catalog provisions the product from an AWS CloudFormation stack.
  • Constraints control the way that users can deploy a product. With launch constraints, you can specify a role that AWS

Solution overview

The following diagram maps out the solution architecture.



Solution flow description

This solution uses AWS Service Catalog for product creation, Lambda and CloudWatch for the termination process, and DynamoDB for state management. It also uses SES for notifications to the end user.

1. Admin experience

The AWS administrator uses the AWS CloudFormation console to launch the setup template provided as part of this post.

As part of the deployment process, AWS CloudFormation creates the following resources:

  • A CloudWatch rule
  • A Lambda function
  • A sample AWS Service Catalog portfolio with a sample AWS Service Catalog product
  • A DynamoDB table

2. End-user experience

The end user could be a data scientist or QA engineer. They log in to the AWS Service Catalog console, pick a product such as an EMR cluster or a product from the Service Catalog reference architecture, choose how long the product should be active, and launch the product.

Behind the scenes, invisible to the end user, CloudWatch detects that the product being deployed is configured for automatic termination and triggers a Lambda function. The Lambda function reads the product information and stores the info in a DynamoDB table.

3. Scheduled process

Invisible to the end user, a CloudWatch rule is triggered at certain intervals selected by the AWS administrator during the setup process. It triggers a Lambda function.

The Lambda function queries the DynamoDB table and gets a list of AWS Service Catalog products that have reached the end of their subscription time. It terminates those products.

The owner of the AWS Service Catalog product is notified through SES that the service has been terminated.


  1. Configure an environment.
  2. Launch the end user Service Catalog product with auto termination.
  3. Schedule a process to check for the provisioned product end time.
  4. Configure AWS Service Catalog products with auto termination.

Configure an environment

Use an Amazon S3 bucket to upload your configuration files from AWS CloudFormation and Lambda.

To get the setup material:

  1. Download the file with the configuration content.
  2. Unzip the contents and save them to a folder. Note the folder’s location.

To create your S3 bucket:

  1. Log in to your AWS account as an administrator. Ensure that you have an AdministratorAccess IAM policy attached to your login because you’re going to create AWS resources, including IAM roles and users.
  2. In the S3 console, create a bucket. Leave the default values except as noted.
    • For Bucket name, enter scautoterminate-<accountNumber>.

To upload content to the new bucket:

  1. In the S3 console, select your new bucket, and choose Upload, Add files.
  2. Navigate to the folder that contains the configuration content. Select all the files and choose Open. Leave the default values except as noted.
  3. After the Review page, from the list of files, select the sc_admin_setup_autoterminate.json file.
  4. Right-click the link under Object URL and choose Copy link address.

To launch the configuration stack:

  1. In the AWS CloudFormation console, choose Create Stack, Amazon S3 URL, paste the URL that you just copied, and then choose Next.
  2. On the Specify stack details page, specify the following:
    • Stack name: scautoterminateSetup-<accountNumber>
    • S3Bucket: scautoterminate-<accountNumber>
    • SCEndUser: The current user name.
    • CheckFrequency: How often, in minutes, to check for products that have reached their end of life (10, 20, or 30).
  3. On the Review page, check the box next to I acknowledge that AWS CloudFormation might create IAM resources with custom names, and choose Create.
  4. After the status of the stack changes to CREATE_COMPLETE, select the stack and choose Outputs to see the output.
  5. Find the ServiceCatalog entry and copy the URL value.

Congratulations! You have completed the setup. Now, test it by following the process in the next section, Launch the end-user Service Catalog product with auto termination.

Launch the end-user Service Catalog product with auto termination

To launch an AWS Service Catalog product with auto termination:

  1. Log in to the AWS Management Console as the SCEndUser.
  2. In the AWS Service Catalog console, in the left navigation pane, choose Products list.
  3. On the Products list page, choose the AWS Service Catalog sample product BucketAutoTerminate that was configured during the setup process.
  4. Choose Launch product.
  5. On the Product Version page, for Name, enter scbucket10min, and choose Next.
  6. On the Parameters page, set the following values:
    • AutoTerminate – True.
    • DurationDays – How long the product should run in days.
    • DurationHours – How long the product should run in hours.
    • DurationMinutes – How long the product should run in minutes.
    • Action – Terminate or Notify.
    • ContactEmail – Your email address.
    • BucketName – The bucket name (use the default, as the system always creates a unique name).
  7. On the TagOptions page, choose Next.
  8. On the Review page, choose Launch.
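The three Duration parameters together define how long the product should live. The post does not show how the Lambda function turns them into an end time, so the following is only a sketch of one plausible calculation. The function name is hypothetical, and clamping the total to the 30-day hard limit mentioned earlier is an assumption about how that limit could be enforced.

```python
from datetime import datetime, timedelta, timezone

MAX_LIFETIME = timedelta(days=30)  # the hard limit mentioned earlier in this post


def compute_end_time(launched_at, days="0", hours="0", minutes="0"):
    """Add the requested duration to the launch time, capped at MAX_LIFETIME.

    The duration arguments accept strings because the template parameters
    are declared with Type String.
    """
    requested = timedelta(days=int(days), hours=int(hours), minutes=int(minutes))
    return launched_at + min(requested, MAX_LIFETIME)


launched = datetime(2019, 6, 1, 12, 0, tzinfo=timezone.utc)
print(compute_end_time(launched, minutes="10"))  # a 10-minute product
print(compute_end_time(launched, days="45"))     # clamped to 30 days
```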


Congratulations, you have deployed an AWS Service Catalog product with an auto termination configuration.

After the product is launched, the end user is sent an email with the end time information of the provisioned product, as in the following image.

Also, invisible to the end user, a CloudWatch rule is triggered that detects when the product being launched is configured for auto termination and stores the product information in a DynamoDB table. This information is used later when the scheduled process is executed.

Schedule a process to check for the provisioned product end time

An automated process is triggered at the interval chosen at the setup time. The process checks the current time against the end time of the provisioned products. If the process finds a provisioned product that has reached its end time, the process performs one of the following actions:

  • If the product is configured with an end action of Notify, the end user gets an email notification and the product remains active.
  • If the product is configured with an end action of Terminate, the product is terminated and the end user gets an email notification, similar to the following image.
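The decision logic described above can be sketched in a few lines. This is not the Lambda function from the setup material: the record fields (name, end_time, action) and the function name are hypothetical stand-ins for whatever the DynamoDB table stores, and the sketch only plans the steps rather than calling AWS Service Catalog or SES.

```python
from datetime import datetime, timezone


def plan_end_actions(products, now):
    """Return (name, steps) for each product whose end time has passed."""
    plans = []
    for product in products:
        if product["end_time"] > now:
            continue  # still within its lifetime; nothing to do yet
        if product["action"] == "Terminate":
            steps = ["terminate", "email"]
        else:  # "Notify": send the email only, the product remains active
            steps = ["email"]
        plans.append((product["name"], steps))
    return plans


def utc(hour, minute):
    """Helper for the sample data: a UTC timestamp on one fixed day."""
    return datetime(2019, 6, 1, hour, minute, tzinfo=timezone.utc)


# Hypothetical records, for illustration only.
products = [
    {"name": "scbucket10min", "end_time": utc(12, 10), "action": "Terminate"},
    {"name": "demo-env", "end_time": utc(12, 5), "action": "Notify"},
    {"name": "qa-task", "end_time": utc(13, 0), "action": "Terminate"},
]
plans = plan_end_actions(products, now=utc(12, 30))
for name, steps in plans:
    print(name, steps)
```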

Configure AWS Service Catalog products with auto termination

Here’s how you configure products with auto termination:

  • Create or modify the existing CloudFormation template.
  • Create a new AWS Service Catalog product version.

Create or modify the existing CloudFormation template

With a text editor, edit the CloudFormation template on which the AWS Service Catalog product is based.

Add the following parameters and Metadata section to the file, and save the file.

"ContactEmail": {
      "Default": "[email protected]",
      "Description": "Email of owner",
      "Type": "String"
    "DurationDays": {
      "Default": "0",
      "AllowedValues": [
      "Description": "Duration Days",
      "Type": "String"
    "DurationHours": {
      "Default": "0",
      "AllowedValues": [
      "Description": "Duration Hours",
      "Type": "String"
    "DurationMinutes": {
      "Default": "10",
      "AllowedValues": [
      "Description": "Duration Minutes",
      "Type": "String"
    "AutoTerminate": {
      "Default": "True",
      "AllowedValues": [
      "Description": "Should the product automatically terminate?",
      "Type": "String"
    "Action": {
      "Default": "Terminate",
      "AllowedValues": [
      "Description": "What action should be taken at the end of the service?",
      "Type": "String"
  "Metadata": {
    "AWS::CloudFormation::Interface": {
      "ParameterGroups": [
          "Label": {
            "default": "End Information"
          "Parameters": [

Create a new AWS Service Catalog product version

  1. Log in to the AWS Service Catalog console as an AWS Service Catalog admin user.
  2. In the left navigation pane, choose Product list and select the product to which to add a new version.
  4. Choose Browse and select the file that you saved earlier.
  5. For Version title, enter “V9 auto terminate”.
  6. For Description, enter a description.
  7. Choose SAVE.

You can now deploy the auto-terminate version by following the instructions in Step 2.

Cleanup process

To avoid incurring costs, delete resources that you no longer need. You can terminate the deployed Service Catalog product by choosing Action and then Terminate.


In this post, you learned an easy way to terminate resources automatically on AWS with AWS Service Catalog. You also saw how there’s an extra layer of governance and control when you use AWS Service Catalog to deploy resources to support business objectives.

About the Author

Kenneth Walsh is a New York-based Solutions Architect focusing on AWS Marketplace. Kenneth is passionate about cloud computing and loves being a trusted advisor for his customers. When he’s not working with customers on their journey to the cloud, he enjoys cooking, audio books, movies, and spending time with his family and dog.



Detect and remediate issues faster with AWS Systems Manager OpsCenter and Moogsoft AIOps


AWS Systems Manager, the operational hub for AWS and hybrid cloud deployments, recently announced the launch of OpsCenter to help you view, investigate, and resolve operational issues related to your environment from a central location. OpsCenter presents operational issues in a standardized view, along with contextually relevant data, and associated Systems Manager Automation documents, enabling easier diagnosis and remediation.

This post focuses on the new native integration between OpsCenter and Moogsoft AIOps. Moogsoft, an AWS partner, is doing pioneering work in building out an AI platform for IT operations. They’re focusing on reducing noise, suggesting root causes, and enabling cross-team collaboration to solve incidents faster.

You can now take advantage of this native integration between OpsCenter and Moogsoft AIOps to further enhance productivity for your DevOps engineers. The benefits include:

● Noise reduction using OpsItems deduplication logic and AI to filter out OpsItem noise automatically and cluster related OpsItems into Moogsoft Situations.

● Faster remediation of OpsItems using contextual investigative data and Moogsoft’s root cause and collaboration features.

● Reduction in incidents by automating remediation workflows and using machine learning. Moogsoft customers have experienced reductions of up to 60% in many cases.

OpsCenter and Moogsoft AIOps integration

Moogsoft AIOps together with OpsCenter enables you to reduce mean time to resolution (MTTR) and stay focused on innovation projects instead of ongoing operational firefighting.

OpsCenter provides an open API to ingest any event data across the entire AWS service stack. This removes the burden from end users of gathering disparate event data across compute, storage, network, and other critical AWS services.

OpsCenter together with Moogsoft AIOps reduces MTTR by automating the event-to-resolution workflow using AI and machine learning. The workflow reduces event noise, clusters similar alerts into Situations, provides probable root cause analysis, and enables collaboration by integrating with ITSM, notification, orchestration, and remediation systems.

The value of combining OpsCenter and Moogsoft is multifaceted. OpsCenter is the aggregation point for operational issues across various AWS services, and it enables contextual investigation and remediation actions. Moogsoft industrializes the ingestion of OpsItem data and then uses Moogsoft AIOps to surface critical IT incidents, cluster related OpsItems, and offer teams a collaboration workflow to remediate what’s wrong. The following diagram shows the data flow and points of integration.



“At Moogsoft, we are impressed with the AWS Systems Manager OpsCenter feature set and open framework. It’s easy for our customers to combine it with our Moogsoft AIOps data science approach, and achieve a powerful, modern combined approach to Service Assurance, both on premises and in the cloud,” says Dave Casper, CTO, Moogsoft.

Auto Scaling failure: An OpsCenter and Moogsoft AIOps use case

The following use case describes how OpsCenter and Moogsoft AIOps help reduce MTTR and provide an event-to-resolution workflow.

Imagine that the scenario occurs on Black Friday, the biggest shopping day of the year. An online retailer is a digital native company running its critical services within the AWS Cloud. The retailer created a digital native application that leverages AWS services for high availability, redundancy, and performance. Disaster looms when Auto Scaling fails to expand the compute layer while demand spikes during the holiday rush. It turns out that Auto Scaling failed due to human error!

This failure in the customer account created an OpsItem in OpsCenter. Moogsoft industrialized the ingestion of all incoming events, including those from OpsCenter, and created a new Situation. Moogsoft AI analytics algorithms identified the new Situation as similar to a prior Situation. An operator checked the past Situation and learned that the Amazon Machine Image (AMI) name referenced by the Auto Scaling group was not found.

The remediation step for the past similar OpsItem included running an AWS Systems Manager Automation document to correct the configuration error in the Auto Scaling group and update the AMI name.

As part of the initial analysis, the Probable Root Cause (PRC) was presented to the operator. While several alerts were clustered together, the Moogsoft Situation Room console displayed a mixture of root cause and symptomatic alerts.

The PRC for this specific Situation was identified as an alert on an Auto Scaling failure.

The information about similar Situations and the PRC helped the operator reduce the mean time to troubleshoot the problem. Further, OpsCenter recommended running an SSM Automation document that was previously used to resolve the issue. With that recommended remediation action in OpsCenter, the operator reduced the overall mean time to resolution.

The Timeline view in Moogsoft AIOps showed the evolution of how the Situation’s alerts arrived, including the order, frequency, and severity.

The Timeline view showed the story of how the compute stack started having degraded resource performance. At the same time and at regular intervals, Amazon CloudWatch was informing Moogsoft of Auto Scaling launch failures. As time progressed, the severity of the compute stack resource usage increased from minor to major and then to critical. The last few messages originated from the application monitoring tool and reported slow application response times and refused application connections.

Two-way integration

As illustrated in the data flow diagram, OpsCenter and Moogsoft AIOps have a two-way integration. When a Situation is created, Moogsoft AIOps creates a new OpsItem in OpsCenter. The new Situation-based OpsItem contains all the relevant information about related OpsItems that are clustered together. The following screenshot shows the new Situation-based OpsItem and its description detail.


The combination of AWS OpsCenter and Moogsoft AIOps brings economic value to your operations by reducing time to resolution and improving service delivery of critical applications.

About the authors

Craig Yenni is a Strategic Architect at Moogsoft, focused on the joint success of channels and alliances. He has been engaged in technology for over 20 years. This journey has taken him down the path to application development, operations, architecture, engineering, sales, and consulting.



Sid Arora is the lead product owner on AWS Systems Manager OpsCenter. Sid has led building multiple products at and Amazon Web Services over the last 7+ years. His passions include leveraging machine learning and artificial intelligence to simplify and personalize user experiences, across consumer, enterprise, and cloud operations products.

from AWS Management Tools Blog

Use Atlassian Opsgenie with AWS Systems Manager to run the EC2Rescue tool


On-call engineers are responsible for responding to alerts, troubleshooting high priority incidents, and taking action to remediate issues. Automation tools like AWS Systems Manager and Atlassian Opsgenie can help these engineers by reducing repetitive work and allowing them to focus on the most important tasks. In this blog post, Merve Bolat, Associate Product Manager at Opsgenie, Atlassian, explains how Opsgenie can execute an automation document in AWS Systems Manager in response to an incoming alert.

Opsgenie and EC2Rescue tool

Opsgenie is a modern alert and incident management platform from Atlassian that centralizes alerts, notifies the right people reliably, and enables on-call responders to take rapid action. Now you can directly integrate Opsgenie with AWS Systems Manager to quickly execute automation documents without leaving the Opsgenie console or mobile app.

EC2Rescue is an AWS troubleshooting tool that you can run on your Amazon EC2 instances to resolve operating system-level issues and collect advanced logs and configuration files for further analysis. In the example in this blog post, Opsgenie will be monitoring for a Status Check Failure alert from Amazon CloudWatch, which is a sign that an EC2 instance needs attention. When this alert is received, an action policy will trigger EC2Rescue using the AWSSupport-ExecuteEC2Rescue automation document that comes standard in AWS Systems Manager. The following screenshots show Opsgenie receiving an EC2 StatusCheckFailed alert.

Configuring the EC2Rescue Action in Opsgenie

To create an automation action in Opsgenie, you need a corresponding action channel for AWS Systems Manager. In the Actions tab of your team’s dashboard in the Opsgenie console, create an AWS Systems Manager channel with your AWS account ID, AWS Region, and AWS Identity and Access Management (IAM) role. Multiple actions can be created by using the same channel as long as the account ID, Region, and role are compatible with the automation document.

After the action channel is created, you can add the related automation action from the Manage Actions section. Specify the name of the action, select AWS Systems Manager as the type, and choose the action channel you created in the previous step. Then, select the AWSSupport-ExecuteEC2Rescue document from the AWS Systems Manager documents (SSM Documents) section. You can search for the document from the drop-down list or simply type the name of the document in the search box.

The next section lists the parameters that can be configured for the action. Opsgenie imports the parameters of the corresponding automation document of AWS Systems Manager directly. Parameters that are marked as “required” are mandatory for execution. For the EC2Rescue tool, UnreachableInstanceId must be provided, whereas LogDestination, EC2RescueInstanceType, SubnetId, and AssumeRole are optional.

Note: We recommend that you provide an Amazon S3 bucket name as the LogDestination, so that diagnostic information and OS-level logs are uploaded to S3. That helps if the AWS Systems Manager Automation document doesn’t fix the issue and manual investigation becomes necessary. The default value for SubnetId is CreateNewVPC. However, you should take VPC limits and access restrictions into account.
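To make the parameter rules above concrete, here is a minimal sketch of how the same document could be started directly through the Systems Manager API rather than from Opsgenie. It assumes Python with boto3; the helper names and sample values are hypothetical, and the parameter names are the ones listed above.

```python
REQUIRED = ("UnreachableInstanceId",)
OPTIONAL = ("LogDestination", "EC2RescueInstanceType", "SubnetId", "AssumeRole")


def build_ec2rescue_parameters(**values):
    """Validate parameter names and wrap each value in a list,
    which is the shape StartAutomationExecution expects."""
    unknown = set(values) - set(REQUIRED) - set(OPTIONAL)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    missing = [name for name in REQUIRED if not values.get(name)]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    # Omitted optional parameters fall back to the document defaults.
    return {name: [str(value)] for name, value in values.items() if value}


def start_ec2rescue(ssm_client, **values):
    """ssm_client is assumed to be boto3.client("ssm")."""
    response = ssm_client.start_automation_execution(
        DocumentName="AWSSupport-ExecuteEC2Rescue",
        Parameters=build_ec2rescue_parameters(**values),
    )
    return response["AutomationExecutionId"]
```

Leaving SubnetId out keeps the document default, so pass it explicitly only when you want the rescue instance in an existing subnet.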

To manually execute an automation action, the related action needs to be added to the alerts using alert policies or integration rules (which can be done from the Advanced Settings of the integration). Since Amazon CloudWatch is a convenient tool to track the status of EC2 instances, you can add the EC2Rescue action to the Amazon CloudWatch Events integration in Opsgenie. This way, whenever an Amazon EC2-related alert is created by the Amazon CloudWatch Events integration in Opsgenie, you can easily execute the action from the alert itself.

To add the EC2Rescue action on Amazon CloudWatch Events integration, switch to the Advanced settings of the integration. In the Create Alert section, type EC2Rescue in the Actions field, then save the integration.

Configuring permissions in AWS

EC2Rescue needs specific permissions and trusted entities to perform the automation actions. You can either create a role by using an AWS CloudFormation template with the minimum required permission policies and trusted entities during action channel creation or add an IAM role using the AWS Management Console. The IAM role name must start with the prefix opsgenie-automation-actions- to execute an action. If you have administrator access, you can easily run the action with the role trust policy document that follows. Otherwise, you might need to contact your account admin to configure the necessary permissions and trusted entities. For further details, refer to the AWS Systems Manager User Guide. The following screenshot shows configuring a role in the AWS Management Console.


Executing the EC2Rescue Action in Opsgenie

After the permission configurations are done on AWS Systems Manager, the EC2Rescue action can be executed in the Opsgenie console. A window will appear when you choose the action to execute on the related alert.


If all the required parameters have predefined values, you can directly choose the Execute button. Otherwise, you need to give values for the parameters before executing.

After you choose Execute, the AWSSupport-ExecuteEC2Rescue automation document runs on the instance that you specified. If the alarm condition is cleared, Opsgenie can automatically resolve the alert. You can view the results of the EC2Rescue operation in the Opsgenie alert activity log and track the operation steps in the AWS Management Console using the execution ID provided.
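The execution ID can also be used outside the console. The sketch below polls the Systems Manager GetAutomationExecution API until the run reaches a final state. It assumes Python with boto3; the function name, polling defaults, and the set of statuses treated as final are illustrative choices (the API defines more statuses than the four listed here).

```python
import time

# A subset of the statuses the API can return, treated here as final.
TERMINAL_STATUSES = frozenset({"Success", "Failed", "TimedOut", "Cancelled"})


def wait_for_automation(ssm_client, execution_id,
                        poll_seconds=15, max_polls=40, sleep=time.sleep):
    """Poll an Automation execution and return its last observed status.

    ssm_client is assumed to be boto3.client("ssm").
    """
    status = None
    for attempt in range(max_polls):
        response = ssm_client.get_automation_execution(
            AutomationExecutionId=execution_id
        )
        status = response["AutomationExecution"]["AutomationExecutionStatus"]
        if status in TERMINAL_STATUSES:
            break
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return status
```

The sleep function is injectable so that the loop can be exercised quickly in a test.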



EC2Rescue is just one example of how Opsgenie and AWS Systems Manager can help on-call engineers respond to alerts and resolve issues faster. By enabling alert responders to execute any automation document, the troubleshooting and remediation steps that are normally manual tasks can be automated and triggered during an incident. To try Opsgenie Actions for Incident Response with AWS Systems Manager, visit

The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.


Get visibility into application health with Amazon CloudWatch Application Insights for .NET and SQL Server


To provide a reliable service to your customers, you need to make sure that your business-critical applications are healthy. If you have ever been involved in the monitoring process, you’re probably already aware of its complexity. You need to identify and configure the right set of monitors for various parts of your application and infrastructure, and set up appropriate thresholds to determine potential problems. You visualize, analyze, and correlate relevant pieces of information to get to the root cause and fix the problem, before it affects your end users.

As a Windows DevOps engineer, you are also required to build additional expertise into effectively monitoring important parts of your technology stack, such as Microsoft IIS Server and SQL Server. For example, think of a situation where you get informed of an issue related to your application crashing and blocking critical tasks. To figure out what’s happening, you look at your dashboards to see if any of the metrics that you are monitoring are at an alarming state. You notice that there are 4XX and 5XX errors on your frontend Application Load Balancer and high empty responses in your SQS queues.

With that incomplete information, you try to get the root cause by sifting through your IIS, application, and SQL Server logs, only to realize that your database wasn’t backing up. While tackling this issue, you conclude that this process makes detecting application problems and keeping applications healthy laborious and time-consuming. Sound familiar?

Amazon CloudWatch Application Insights for .NET and SQL Server

To make the monitoring and troubleshooting process easier while offering visibility into your application health, AWS has launched Amazon CloudWatch Application Insights for .NET and SQL Server. It enables you to set up, monitor, and analyze the metrics, logs, and alarms for your .NET applications and underlying resources to detect problems, and it is designed to be easy to get started with. It recognizes and sets up important monitors, intelligently detects application problems using Amazon SageMaker, and creates Amazon CloudWatch automated dashboards for detected problems. Each dashboard summarizes the problem details, including problem severity, source, related alarms, and log errors.

The insights provided help you isolate the ongoing issue faster and reduce the mean time to resolution. For detected problems, you can take remedial actions, add your feedback for the analysis that would in turn improve the feature’s future findings, or choose to debug further.

Detecting and troubleshooting application problems

Using the previous example, we can see the problem detection and troubleshooting process using CloudWatch Application Insights for .NET and SQL Server. By default, it generates Amazon CloudWatch Events when it detects a problem, so that you can be notified.
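One way to receive those notifications is a CloudWatch Events rule that matches the problem events. The sketch below builds such a rule in Python for use with boto3. The source and detail-type strings are assumptions that you should confirm against the Application Insights documentation, and the helper names are hypothetical.

```python
import json

# Assumed event identifiers; verify them before relying on this pattern.
PROBLEM_DETECTED_PATTERN = {
    "source": ["aws.applicationinsights"],
    "detail-type": ["Application Insights Problem Detected"],
}


def matches(pattern, event):
    """Minimal matcher for flat patterns: every pattern key must list
    the value that the event has for that key."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


def put_rule_args(rule_name):
    """Keyword arguments for the CloudWatch Events PutRule API,
    for example boto3.client("events").put_rule(**put_rule_args("name"))."""
    return {
        "Name": rule_name,
        "EventPattern": json.dumps(PROBLEM_DETECTED_PATTERN),
        "State": "ENABLED",
    }
```

The local matcher only handles top-level string fields; it is there to sanity-check the pattern, not to reproduce the full matching rules of the service.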

To view more details about the detected problem, go to the CloudWatch console to see the list of alarms firing because of this problem. These are the alarms that are dynamically set and updated by CloudWatch Application Insights. An example dashboard is shown in the following screenshot.

CloudWatch console showing a summary of your account


Under the Application Insights for .NET and SQL Server section, you see that the problem with the application is due to the SQL Server transaction log being full.

CloudWatch Application Insights for .NET and SQL Server

To view more details, select the problem description. An automated dashboard contextualizes the problem, as shown in the following screenshot.

CloudWatch Application Insights for .NET and SQL Server provides a summary of what’s really happening with the application. You get an overview of the type of problem, severity, probable source of the problem, and additional insights with potential next steps.

It also analyzes and groups together all the individual metric alarms, such as 4XX and 5XX errors on the web frontend. It pulls up log errors from the IIS and SQL Server error logs in the dashboard, so that you don’t have to deal with individual errors or correlate them to get to a conclusion and resolution. The following screenshot shows an example correlation.

Related alarms

In this example, the SQL Server transaction log was full due to a missing or broken backup. To remediate this issue, run a SQL Server backup using AWS Systems Manager Automation.

Onboarding your application

Getting started with CloudWatch Application Insights for your .NET and SQL Server applications is easy. In the CloudWatch console Settings, choose Application Insights for .NET and SQL Server, and then Add Application.

Getting started

Next, select an AWS resource group that is running your application, including the resources running your backend SQL Server cluster, frontend IIS web servers, and .NET worker tier.

Resource group selection

After you select the resource group, CloudWatch displays a list of potential application components, making it easy to set up monitors for similar resources and to group insights. These components are resources or groups of resources that are running a particular function or sub-service in your application.

For example, your .NET application has a set of EC2 instances in an Auto Scaling group, all of which are listed as an auto-grouped component. You can choose similar monitors for all of them and view them together.
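The same onboarding steps are available through the Application Insights API. The following sketch registers a resource group and lists the components that were detected for it. It assumes Python with boto3; the function name is hypothetical, and the sketch ignores pagination and any delay before components become available.

```python
def onboard_application(appinsights_client, resource_group_name):
    """Register a resource group with Application Insights and return
    a mapping of detected component names to their resource types.

    appinsights_client is assumed to be boto3.client("application-insights").
    """
    appinsights_client.create_application(ResourceGroupName=resource_group_name)
    response = appinsights_client.list_components(
        ResourceGroupName=resource_group_name
    )
    return {
        component["ComponentName"]: component.get("ResourceType")
        for component in response.get("ApplicationComponentList", [])
    }
```

An Auto Scaling group in the resource group would be expected to show up here as a single component, matching the auto-grouping described above.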


You can also create custom components by grouping similar resources.

To enable monitors for your application, you now can select specific application components and manage monitoring for them.

When you tell CloudWatch Application Insights the logical application tier represented by the selected component, it offers you a recommended, customizable list of logs and metrics for underlying resources.


After you have saved monitors for your relevant application components, CloudWatch automatically starts setting these up for you. You don’t need to log into the instance and update the CloudWatch Agent Configuration file manually.

It also sets up relevant alarms for the selected metrics using CloudWatch alarms and automatically updates them based on past metric patterns and how your application has been using underlying resources.


In fact, when your application scales up at any point, CloudWatch automatically takes care of monitoring of the added resources for you!

Get started right away

CloudWatch Application Insights for .NET and SQL Server is now available in all commercial AWS Regions. There is no additional charge for the monitoring assistance and analysis, but you pay the usual price for the monitoring data (metrics, logs, and alarms) as per public CloudWatch pricing.

For more information, see Getting Started with Amazon CloudWatch Application Insights for .NET and SQL Server.


About the author

Purvi Goyal is a senior product manager with the Amazon EC2 team, where she strives to enhance the cloud experience of AWS Enterprise customers. Outside of work, she enjoys outdoor activities like hiking and kayaking.




Introducing Service Quotas: View and manage your quotas for AWS services from one central location


By Lijo George

Today we are introducing Service Quotas, a new AWS feature which enables you to view and manage your quotas, also known as limits, from a central location via the AWS console, API or the CLI.

Service Quotas is a central way to find and manage service quotas, an easier way to request and track quota increases, and a simplified way to request quota increases for new accounts created in AWS Organizations. For supported services, you can proactively manage your quotas by configuring Amazon CloudWatch alarms that monitor usage and alert you to approaching quotas. You can access Service Quotas by visiting directly or by searching for it in the AWS management console.

Going forward, we will be referring to service quotas (instead of service limits) to better represent our philosophy of providing you with better control of your AWS resources. Please note that you may encounter both terms being used interchangeably.


  1. Central management of AWS service quotas: Using Service Quotas, you can manage your AWS service quotas in one central location, eliminating the need to go to multiple sources or maintain your own list. You can access Service Quotas either through the console or programmatically via the API or CLI.
  2. Improved visibility of service quotas: You can view default and account-specific (applied) quotas for multiple AWS services. Applied quotas are overrides that are specific to your account that have been granted to you in the past. At launch, you can view default quotas for 90 AWS services, with more coming soon.
  3. Easier quota increase requests: You can request a quota increase through a single page on the console or through an API call. You simply search for a quota and put in your desired value to submit a quota increase request. You can also view and track the status of your requests.
  4. Paving the way for proactive quota management: Service Quotas integrates with CloudWatch to alert you when you reach a threshold, enabling you to proactively manage your quotas.
  5. Simplify quota requests for new accounts in AWS Organizations: Customers often request quota increases for new accounts that they create in their organization. Service Quotas automates this process so that you spend less time requesting increases for new accounts in your organization, and ensures that all your accounts are consistently configured in accordance with the needs of your workloads.

Using Service Quotas

Emma’s serverless world

Emma is a lead developer in a startup that does image processing for customers. Her company uses AWS Lambda extensively. They love the powerful real time file and stream processing capabilities that Lambda provides. Emma diligently ensures that her application is highly available and flexible to match user workloads. However, she’s concerned about unexpectedly hitting quotas. She previously spent time gathering data about Lambda to make sure her workflows did not stop growing.

To learn about Lambda quotas, Emma visits Service Quotas and navigates to the AWS Lambda page, where she can view information about all the quotas for the service.

One quota that catches Emma’s attention is ‘Concurrent executions’. She clicks on the quota to see additional details, including her actual usage against the quota in the CloudWatch graph on the same page.
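Emma could look up the same quota programmatically. The sketch below pages through the Service Quotas ListServiceQuotas API and returns the quota with a given name. It assumes Python with boto3; the function name is hypothetical.

```python
def find_quota(quotas_client, service_code, quota_name):
    """Return the first quota whose QuotaName matches, or None.

    quotas_client is assumed to be boto3.client("service-quotas").
    """
    kwargs = {"ServiceCode": service_code}
    while True:
        page = quotas_client.list_service_quotas(**kwargs)
        for quota in page.get("Quotas", []):
            if quota["QuotaName"] == quota_name:
                return quota
        token = page.get("NextToken")
        if not token:
            return None
        # Ask for the next page of results.
        kwargs["NextToken"] = token
```

For example, find_quota(client, "lambda", "Concurrent executions") would return a dictionary that includes the QuotaCode and the current Value.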

Emma expects her concurrent executions to increase and decides to put in a request to increase the quota so that she has excess concurrency for her new workloads. Requesting a quota increase is a common task that is now made easier through Service Quotas. Emma can now click on ‘Request quota increase’ and fill out the quota increase form.

Once Emma finishes submitting her quota increase request, she can track it through the ‘Quota increase history’ page. Her quota request needed additional review, so it was sent to customer support. Emma can see this and find the support center case number on the quota increase detail page. Selecting the case redirects Emma to the customer support console, where Emma can communicate with the agent who is handling this case.
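The request and its tracking details are also available through the API. The following sketch submits an increase with RequestServiceQuotaIncrease and returns the fields Emma would look at afterward. It assumes Python with boto3; the function name and the shape of the returned summary are hypothetical.

```python
def request_increase(quotas_client, service_code, quota_code, desired_value):
    """Submit a quota increase and summarize the resulting request.

    quotas_client is assumed to be boto3.client("service-quotas").
    """
    response = quotas_client.request_service_quota_increase(
        ServiceCode=service_code,
        QuotaCode=quota_code,
        DesiredValue=float(desired_value),
    )
    requested = response["RequestedQuota"]
    return {
        "request_id": requested["Id"],
        "status": requested["Status"],
        # CaseId is only expected when the request went to customer support.
        "case_id": requested.get("CaseId"),
    }
```

A case_id of None simply means that the response carried no support case for the request.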

After requesting a quota increase, Emma also sets up a CloudWatch alarm to alert her when her application’s usage reaches 80% of the default quota, so that she can monitor usage and request increases if needed.
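The threshold for such an alarm is simple arithmetic on the quota value. The sketch below computes it and builds arguments for the CloudWatch PutMetricAlarm API. It assumes Python with boto3; the alarm name, period, and the choice of the AWS/Lambda ConcurrentExecutions metric are illustrative assumptions rather than what the Service Quotas console creates.

```python
def quota_alarm_args(quota_value, fraction=0.8,
                     alarm_name="lambda-concurrency-near-quota"):
    """Build PutMetricAlarm arguments that fire when usage reaches
    the given fraction of the quota."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/Lambda",
        "MetricName": "ConcurrentExecutions",
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": quota_value * fraction,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
```

With a quota of 1,000 concurrent executions, the threshold is 800. The arguments could then be passed to boto3.client("cloudwatch").put_metric_alarm, typically with an AlarmActions entry for notifications.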

Alan’s expanding AWS workloads

Alan is a Cloud Admin in a large financial services company. His task is to provision resources for his growing AWS fleet. Alan uses AWS Organizations to manage his 500 AWS accounts. Once Alan creates an account, he routinely requests quota increases to make sure the account is configured in a consistent manner.

With Service Quotas, Alan can accomplish the same goal with a predefined Service Quotas template in his organization’s master account.

Once this template is active, for each new account he creates in his organization, the same quota increase requests will be created. Alan can track the status of these requests in the ‘Quota increase history’ page of the newly created accounts.
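A template like Alan’s could be populated through the Service Quotas API. The sketch below expands a list of desired increases across Regions, adds each one to the template, and then associates the template with the organization. It assumes Python with boto3 and credentials for the organization’s master account; the function names and sample values are hypothetical.

```python
def template_requests(increases, regions):
    """Expand (service_code, quota_code, desired_value) tuples
    into one template entry per Region."""
    return [
        {
            "ServiceCode": service_code,
            "QuotaCode": quota_code,
            "AwsRegion": region,
            "DesiredValue": float(desired_value),
        }
        for service_code, quota_code, desired_value in increases
        for region in regions
    ]


def apply_template(quotas_client, increases, regions):
    """quotas_client is assumed to be boto3.client("service-quotas")."""
    entries = template_requests(increases, regions)
    for entry in entries:
        quotas_client.put_service_quota_increase_request_into_template(**entry)
    # Activate the template so that new accounts receive these requests.
    quotas_client.associate_service_quota_template()
    return len(entries)
```

Each template entry is specific to one quota in one Region, which is why the increases are multiplied across the Region list.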

Available Now

Currently, Service Quotas is available in US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Asia Pacific (Osaka), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), EU (Paris) and South America (São Paulo). Service Quotas is available free of cost – you will only be charged for the CloudWatch alarms you set up.


Service Quotas allows you to view and manage quotas for AWS services from one central location. Service Quotas enables you to easily raise and track quota increase requests, and integrates with AWS Organizations to save you time and effort in setting up quotas for new accounts in a consistent manner.

If you have questions about or suggestions for this solution, start a new thread on the Service Quotas forum.
