Category: Management Tools

Leveraging AWS CloudFormation to create an immutable infrastructure at Nubank

Bruno Halley Schaefer, software engineer, Nubank
Hugo Carvalho, senior solutions architect, AWS
Marcelo Nunes, senior technical account manager, AWS Enterprise Support Team

 

Nubank, a Brazilian company that is one of the world’s largest independent digital banks, is innovatively transforming Latin America’s financial landscape by providing transparent, simple, and efficient services. The company fights complexity to empower people, give them back control of their finances, and redefine their relationships with money.

Nubank was born in the cloud and AWS has supported them on their journey since day one.

Nubank’s engineering team embraced functional programming ideas and immutability from the start. They apply immutability concepts to their microservices developed using Clojure and their data management and persistence using Datomic, and they handle their infrastructure with the help of AWS CloudFormation.

Overview

Before cloud computing became a widely available reality, building a server infrastructure meant dealing with expensive physical servers. Replacing those physical servers was so costly and time-consuming that the most practical approach was to apply any necessary changes to the servers running in production. This is the nature of a mutable infrastructure, and all those in-place modifications eventually lead to critical problems like inconsistency, unreliability, and increasing complexity.

One of the biggest challenges in building infrastructure today is predictability. However, thanks to virtualization and cloud computing, new deployment workflows can now help companies address this challenge. One of those workflows is based on the core idea of immutable infrastructure: no modification to a running server is allowed; instead, any change requires replacing the server with a new instance that contains all the necessary changes.

Walkthrough

AWS CloudFormation provides a common language for you to describe and provision all the infrastructure resources in your cloud environment. It allows you to use a simple text file to model and provision, in an automated and secure manner, all the resources needed for your applications across all regions and accounts.

Infrastructure as code

Before we detail Nubank’s approach to immutable infrastructure, it’s essential to describe a fundamental aspect of how Nubank handles infrastructure: Everything has a code representation in the form of definition files.

The following is a simplified example of a definition. This definition contains the code representation for a Datomic transactor running on an Amazon EC2 Auto Scaling group.

 

{:name :ronaldo-datomic
 :squad :platform
 :environments {:staging #{:s0 :s1 :s2}}
 :workload {:type :generic-legacy, :multiplier 1}
 :storage {:type :dynamodb
           :resource-name "ronaldo-datomic"}
 :memory {:jvm-xmx "2500M"
          :jvm-xms "2500M"
          :object-cache-max "1g"
          :memory-index-max "512m"}
 :write-concurrency 5}

 

These definitions are Nubank’s source of truth, and their infrastructure should always reflect this source of truth.

Immutable infrastructure at Nubank

To achieve an immutable infrastructure, Nubank constantly creates and destroys cloud resources. Always following a blue-green pattern, they first create new resources containing all the modifications they wish to deploy and then drop the old resources after they verify that everything works as intended.

The following diagram shows the standard blue-green pattern followed by Nubank:

 

AWS CloudFormation helps Nubank throughout this entire process and is at the core of their immutable infrastructure and blue-green process.

This service provisions resources in a safe, repeatable manner, allowing you to build and rebuild your infrastructure and applications without having to perform manual actions or write custom scripts. AWS CloudFormation takes care of determining the correct operations to perform when managing your stack and rolls back changes automatically if errors are detected.

To take full advantage of the power provided by AWS CloudFormation, Nubank uses Nimbus, one of Nubank’s custom-built tools. Nimbus, written in Clojure, is responsible for abstracting and automating Nubank’s interactions with AWS, and it translates Nubank definition files into AWS CloudFormation stacks.

The following diagram shows how Nimbus interacts with AWS CloudFormation to create infrastructure resources:

After Nubank uploads a template to AWS CloudFormation, the service takes care of creating the resources while respecting their interdependencies and enforcing an all-or-nothing operation: either all resources are created successfully, or none are created at all.
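
To make that all-or-nothing behavior concrete, here is a minimal sketch of the kind of call a tool like Nimbus ultimately makes against the CloudFormation API. This is an illustrative Python (boto3) example, not Nubank’s actual Clojure code; the stack name and template file are placeholders:

import boto3

cloudformation = boto3.client("cloudformation")

# Create the "green" stack from a template rendered out of a definition file.
# If any resource fails to create, CloudFormation rolls the whole stack back.
cloudformation.create_stack(
    StackName="ronaldo-datomic-green",                 # placeholder name
    TemplateBody=open("ronaldo-datomic.json").read(),  # placeholder template
    OnFailure="ROLLBACK",
    Capabilities=["CAPABILITY_NAMED_IAM"],
)

# Block until every resource is created (or the rollback completes with an error).
waiter = cloudformation.get_waiter("stack_create_complete")
waiter.wait(StackName="ronaldo-datomic-green")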

Before deleting AWS CloudFormation stacks (and thus removing the associated cloud resources), Nubank’s engineering team closely monitors the overall health of the new resources as they gradually increase their load. To monitor their systems, Nubank relies heavily on Grafana dashboards and OpsGenie alerts, all built on top of Prometheus metrics.

These alerts are also part of Nubank’s infrastructure as code and therefore are also represented using Nubank definition files. The following example is the definition of an alert that aims to capture sudden spikes in Kafka consumer latency:

 

{:name :kafka-average-consume-latency-time-ms-too-high
 :squad :runtime
 :environments {:prod    #nu/prototypes-for [:prod :sharded+global+monitoring]
                :staging #nu/prototypes-for [:staging :sharded+global+monitoring]
                :test    #nu/prototypes-for [:test :nu+mobile]}
 :expr ["avg(kafka_network_total_time_ms_fetch_consumer_95thpercentile) by (environment, prototype, stack_id) > " :threshold]
 :threshold 600
 :default-filter-labels [:squad]
 :for-minutes 10
 :alertmanager-labels {:severity "warning"}
 :annotations {:stack    "--"
               :instance "average for all instances"
               :value    ""}}

 

Conclusion

Nubank extensively uses AWS CloudFormation, and currently they have more than 3,000 stacks that manage thousands of cloud resources across multiple AWS Regions. Nubank builds AWS CloudFormation stacks to handle simple scenarios like provisioning single Amazon EC2 instances, but they also build stacks to handle the complex and interdependent infrastructure used to support Kafka and Kubernetes clusters.

The following diagram shows the AWS CloudFormation template for one of Nubank’s Kubernetes clusters:

Nubank’s engineering team is working to make Nimbus part of a fully automated process, thus eliminating the need for human interaction through a command-line interface (CLI). This level of automation allows new cloud resources to be provisioned as soon as a Nubank definition file is modified and exported to Amazon S3, and it can also automatically roll back changes if strange and potentially dangerous behavior is detected after deployment.

By leveraging AWS CloudFormation, Nubank can create deployment units of complex infrastructure resources while keeping the overall management complexity in check.

 

About the Authors

Bruno Halley Schaefer is a software engineer at Nubank. As a member of the platform team, he helps design automation tools to support Nubank in a hyper-growth scenario.

 

 

 

Hugo Carvalho is a senior solutions architect at AWS who specializes in helping startups at different maturity levels build sustainable and highly scalable platforms in the cloud. With more than seven years of experience in technology and tech team management, Hugo has helped many companies to define, implement, and evolve tech solutions for different market segments.

 

 

Marcelo Nunes is an AWS senior technical account manager, supporting Nubank since 2017. With more than 20 years of experience in technology and as a member of the AWS Enterprise Support Team since 2015, he has helped many companies on their AWS journey.

 

 

from AWS Management Tools Blog

How to Detect and Mitigate Guardrail Violation with AWS Control Tower

Many companies that I work with would like to innovate fast in the cloud by adopting a self-service infrastructure provisioning model in a multi-account environment. However, maintaining security and governance in such a model is an organizational challenge. Without structured guardrails and baseline configuration enforcement, troubleshooting and mitigating risk can be cumbersome. AWS Control Tower automates account provisioning with consistent baseline configuration and simplifies multi-account compliance governance with prescriptive blueprints and best practices.

In this blog post, I will show you how to create a child organizational unit (OU) and an account under AWS Control Tower management. Then, I will show you how to associate a guardrail with the OU, detect a guardrail violation, mitigate the issue, and analyze configuration and compliance history. I assume that you already have a working AWS Control Tower environment. You can configure AWS Control Tower by following Jeff Barr’s blog post here.

Background

AWS Control Tower sets up three baseline accounts (Master, Log archive, and Audit) that provide dedicated environments for specialized roles within your organization.

Here is a brief description of each baseline account:

  • Master account contains landing zone configuration, account configuration StackSets, AWS Organizations Service Control Policies (SCPs), and AWS Single Sign-On (SSO) to manage and create new child accounts.
  • Log archive account creates a central Amazon S3 bucket for storing a copy of all AWS CloudTrail and AWS Config log files. A replica of AWS CloudTrail logs is created locally in each child account for operational use.
  • Audit account is used to receive security and compliance notifications from Amazon CloudWatch Events, AWS CloudTrail, and AWS Config for all managed child accounts. AWS Control Tower uses Amazon Simple Notification Service (SNS) to send alerts to the unique email addresses that you provided for the Master and Audit accounts during initial setup.

Baseline accounts and member accounts are grouped into organizational units (OUs) for centralized account management.

The Master account is managed under Root OU while Log archive and Audit accounts are managed under Core OU. You can create new accounts under Custom OU or any additional child OUs to fit your organizational needs. The diagram below shows a high-level organizational structure within AWS Control Tower. I created an account called DevAccount1 under a child OU called DevOU.

AWS Control Tower provides guardrails based on prescriptive and pre-configured governance rules that can be enabled at the organizational units (OUs) level to enforce and detect compliance in a multi-account environment. To learn more, read the Guardrails in AWS Control Tower documentation.

Solution Overview

The following diagram outlines the process flow:

  1. Administrator creates a child OU and an end user in the Master account.
  2. End user creates an account under the child OU created in step 1 within AWS Control Tower management.
  3. Administrator enables a guardrail at the child OU.
  4. Administrator receives alert about guardrail violation.
  5. End user mitigates guardrail violation.
  6. End user analyzes the configuration and compliance history.

Step 1: Create a child OU and an end user

To create a child OU:

  1. Log on to the AWS Control Tower console as the AWS Control Tower administrator: https://console.aws.amazon.com/controltower
  2. Navigate to Organizational units in the left panel and select the Add an OU button.
  3. Specify a name for the child OU; call it DevOU.

To create an end user:

  1. While in the AWS Control Tower console as administrator, navigate to User and access, and make a note of the User portal URL.
  2. Select View in AWS Single Sign-On which opens up AWS SSO console. Select Manage your directory.
  3. Select Add user, fill out the user details including email address, name, and phone number. You can select whether to generate a one-time password that you can share with the user or send an email to the user with password setup instructions.
  4. In the next window, add the user to the AWSAccountFactory group, which provides the required permissions to launch a new account. Sign out from the AWS SSO console.

Step 2: Create an account with AWS Control Tower

To create an account:

  1. Open the User portal URL in a new browser window and sign in to the Master account as the end user. Select the Management console link next to AWSServiceCatalogEndUserAccess.
  2. Navigate to AWS Service Catalog console, select Product list, select Launch product from AWS Control Tower Account Factory.
  3. Enter a name for this product.
  4. Enter the end user email that you created earlier for SSOUserEmail and a unique AccountEmail. The owner of SSOUserEmail will have administrative access to the new account. AccountEmail must be unique and must not already be associated with an AWS account. Fill out the SSOUserFirstName and SSOUserLastName fields. Select DevOU as the managed OU and enter DevAccount1 in the AccountName field.
  5. Accept the default values in the next few prompts. Review the settings and select Launch. The new account provisioning process takes approximately 30 minutes.
  6. Once provisioning completes, you should see the new account (DevAccount1) in the Provisioned products list. Sign out from the AWS Service Catalog console. (A scripted way to launch the same Account Factory product is sketched after these steps.)
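
Because Account Factory is itself an AWS Service Catalog product, the same launch can be scripted. The following boto3 sketch is illustrative only: the product and provisioning artifact IDs are placeholders you would look up in your own account, and the parameter keys mirror the fields described in step 4 but may differ in your Account Factory version:

import boto3

sc = boto3.client("servicecatalog")

# Launch the AWS Control Tower Account Factory product to create DevAccount1.
sc.provision_product(
    ProductId="prod-xxxxxxxxxxxxx",             # Account Factory product ID (placeholder)
    ProvisioningArtifactId="pa-xxxxxxxxxxxxx",  # product version ID (placeholder)
    ProvisionedProductName="DevAccount1",
    ProvisioningParameters=[
        {"Key": "SSOUserEmail", "Value": "enduser@example.com"},
        {"Key": "SSOUserFirstName", "Value": "Dev"},
        {"Key": "SSOUserLastName", "Value": "User"},
        {"Key": "AccountEmail", "Value": "devaccount1@example.com"},
        {"Key": "AccountName", "Value": "DevAccount1"},
        {"Key": "ManagedOrganizationalUnit", "Value": "DevOU"},
    ],
)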

Step 3: Enable a guardrail at the child OU

In this section, I will enable a guardrail at DevOU to detect unrestricted inbound SSH access to TCP port 22 in all child accounts under this OU. As a security group best practice, restricting inbound SSH access to authorized IP addresses or CIDR ranges helps protect EC2 instances from external attackers who constantly scan for open ports. Please refer to this link for more information on how to authorize inbound traffic to your EC2 instances.

To enable a guardrail on an OU:

  1. Sign in to AWS Control Tower Master account as admin user.
  2. From the left panel, choose Guardrails, select Disallow internet connection through SSH.
  3. Choose Enable guardrail on OU button. Select DevOU and confirm the selection by choosing the Enable guardrail on OU.
  4. I can see a list of OUs with this guardrail enabled by navigating to Guardrails, selecting Disallow internet connection through SSH, and reviewing the Organizational units enabled pane.
  5. The Audit account receives emails about the enforcement of this guardrail in each Region where AWS Control Tower is currently available. At the time of this writing, AWS Control Tower is available in US East (Ohio), Europe (Ireland), US East (N. Virginia), and US West (Oregon).

For details on other guardrails, please refer to the documentation.

Step 4: Detect guardrail violation

Suppose an end user accidentally creates or modifies a security group to allow unrestricted inbound SSH access over port 22. The Audit account will be alerted of this compliance status change.

The owner of the email address associated with the Audit account will receive an email similar to the one below, which shows information about the noncompliant resource, such as the event timestamp, the account ID and Region where the resource resides, and the AWS Config rule (AWSControlTower_AWS-GR_RESTRICTED_SSH) that detected the compliance status transition.

Note that during the initial AWS Control Tower launch, an SNS subscription is created manually, one per supported Region, for the sample compliance email notification above. Any extended team, such as account administrators or security teams, can subscribe to the same SNS topic by following the steps below (a scripted equivalent is sketched after these steps):

  1. Sign in to the Audit account via user portal as admin user.
  2. Go to Amazon SNS console, select aws-controltower-AggregateSecurityNotifications.
  3. Select Create subscription.
  4. Choose the appropriate type of endpoint from the Protocol list. In this case, I select Email and provide a valid email address in the Endpoint field.
  5. The owner of the new subscription confirms it from an email similar to the one below in order to receive future compliance notifications.
  6. In the AWS Control Tower console, I get a centralized dashboard that shows the noncompliant resource and the compliance status of all child OUs and accounts.
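
If you prefer to script the subscription rather than click through the console, a minimal boto3 sketch follows. The account ID, Region, and email address are placeholders; the topic name is the one selected in step 2:

import boto3

sns = boto3.client("sns", region_name="us-east-1")  # repeat per supported Region

sns.subscribe(
    TopicArn="arn:aws:sns:us-east-1:111122223333:aws-controltower-AggregateSecurityNotifications",
    Protocol="email",
    Endpoint="security-team@example.com",
)
# The endpoint owner must still confirm the subscription from the email SNS sends.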

Step 5: Mitigate guardrail violation

To mitigate the noncompliant resource:

  1. Sign in to User portal URL as end user, select DevAccount1.
  2. Navigate to the EC2 Dashboard and find the security group that contains the noncompliant configuration.
  3. Under the Inbound tab, I select Edit to update the inbound rule and restrict SSH access to my IP address (a scripted equivalent is sketched after these steps).
  4. Approximately five minutes after remediating the guardrail violation, the Audit account receives an email confirming that the resource is now compliant. Here is a sample email about the compliance status change.
  5. In the AWS Control Tower Dashboard, I can validate that the overall compliance status of my multi-account environment is now back to green.
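
The same remediation can be scripted. Here is a hedged boto3 sketch that removes the open SSH rule and replaces it with a specific CIDR; the security group ID and replacement CIDR are placeholders:

import boto3

ec2 = boto3.client("ec2")

SG_ID = "sg-0123456789abcdef0"   # placeholder: the noncompliant security group

# Remove the unrestricted inbound SSH rule...
ec2.revoke_security_group_ingress(
    GroupId=SG_ID,
    IpPermissions=[{
        "IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }],
)

# ...and allow SSH only from an authorized CIDR range.
ec2.authorize_security_group_ingress(
    GroupId=SG_ID,
    IpPermissions=[{
        "IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
        "IpRanges": [{"CidrIp": "203.0.113.10/32", "Description": "Authorized admin IP"}],
    }],
)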

Step 6: Analyze configuration and compliance history

For post remediation verification:

  1. I navigate to AWS Config console in DevAccount1 as end user, select Rules, then AWSControlTower_AWS-GR_RESTRICTED_SSH.
  2. Select the appropriate filter from the Compliance status drop-down menu, then select the resource to explore its Configuration timeline or Compliance timeline (a scripted way to pull the same history is sketched after these steps).
  3. In Configuration timeline tab, I can review the changes and events for this resource.
  4. Select Changes to view the configuration that triggered the compliance status transition. 
  5. Scrolling down the page, I can also find links to AWS CloudTrail that record the full event details of each compliance transition for this resource.
  6. In Compliance Timeline tab, I can view the compliance status transition for this resource.
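
The same configuration history can also be pulled programmatically from AWS Config; a minimal boto3 sketch, with the security group ID as a placeholder:

import boto3

config = boto3.client("config")

# Retrieve recent configuration items for the security group that was remediated.
history = config.get_resource_config_history(
    resourceType="AWS::EC2::SecurityGroup",
    resourceId="sg-0123456789abcdef0",   # placeholder
    limit=5,
)
for item in history["configurationItems"]:
    print(item["configurationItemCaptureTime"], item["configurationItemStatus"])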

Conclusion

In this blog post, you saw how to create child OUs and accounts under AWS Control Tower management. I showed you how to associate guardrails managed by AWS Control Tower and detect when resources are out of compliance. I also showed you how to mitigate issues and analyze configuration and compliance history. Now you can enable self-service infrastructure provisioning with AWS Control Tower guardrails to accelerate innovation within your organization.

 

About the Author

Cher Simon is a Senior Technical Account Manager at AWS. She works with customers building solutions in the cloud.

 

from AWS Management Tools Blog

Automating account administration using AWS Systems Manager


This post focuses on one way Dedalus, an AWS Premier Consulting Partner based out of Brazil, maintains agility and control over their customer environments, by using AWS Systems Manager Automation to simplify everyday administration tasks and perform configuration management at scale on Amazon EC2 instances.

Dedalus is also a managed services provider with decades of systems integrator expertise and has been helping enterprise companies successfully navigate their cloud journeys. As a service provider, Dedalus continuously looks for ways to move the needle when it comes to innovation, operational excellence, and security. This was the case when the company was looking for an effective way to manage customer environments in the AWS Cloud.

There were various objectives Dedalus wanted to address as part of its cloud management strategy in AWS, including reducing privileged access to customer resources, robust monitoring and logging with automated notifications when issues arise, and the ability to automate account administration so engineers can focus on innovation that further benefits their customers.

Overview
To manage customer environments at scale, Dedalus has a repeatable and consistent configuration management process. When onboarding new customer accounts, it is important that all necessary components for the customer are installed safely using AWS Systems Manager Run Command executed from custom Automation documents, with monitoring in place to ensure successful progress and a recovery process available if needed.

Prior to automating administration tasks, operations staff had to maintain multiple scripts to administer each of their customer environments and needed administrator access to those environments because the scripts ran in the operating system runtime. These scripts were run manually, which meant staff spent time verifying the successful completion of configuration tasks and, if issues were encountered, performing time-consuming recovery procedures.

When moving the administration tasks to AWS Systems Manager Automation, privileges were restricted to Automation document execution, because the document can assume a role with the appropriate permissions to execute administration tasks. Additionally, prior to making any changes, an AMI of each affected instance is created so that the recovery process requires less effort.

By leveraging AWS Systems Manager Automation, which integrates with other AWS services, logs of all executions can be stored and archived in Amazon Simple Storage Service (Amazon S3) or streamed to Amazon CloudWatch Logs, which are used to send notifications of any execution failures. Auditing capabilities were also enhanced because access to the AWS service APIs is tracked, and the potential for human error was reduced because all administrative tasks are automated.

Solution
Dedalus uses multiple custom Systems Manager Automation documents, one targeted for each operating system platform that needs support, mainly for additional flexibility and less reliance on a single Automation Document:

  • Windows – Using PowerShell for configuration management.
  • Linux – Using a combination of Linux shell scripts and Ansible playbooks.

These Automation documents install and configure the components needed to maintain various customer specific software and required monitoring services, including the Amazon CloudWatch agent.
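
As an illustration of how such a document might be invoked at scale, here is a hedged boto3 sketch. The document name, tag key, and values are hypothetical placeholders, not Dedalus’s actual documents:

import boto3

ssm = boto3.client("ssm")

# Run a custom Automation document against every instance carrying a customer tag.
ssm.start_automation_execution(
    DocumentName="Dedalus-ConfigureLinuxInstance",   # hypothetical document name
    TargetParameterName="InstanceId",
    Targets=[{"Key": "tag:Customer", "Values": ["example-customer"]}],
    MaxConcurrency="10%",   # roll out gradually across the fleet
    MaxErrors="1",          # stop early if something goes wrong
)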

Additional benefits gained by using Systems Manager Automation include:

  • Managing the distribution and execution of the scripts efficiently, managing the code centrally, and having a secure and scalable deployment mechanism with appropriate logging levels.
  • Executing the commands across more than one thousand customer environments without the need to grant the operations team privileges to the underlying infrastructure.
  • Retaining execution logs for future auditing and dashboarding.
  • Being flexible to target customer environments in various ways, for example resource tags and resource groups.
  • Sending notifications for failed executions for quick troubleshooting.
  • Focusing on more complex engineering problems, as opposed to repetitive administrative tasks.

Summary
This solution is just one of the ways Dedalus innovates for its customers while maintaining an appropriate level of governance, repeatable best practices, and standardization across more than a thousand customer environments. As a long-standing AWS Premier Partner, Dedalus continues to use Systems Manager Automation to evolve its DevOps practice and enhance the level of service that it provides its customers.

To learn more about Running Automation Workflows in Multiple AWS Regions and Accounts, please visit AWS documentation or read the Centralized multi-account and multi-Region patching with AWS Systems Manager Automation blog for a practical example.

from AWS Management Tools Blog

Amazon S3 bucket compliance using AWS Config Auto Remediation feature


AWS Config keeps track of the configuration of your AWS resources and their relationships to your other resources. It can also evaluate those AWS resources for compliance. This service uses rules that can be configured to evaluate AWS resources against desired configurations.

For example, there are AWS Config rules that check whether your Amazon S3 buckets have logging enabled or whether your IAM users have an MFA device enabled. AWS Config rules use AWS Lambda functions to perform the compliance evaluations, and the Lambda functions return the compliance status of the evaluated resources as compliant or noncompliant. Noncompliant resources are remediated using the remediation action associated with the AWS Config rule. With the Auto Remediation feature of AWS Config rules, the remediation action can be executed automatically when a resource is found noncompliant.

Until now, remediation actions had to be executed manually for each noncompliant resource. This is not always feasible if you have many noncompliant resources for which you want to execute remediation actions. It can also pose risks if these resources remain without remediation for an extended amount of time.

In this post, you learn how to use the new AWS Config Auto Remediation feature on a noncompliant S3 bucket to ensure it is remediated automatically.

Overview

The AWS Config Auto Remediation feature automatically remediates non-compliant resources evaluated by AWS Config rules. You can associate remediation actions with AWS Config rules and choose to execute them automatically to address non-compliant resources without manual intervention.

You can:

  • Choose the remediation action you want to associate from a prepopulated list.
  • Create your own custom remediation actions using AWS Systems Manager Automation documents.

If a resource is still non-compliant after auto remediation, you can set the rule to try auto remediation again.

Solution

This post describes how to use the AWS Config Auto Remediation feature to auto remediate any non-compliant S3 buckets using the following AWS Config rules:

  • s3-bucket-logging-enabled
  • s3-bucket-server-side-encryption-enabled
  • s3-bucket-public-read-prohibited
  • s3-bucket-public-write-prohibited

These AWS Config rules act as controls that detect and remediate non-compliant S3 configurations.

Prerequisites

Make sure you have the following prerequisites before following the solution in this post:

  • You must have AWS Config enabled in your AWS account. For more information, see Getting Started with AWS Config.
  • The AutomationAssumeRole in the remediation action parameters should be assumable by SSM. The user must have pass-role permissions for that role when they create the remediation action in AWS Config, and that role must have whatever permissions the SSM document requires. For example, it may need “s3:PutEncryptionConfiguration” or something else specific to the API call that SSM uses.
  • (Optional): While setting up remediation action, if you want to pass the resource ID of non-compliant resources to the remediation action, choose Resource ID parameter. If selected, at runtime that parameter is substituted with the ID of the resource to be remediated. Each parameter has either a static value or a dynamic value. If you do not choose a specific resource ID parameter from the drop-down list, you can enter values for each key. If you choose a resource ID parameter from the drop-down list, you can enter values for all the other keys except the selected resource ID parameter.

Steps

Use the following steps to set up Auto Remediation for each of the four AWS Config rules.

To set up Auto Remediation for s3-bucket-logging-enabled

The “s3-bucket-logging-enabled” AWS Config rule checks whether logging is enabled for your S3 buckets. Use the following steps to auto-remediate an S3 bucket whose logging is not enabled:

  1. Sign in to the AWS Management Console and open the AWS Config console.
  2. On the left pane, choose Rules
  3. On the Rules page, under Rule name, select s3-bucket-logging-enabled and then choose Add rule to add it to the rule list. (If the rule already exists, select it from the rule list and then choose Edit.) There is one bucket named “tests3loggingnotenabled” that shows as a non-compliant resource under the “s3-bucket-logging-enabled” rule.
  4. Return to the Rules page and choose Edit.
  5. In the Choose remediation action section, from the Remediation action list, select AWS-ConfigureS3BucketLogging. (AWS-ConfigureS3BucketLogging is an AWS SSM Automation document that enables logging on an S3 bucket using SSM Automation.)
  6. In the Auto remediation section, select Yes to automatically remediate non-compliant resources.
  7. In the Parameters section, enter the values for the required parameters such as AutomationAssumeRole, Grantee details required to execute the remediation action, and the Target bucket to store logs.
  8. Choose Save. The “s3-bucket-logging-enabled” AWS Config rule can now auto-remediate non-compliant resources. A confirmation that it executed the remediation action shows in the Action status column. S3 bucket server access logging is now enabled automatically using the AWS Config Auto Remediation feature.

To set up Auto Remediation for s3-bucket-server-side-encryption-enabled

The “s3-bucket-server-side-encryption-enabled” AWS Config rule checks that your S3 bucket either has S3 default encryption enabled or that the S3 bucket policy explicitly denies put-object requests without server side encryption.

  1. Sign in to the AWS Management Console and open the AWS Config console 
  2. On the left pane, choose Rules
  3. On the Rules page, under Rule name, select s3-bucket-server-side-encryption-enabled and then choose Add rule to add it to the rule list. (If the rule already exists, select it from the rule list and then choose Edit.) There is one S3 bucket named “s3notencrypted” that is shown as a non-compliant resource under the “s3-bucket-server-side-encryption-enabled” rule.
  4. Return to the Rules page and choose Edit.
  5. In the Choose remediation action section, from the Remediation action list, select AWS-EnableS3BucketEncryption. (AWS-EnableS3BucketEncryption is an AWS SSM Automation document that enables server-side encryption on an S3 bucket using SSM Automation.)
  6. In the Auto remediation section, select Yes to automatically remediate non-compliant resources.
  7. In the Parameters section, enter the values for AutomationAssumeRole and the SSE algorithm required to execute the remediation action.
  8. Choose Save. The “s3-bucket-server-side-encryption-enabled” AWS Config rule can now auto-remediate non-compliant resources. A confirmation that it executed the remediation action shows in the Action status column. S3 bucket server-side encryption is now enabled automatically using the AWS Config Auto Remediation feature. (The same setup can also be scripted; see the sketch after these steps.)
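
The same remediation setup can be applied through the AWS Config API. The following boto3 sketch associates the AWS-EnableS3BucketEncryption document with the rule and turns on automatic execution; the role ARN is a placeholder, and the parameter names should be checked against the document version you use:

import boto3

config = boto3.client("config")

config.put_remediation_configurations(
    RemediationConfigurations=[{
        "ConfigRuleName": "s3-bucket-server-side-encryption-enabled",
        "TargetType": "SSM_DOCUMENT",
        "TargetId": "AWS-EnableS3BucketEncryption",
        "Automatic": True,                 # remediate without manual intervention
        "MaximumAutomaticAttempts": 5,
        "RetryAttemptSeconds": 60,
        "Parameters": {
            "AutomationAssumeRole": {"StaticValue": {"Values": ["arn:aws:iam::111122223333:role/S3RemediationRole"]}},
            "SSEAlgorithm": {"StaticValue": {"Values": ["AES256"]}},
            "BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},  # substituted with the noncompliant bucket
        },
    }]
)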

To set up auto remediation for s3-bucket-public-read-prohibited and s3-bucket-public-write-prohibited

An Amazon S3 bucket can be protected from public read and write access using the AWS Config rules “s3-bucket-public-read-prohibited” and “s3-bucket-public-write-prohibited”, respectively. Enable these AWS Config rules as described in the two scenarios above and enable the Auto Remediation feature with the existing SSM document remediation action “AWS-DisableS3BucketPublicReadWrite”. This remediation action disables an S3 bucket’s public write and read access via the Block Public Access settings.

 

Conclusion

In this post, you saw how to auto-remediate non-compliant S3 resources using the AWS Config auto remediation feature for AWS Config rules. You can also use this feature to maintain compliance of other AWS resources using existing SSM documents or custom SSM documents. For more details, see Remediating Non-compliant AWS Resources by AWS Config Rules.

For pricing details on AWS Config rules, visit the AWS Config pricing page.

 

About the Author

Harshitha Putta is an Associate Consultant with AWS Professional Services in Seattle, WA. She is passionate about building innovative solutions using AWS services to help customers achieve their business objectives. She enjoys spending time with family and friends, playing board games and hiking.

from AWS Management Tools Blog

How to Create an AWS Cross-Account Support Case Dashboard

At AWS, our customer obsession drives us to leave no stone unturned in helping our customers achieve success. Therefore, when a customer finds an interesting way to create valuable functionality using a combination of AWS services, we want to let our other customers know about it so they can also reap the benefits. A great example that we’d like to share is Snap, Inc. creating a cross-account support case dashboard using the AWS Support API, CloudTrail, Lambda, CloudFormation, and DynamoDB. This is a great example of what is possible with the Support API and other tools that AWS provides. The cross-account case dashboard allows Snap to see case details across multiple accounts in a single location.

AWS customers open Support cases in order to fix issues, get more information on AWS services and how they can fit into customer solutions, get architecture guidance as they plan applications, and more. For Business and Enterprise Support customers, an unlimited number of users can open an unlimited number of cases. As such, any given customer could have dozens or even hundreds of support cases open at any given time. And that customer may have opened thousands of support cases over the years.

For customers, it can be valuable to have a centralized view of all the cases opened across all of their users. This makes it easier to reference past cases that may be relevant to current questions or issues, to share learnings across cases with others in the organization, and to improve productivity. Snap found the cross-account insight especially useful as a single dashboard where developers can easily search and discover insights that can be shared across the organization.

 

For other customers interested in creating a centralized view of Support cases in a multi-account environment, here is how Snap created their own dashboard using a combination of AWS Support API, CloudTrail, Lambda, CloudFormation, IAM, and DynamoDB.

Of course, some initial assembly was required. In order to be able to describe support cases, Snap first created a “GetSupportInfoRole” IAM role within the account with the right permissions. They then granted the ability to assume this IAM role to a central “SupportAggregator” role. Snap also ensured that Enterprise Support was enabled in every account, which is a requirement for advanced support API calls.
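
Conceptually, the aggregation code assumes the per-account GetSupportInfoRole and then calls the Support API. The following boto3 sketch illustrates that flow; it is not Snap’s actual code, the account ID and session name are placeholders, and the Support API requires a Business or Enterprise Support plan and is served from us-east-1:

import boto3

sts = boto3.client("sts")

# Assume the per-account role that is allowed to read support cases.
creds = sts.assume_role(
    RoleArn="arn:aws:iam::111122223333:role/GetSupportInfoRole",  # placeholder account ID
    RoleSessionName="support-aggregator",
)["Credentials"]

support = boto3.client(
    "support",
    region_name="us-east-1",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)

# Pull open and recently resolved cases; a real pipeline would write these to DynamoDB.
cases = support.describe_cases(includeResolvedCases=True, maxResults=100)
for case in cases["cases"]:
    print(case["caseId"], case["status"], case["subject"])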

Once these prerequisites were fulfilled, Snap launched a CloudFormation stack to create the rest of the pipeline: a “GetSupportInfoRole” IAM role, a central S3 bucket, Lambda functions, SNS topics, and a DynamoDB table to store the results. Finally, to react to support case CloudTrail events, Snap created a multi-region CloudTrail trail in every account in their AWS Organization and had them all deposit their CloudTrail events into the previously created central S3 bucket. AWS CloudTrail organization trails make setting up these trails a simple, five-minute affair.

Snap has shared a sample CloudFormation template that would make this a repeatable process, especially when paired with AWS CloudTrail organization trails. The CloudFormation stack deploys two Lambda functions, which don’t rely on any Snap internal libraries. Check it out here.

Increasingly, customers that manage many accounts for their organization have requested features for multi-account management. AWS Support will be implementing these features for customers that make use of AWS Organizations in 2020. Is centralized management of multi-account support cases a challenge you face? Is this centralized dashboard something you think you will get value out of and build for yourself? If so, please comment in the comments section below or create a GitHub issue here.

About the Authors

Shrikant is a security engineer at Snap Inc on the Infrastructure Security team. He is passionate about cloud security monitoring, cross-cloud access patterns, and Kubernetes security. He presented at AWS re:Invent 2018 on “How Snap Accomplishes Centralized Security and Configuration Governance on AWS.” Shrikant can be reached here.

 

 

Roger is a software engineer at Snap Inc on the Infrastructure Security team. He is an AWS Certified Solutions Architect – Associate.
He enjoys building platforms for cloud resource monitoring and governance, as well as authentication services. Roger can be reached here.

 

from AWS Management Tools Blog

Enabling self-service provisioning of AWS resources with AWS Control Tower

Customers provision new accounts in AWS Control Tower whenever they are onboarding new business units or setting up application workloads. In some cases, organizations also want their cloud users, developers, and data scientists to deploy self-service, standardized, and secure patterns and architectures with the new account. Here are a few examples:

  • A developer or cloud engineer wants to launch an Amazon EC2 instance from a golden AMI.
  • A data scientist wishes to launch Amazon EMR clusters with approved AMIs and instance types.
  • A database administrator must launch an approved Amazon RDS database in a newly provisioned AWS account.

In this post, we show how you can use the Account Factory in AWS Control Tower to provision new AWS accounts. We also demonstrate how you can share custom products, such as a portfolio of RDS databases, to the new account with AWS Service Catalog. Additionally, we cover how you can use AWS Control Tower’s guardrails to enforce governance in your new account.

This solution uses the following AWS services:

Background

This post references the following concepts:

  1. AWS Control Tower offers customers a mechanism to easily provision new accounts in a secure and compliant environment, built according to AWS best practices.
  2. Customers can create new accounts using AWS Service Catalog through its UI interface or the CLI.
  3. AWS Control Tower guardrails are high-level rules providing ongoing governance for your AWS environment.
  4. An AWS Service Catalog product is an IT service you want to make available for deployment. You can create a product by importing an AWS CloudFormation template.  Portfolios are a collection of products, together with configuration information.
  5. Amazon RDS sets up, operates, and scales a relational database in the cloud.
  6. AWS Organizations allows you to govern access to AWS services, resources, and Regions.

Before getting started, it helps to understand why you need a new account. Here are some questions to consider:

  • Will the new account help with managing account limits for a large application?
  • Are you trying to maintain billing separation at the account level?
  • Is this a test, development, or production account?
  • Are you building a sandbox environment for your developers?
  • Which AWS Organizations business unit should the account fall under?

Solution overview

The following diagrams map out the solution architecture.

Figure 1. Create a new account in AWS Control Tower with policies in place.

Figure 2. Creating and sharing an AWS Service Catalog portfolio to the new account

 

Here we walk through the basic steps from setting up an AWS Organization unit and applying AWS Control Tower guardrails to developing and sharing a portfolio to a newly provisioned account.

Walkthrough

Create a new account in AWS Control Tower with policies in place:

Step 1: Add a new organizational unit.

Step 2: Apply the AWS Control Tower guardrails.

Step 3: Create a new account.

Create and share an AWS Service Catalog portfolio to the new account:

Step 1: Create and share a portfolio in the master account.

Step 2: Generate the local portfolio in the spoke account.

Step 3: Launch the product from the spoke account.

Step 4: Validate guardrail detection.

Prerequisites

This post assumes that you have already set up an AWS Control Tower environment.

Creating a new account in AWS Control Tower

For this post, we create a new account under the Data Analytics organizational unit.

Step 1: Add a new organizational unit

In the AWS Control Tower console, choose Organizational units, Add an OU.

Once the OU is created, select the OU and make a note of the organizational unit ID. You need it for the CloudFormation script used in the Creating and sharing an AWS Service Catalog portfolio to the new account section.

Step 2: Apply the AWS Control Tower guardrails

The AWS Control Tower data security guardrails provide ongoing detective governance to any account under this OU. AWS highly recommends that you apply all of these guardrails to all accounts.

Data security guardrails do the following:

  • Disallow public access to RDS database instances.
  • Disallow public access to RDS database snapshots.
  • Disallow RDS database instances that are not storage-encrypted.

In the left navigation pane, choose Guardrails. Select Disallow RDS database instances that are not storage encrypted. On the Enable guardrail on OU page, enable the guardrail for the Data Analytics OU.

Step 3: Create a new account

The next step is to provision the new account using Account Factory. Account Factory is an AWS Service Catalog product created during the setup of AWS Control Tower. For more information, see Configuring and Provisioning Accounts Through AWS Service Catalog.

The new account creation process typically takes 30–60 minutes to finish.

After completing account creation, confirm that the account is under the Data Analytics unit.

Creating and sharing an AWS Service Catalog portfolio to the new account

Here, we create a portfolio of self-service products in the master account, which is then shared with the newly created account. Feel free to substitute your own AWS Service Catalog products.

Step 1: Create and share portfolio in the master account

The template for the products used in this post is configure_ct_portfolio.yaml.

This CloudFormation template shares the AWS Service Catalog portfolio with the Data Analytics OU that you created earlier. Follow these steps:

  1. Use the following button to launch the AWS CloudFormation stack.
    Launch Stack
  2. Choose Next.
  3. For Organization Unit to Share, enter the organizational unit ID that you noted earlier.
  4. Choose Next, then Next again.
  5. Select I acknowledge that AWS CloudFormation might create IAM resources.
  6. Choose Create.

Make a note of the output values of the stack. You enter these as input parameters in the next step.

Output values of the stack

This creates a Service Catalog portfolio in the AWS Control Tower master account, which contains the products listed earlier. It also shares the portfolio with all the AWS accounts in the OU that you specified.
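
For reference, the sharing that this template performs corresponds roughly to a single Service Catalog API call. A hedged boto3 sketch follows; the portfolio ID and OU ID are placeholders:

import boto3

sc = boto3.client("servicecatalog")

# Share the master account's portfolio with every account in the Data Analytics OU.
sc.create_portfolio_share(
    PortfolioId="port-xxxxxxxxxxxxx",                                               # placeholder
    OrganizationNode={"Type": "ORGANIZATIONAL_UNIT", "Value": "ou-xxxx-xxxxxxxx"},  # placeholder
)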

AWS Service Catalog portfolio on master account

Step 2: Generate a local portfolio in the spoke account

The second step is to run  an AWS CloudFormation stack set from the master account and deploy it on the portfolio organizational unit that you shared from the master account. You will enter the output values that you noted from the previous step as the parameter fields for this AWS CloudFormation stack set.

This stack generates a local portfolio, adds the self-service products from the organization, and adds appropriate launch constraints and tags.

You can also choose to add template constraints to provide additional preventive measures, including limiting launches to only certain instance types or database versions.

  1. Use the following button to launch the AWS CloudFormation stack set.
    Launch Stack
  2. Choose Next.
  3. For the Master Portfolio parameter, enter the master portfolio ID that you noted earlier.
  4. Choose Next, then Next again.
  5. Select I acknowledge that AWS CloudFormation might create IAM resources.
  6. Choose Create.

Step 3: Launch product from the spoke account

Configure AWS Single Sign-On (AWS SSO) to allow the end user to access the spoke accounts with the required permissions. The AWS SSO user should have, at minimum, AWS Service Catalog end-user permissions. Grant the user access to the local portfolio of the spoke account. After you complete the CloudFormation deployment in the new account, end users can launch RDS products.

List of available products in the spoke account

Step 4: Validate guardrail detection

The detective RDS guardrails now govern the database resources in the new AWS account. If users launch an RDS instance with no encryption, the AWS Control Tower guardrail detects the non-compliant resource. AWS Control Tower’s dashboard will provide visibility across the multi-account environment, highlighting the accounts with a non-compliant status.

An example of a non-compliant database resource from the AWS Control Tower dashboard

Conclusion

This post guided you through creating a new account with AWS Control Tower and applying guardrails to it. We also showed you how to share a standardized portfolio of AWS Service Catalog products with the new account. Finally, we showcased how AWS Control Tower guardrails can quickly detect non-compliant resources.

We welcome your feedback. Please let us know if you have any comments or questions on this.

Further Reading

About the authors

 

Nivas Durairaj is a senior business development manager for AWS Service Catalog and AWS Control Tower. He enjoys guiding and helping customers on their cloud journeys. Outside of work, Nivas likes playing tennis, hiking, doing yoga and traveling around the world.

 

 

 

Kishore Vinjam is a partner solutions architect focusing on AWS Service Catalog, AWS Control Tower, and AWS Marketplace. He is passionate about working on cloud technologies, working with customers, and building solutions. When not working, Kishore likes to spend time with family, hike, and play volleyball and ping-pong.

 

 

from AWS Management Tools Blog

How to self-service manage AWS Auto Scaling groups and Amazon Redshift with AWS Service Catalog Service Actions

Some of the customers I work with provide AWS Service Catalog products to their end-users to enable self-service for launching and managing Amazon Redshift, EMR clusters, or web applications at scale using AWS Auto Scaling groups. These end-users would like the ability to self-manage these resources, for example, to take a snapshot of an instance or data warehouse. With AWS Service Catalog, end-users can launch data warehouse products using Redshift, a web farm using EC2, or a Hadoop instance using EMR.

In this blog post, I will show you how to enable your end-users by creating self-service actions using AWS Service Catalog Service Actions with AWS Systems Manager. You will also learn how to use the Service Actions feature to manage these products, for example, how to start or stop EC2 instances running under an auto scaling group and how to back up EC2 and Redshift.

This solution uses the following AWS services. Most of the resources are set up for you with an AWS CloudFormation stack:

Background

Here are some of AWS Service Catalog concepts referenced in this post. For more information, see Overview of AWS Service Catalog.

  • A product is a blueprint for building the AWS resources to make available for deployment on AWS, along with the configuration information. Create a product by importing an AWS CloudFormation template or, in the case of AWS Marketplace-based products, by copying the product to AWS Service Catalog. A product can belong to multiple portfolios.
  • A portfolio is a collection of products, together with the configuration information. Use portfolios to manage user access to specific products. You can grant portfolio access at the AWS Identity and Access Management (IAM) user, IAM group, or IAM role level.
  • A provisioned product is an AWS CloudFormation stack; that is, the AWS resources that are created. When an end-user launches a product, AWS Service Catalog provisions the product from an AWS CloudFormation stack.
  • Constraints control the way users can deploy a product. With launch constraints, you can specify a role that the AWS Service Catalog can assume to launch a product.

Solution overview

The following diagram maps out the solution architecture.

 

Here’s the process for the administrator:

  1. The administrator creates an AWS CloudFormation template for an auto scaling group.
  2. The administrator then creates an AWS Service Catalog product based on the CloudFormation template.
  3. AWS Systems Manager is then used to create an SSM automation document that will manage the EC2 instances under an auto scaling group and a Redshift cluster. An AWS Service Catalog self-service action is then created based on the automation documents and attached to the AWS Service Catalog auto scaling group and Redshift products.

Here’s the process when the end-user launches the auto scaling group product:

  1. The end-user selects and launches an AWS Service Catalog auto scaling group or the Redshift product.
  2. The end-user uses the AWS Service Catalog console to select the auto scaling group or the Redshift product, then chooses the self-service action to stop or start the EC2 instances or create a snapshot of Redshift.
  3. Behind the scenes, invisible to the end-user, the SSM automation document stops or starts the EC2 instances or takes a snapshot of Redshift.

Step 1: Configuring an environment

To get the setup material:

  1. Download the sc_ssm_autoscale.zip file with the configuration content.
  2. Unzip the contents and save them to a folder. Note the folder’s location.

Create your AWS Service Catalog auto scaling group and Redshift products:

  1. Log in to your AWS account as an administrator. Ensure that you have an AdministratorAccess IAM policy attached to your login because you’re going to create AWS resources.
  2. In the Amazon S3 console, create a bucket. Leave the default values except as noted.
    •  Bucket name – scssmblog-<accountNumber>. (No dashes in the account number, e.g., scssmblog-999999902040.)

To upload content to the new bucket:

  1. Select your bucket, and choose Upload, Add files.
  2. Navigate to the folder that contains the configuration content. Select all the files and choose Open. Leave the default values except as noted.
  3. After the Review page, from the list of files, select the sc_setup_ssm_autoscale.json file.
  4. Right-click the link under Object URL and choose Copy link address.

To launch the configuration stack:

  1. In the AWS CloudFormation console, choose Create Stack, Amazon S3 URL, paste the URL you just copied, and then choose Next.
  2. On the Specify stack details page, specify the following:
    1. Stack name: scssmblogSetup
    2. S3Bucket: scssmblog-<accountNumber>
    3. SCEndUser: The current user name
  3. Leave the default values except as noted.
  4. On the Review page, check the box next to I acknowledge that AWS CloudFormation might create IAM resources with custom names, and choose Create.
  5. After the status of the stack changes to CREATE_COMPLETE, select the stack and choose Outputs to see the output.

Find the ServiceCatalog entry and choose the URL to the right.

Congratulations! You have completed the setup.

Step 2: Creating the AWS SSM automation document

You will repeat these steps for the Redshift snapshot document.

  1. Open the file ssmasg_stop.json for the ASG action, or redshift_snapshot.json for the Redshift document, which you downloaded in the previous step.
  2. Copy the contents.
  3. Log into the AWS Systems Manager console as an admin user.
  4. Choose Documents from the menu at the bottom left.
  5. Choose Create document:
    • Name – SCAutoScalingEC2stop for ASG, or SCSnapshotstop for Redshift
    • Target Type
      • /AWS::AutoScaling::AutoScalingGroup for ASG
      • /AWS::Redshift::Cluster   for Redshift
    • Document type – Automation document
    • JSON – paste the content you copied from step 2
    • Choose Create document

You will see a green banner saying your document was successfully created.

 

Step 3: Create an AWS Service Catalog self-service action

  1. Log into the AWS Service Catalog console as an admin user.
  2. On the left navigation pane, choose Service actions.
  3. Choose Create new action.
  4. On the Define page choose Custom documents.
  5. Choose the document you just created for ASG.
  6. Choose Next.
  7. On the Configure page, leave the default values.
  8. Choose Create action.

You will see a banner saying the action has been created and is now ready to use.
Repeat for the Redshift product.
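
Steps 3 and 4 can also be performed through the AWS Service Catalog API. The following is a minimal boto3 sketch, assuming the document created in Step 2; the product IDs are placeholders and the parameter name is hypothetical (it should match your Automation document’s parameter):

import boto3

sc = boto3.client("servicecatalog")

# Create a self-service action backed by the SSM Automation document from Step 2.
action = sc.create_service_action(
    Name="SCAutoScalingEC2stop",
    DefinitionType="SSM_AUTOMATION",
    Definition={
        "Name": "SCAutoScalingEC2stop",   # the Automation document name
        "Version": "1",
        "AssumeRole": "LAUNCH_ROLE",      # use the product's launch role
        "Parameters": "[{\"Name\": \"AutoScalingGroupName\", \"Type\": \"TARGET\"}]",  # hypothetical parameter
    },
)

# Associate the action with a specific product version (placeholders).
sc.associate_service_action_with_provisioning_artifact(
    ProductId="prod-xxxxxxxxxxxxx",
    ProvisioningArtifactId="pa-xxxxxxxxxxxxx",
    ServiceActionId=action["ServiceActionDetail"]["ServiceActionSummary"]["Id"],
)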

Step 4: Associate action to the product

  1. On the Service actions page, choose the action you created.
  2. Choose Associate action.
  3. Choose the AutoScaling product.
  4. Choose the Version.
  5. Choose Associate action.

Repeat for the Redshift product.

Congratulations! Your new service action has been associated with the product. The next step is to deploy the AutoScaling and Redshift products and use the new self-service action.

Step 5: Launching the AWS Service Catalog product

Redshift

  1. Log into the AWS Service Catalog console as an admin or end-user.
  2. On the left navigation pane on top, choose Products list.
  3. Choose the Redshift product.
  4. Choose LAUNCH PRODUCT.
  5. Enter a name – myredshift
  6. Choose Next.
  7. On the Parameters page:
    • DBName – mydb001
    • MasterUserPassword – enter a password
  8. Choose Next.
  9. On the TagOptions page choose Next.
  10. On the Notifications page choose Next.
  11. On the Review page choose Launch.

Auto Scaling Group

  1. Log into the AWS Service Catalog console as an admin or end-user.
  2. On the left navigation pane on top, choose Products list.
  3. Choose the AutoScaling product.
  4. Choose LAUNCH PRODUCT.
  5. Enter a name – myscacg
  6. Choose Next.
  7. On the Parameters page:
    • Serverpostfix – default
    • Imageid – enter an Amazon Linux AMI ID for your current Region
  8. Choose Next.
  9. On the TagOptions page choose Next.
  10. On the Notifications page choose Next.
  11. On the Review page choose Launch.

Wait for the status to change to Completed.

 

Step 6: Executing the self-service action

Auto Scaling Group

  1. Choose Actions.
  2. Choose the self-service action you created SCAutoScalingEC2stop.
  3. Choose RUN ACTION to confirm.

Redshift

  1. Choose Actions.
  2. Choose the self-service action you created SCSnapshotstop.
  3. Choose RUN ACTION to confirm.

Congratulations, you have successfully executed the new self-service action.

 

Cleanup process

To avoid incurring costs, delete resources that are no longer needed. You can terminate a deployed Service Catalog product by selecting Actions, then Terminate.

 

Conclusion
In this post, you learned an easy way to back up Redshift databases and to manage EC2 instances in an auto scaling group. You also saw how there’s an extra layer of governance and control when you use AWS Service Catalog to deploy resources to support business objectives.

About the Author

Kenneth Walsh is a New York-based Solutions Architect focusing on AWS Marketplace. Kenneth is passionate about cloud computing and loves being a trusted advisor for his customers. When he’s not working with customers on their journey to the cloud, he enjoys cooking, audio books, movies, and spending time with his family and dog.

from AWS Management Tools Blog

Introducing Amazon CloudWatch Container Insights for Amazon ECS

Amazon Elastic Container Service (Amazon ECS) lets you monitor resources using Amazon CloudWatch, a service that provides metrics for CPU and memory reservation and for cluster and service utilization. In the past, you had to enable custom monitoring of services and tasks. Now, you can monitor, troubleshoot, and set alarms for all your Amazon ECS resources using CloudWatch Container Insights. This fully managed service collects, aggregates, and summarizes Amazon ECS metrics and logs.

The CloudWatch Container Insights dashboard gives you access to the following information:

  • CPU and memory utilization
  • Task and service counts
  • Read/write storage
  • Network Rx/Tx
  • Container instance counts for clusters, services, and tasks
  • and more

Direct access to these metrics offers you much fuller insight into and control over your Amazon ECS resources.

With CloudWatch Container Insights, you can:

  • Gain access to CloudWatch Container Insights dashboard metrics
  • Integrate with CloudWatch Logs Insights to dynamically query and analyze container application and performance logs
  • Create CloudWatch alarm notifications to track performance and potential issues
  • Enable Container Insights with one click, with no additional configuration or sidecars needed to monitor your tasks

CloudWatch Container Insights can also support Amazon Elastic Kubernetes Service (Amazon EKS).

Overview

In this post, I guide you through Container Insights setup and introduce you to the Container Insights dashboard. I demonstrate the Amazon ECS metrics available through this console. I show you how to query Container Insights performance logs to obtain specific metric data. Finally, I walk you through the default, automatically generated dashboard, explaining how to right-size tasks and scale services.

Enable Container Insights on new clusters by default

First, configure your Amazon ECS service to enable Container Insights by default for clusters created with your current IAM user or role.

  1. Open the Amazon ECS console.
  2. In the navigation pane, choose Account Settings.
  3. To enable the Container Insights default opt-in, check the box at the bottom of the page. If this setting is not enabled, Container Insights can be enabled later when creating a cluster.
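If you prefer the CLI, the following sketch applies the same opt-in for the current IAM identity; the per-cluster variant is also shown, with an example cluster name.

# Opt in to Container Insights for clusters created by the current IAM user or role
aws ecs put-account-setting --name containerInsights --value enabled

# Or enable it for a single cluster at creation time (cluster name is an example)
aws ecs create-cluster --cluster-name fargate-demo --settings name=containerInsights,value=enabled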

Next, follow the Amazon ECS Workshop for AWS Fargate instructions to create an Amazon ECS cluster with Container Insights enabled. You are creating a cluster with a web frontend and multiple backend services.

When completed, access the frontend application using the URL of the inbound Application Load Balancer, as in the following diagram.

You can also access it by using the output of the following command:

alb_url=$(aws cloudformation describe-stacks --stack-name fargate-demo-alb --query 'Stacks[0].Outputs[?OutputKey==`ExternalUrl`].OutputValue' --output text)
echo "Open $alb_url in your browser"

Explore CloudWatch Container Insights

After the cluster is running and the tasks have started, confirm that Container Insights is enabled and that CloudWatch is collecting Amazon ECS metrics.

To view the newly collected metrics, navigate to the CloudWatch console and choose Container Insights. To view your automatic dashboard, select an ECS Resource dimension. The available Amazon ECS options are ECS clusters, ECS services, and ECS tasks. Choose ECS Clusters.

The dashboard should then display the available metrics for your cluster. These metrics should include CPU, memory, tasks, services, and network utilization.

In the following dashboard example, the tasks are running on Fargate and are required to use AWSVPC networking mode. Container Insights doesn’t currently support AWSVPC networking mode for network metrics in this first release. You can see in the following graph that these metrics are omitted. This cluster was set up to support only Fargate tasks, and the container instance count is equal to zero.

At the bottom of the page, select an ECS cluster. Choose Actions, View performance logs. This selection leads to CloudWatch Logs Insights, where you can quickly and effectively query CloudWatch Logs data metrics. CloudWatch Logs Insights includes a purpose-built query language with simple but powerful commands. For more information, see Analyzing Log Data with CloudWatch Logs Insights.

Container Insights provides a default query that can be executed by choosing Run query. This query reports all of the metrics that CloudWatch collects from the cluster, minute by minute. You can expand each data item to investigate individual metrics.
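You can run a similar query from the CLI. The following is a sketch only: it assumes that $clustername holds your cluster name (it is exported in the scaling commands later in this post) and that the performance log group follows the /aws/ecs/containerinsights/<cluster>/performance naming convention.

# Query the last 30 minutes of Container Insights performance logs (log group name assumed)
end=$(date +%s)
start=$((end - 1800))
query_id=$(aws logs start-query \
    --log-group-name "/aws/ecs/containerinsights/${clustername}/performance" \
    --start-time "$start" --end-time "$end" \
    --query-string 'filter Type = "Task" | stats avg(CpuUtilized), avg(MemoryUtilized) by TaskDefinitionFamily' \
    --query 'queryId' --output text)
# Give the query a few seconds to complete before fetching results
sleep 5
aws logs get-query-results --query-id "$query_id"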

With CloudWatch Container Insights, you can monitor an ECS cluster from a unified view, quickly identifying and responding to any operational issues.

Explore use cases

In this section, I explore how to use CloudWatch and Container Insights to manage your cluster. Start by generating traffic to your cluster. Create one web request per second to your frontend URL:

alb_url=$(aws cloudformation describe-stacks --stack-name fargate-demo-alb --query 'Stacks[0].Outputs[?OutputKey==`ExternalUrl`].OutputValue' --output text)
while true; do curl -so /dev/null "$alb_url"; sleep 1; done &

In the CloudWatch Insights dashboard, choose ECS Services, and select your cluster. As the following dashboard screenshot shows, CPU and memory utilization are minimal, and the frontend service can handle this load.

Next, simulate high traffic to the cluster using ApacheBench. The following ApacheBench command generates nine concurrent requests (-c 9), for 60 seconds (-t 60), and ignores length variances (-l). Loop this repeatedly every second.

while true; do ab -l -c 9 -t 60 $alb_url ; sleep 1; done

Task CPU increases significantly while Memory Utilization remains low.

Adjust the dashboard time range to see the average CPU and memory utilization for each task in the past 30 minutes, as shown in the following screenshot.

To see individual resource utilization average over 30 minutes, scroll to the bottom of the dashboard.

Select any task and choose View performance logs. This selection opens CloudWatch Logs Insights for the ECS cluster. In the query box, enter the following query and choose Run query:

stats avg(CpuUtilized), avg(MemoryUtilized) by bin (30m) as period, TaskDefinitionFamily, TaskDefinitionRevision 
| filter Type = "Task" | sort period desc, TaskDefinitionFamily

From the query result, the frontend service has an average CPU utilization of 229.2273 CPU units and an average memory utilization of 71 MB. The current frontend task configuration reserves 512 MiB of memory and 256 CPU units, so the frontend service has used almost all of the CPU that it reserved.

To improve the frontend service performance based on the CPU metrics, one option is to resize the task. First, create a new revision of the task definition with new CPU and memory values. Then, update the frontend service to use the new revision. A CLI sketch follows each set of console steps below.

Step 1: Create a new task definition revision.

  1. In the ECS console, choose Task Definitions.
  2. For ecsdemo-frontend, select the check box and choose Create new revision.
  3. On the Create new revision of Task Definition screen, under Container Definitions, choose ecsdemo-frontend.
  4. For Memory Limits (MiB), choose Soft limit and enter the value of 1024.
  5. Under Environment, for CPU units, enter 512.
  6. Choose Update.
  7. On the Create new revision of Task Definition screen, under Task size, choose 0.5 vCPU, which is 512 CPU units.
  8. For Memory, choose 1GB.
  9. Choose Create.
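The console steps above can also be approximated from the CLI. The following is a rough sketch, not the walkthrough's exact procedure: it only bumps the task-level size (the container-level limits from steps 3 through 6 would be edited inside containerDefinitions), and it assumes jq is installed.

# Sketch: copy the latest ecsdemo-frontend task definition, raise the task-level CPU/memory,
# and register the result as a new revision
aws ecs describe-task-definition --task-definition ecsdemo-frontend \
    --query 'taskDefinition' > /tmp/taskdef.json

jq '.cpu = "512" | .memory = "1024"
    | del(.taskDefinitionArn, .revision, .status, .requiresAttributes,
          .compatibilities, .registeredAt, .registeredBy)' \
    /tmp/taskdef.json > /tmp/taskdef-new.json

aws ecs register-task-definition --cli-input-json file:///tmp/taskdef-new.json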

Step 2: Update the frontend service with the new task definition.

  1. Choose Cluster and select the cluster.
  2. On the Services tab, for ecsdemo-frontend, select the check box and choose Update.
  3. For Task Definition, select the revision that you previously created in the first step.
  4. Choose Skip to review then Update Service.
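Step 2 maps to a single CLI call. This sketch assumes $clustername is set as shown in the scaling commands later in this post; omitting the revision number points the service at the latest active revision.

# Update the service to the newest revision of the ecsdemo-frontend task definition
aws ecs update-service \
    --cluster "$clustername" \
    --service ecsdemo-frontend \
    --task-definition ecsdemo-frontend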

ECS spins up the new tasks with this new revision of the task definition and removes the old ones.

As shown in the dashboard, the average CPU utilization percentage remains high. The current load still stresses the CPU. On the positive side, the load balancer for the frontend service can handle significantly more requests, increasing from 25k to 57k requests over 5 minutes.

The benchmarking result from ApacheBench shows the same evidence. From the client perspective, the frontend service is able to process more than twice as many requests. By increasing the CPU available to your task, you increase the frontend service's ability to handle the load. Remember that the frontend service consists of three tasks and CPU usage remains high.

To continue to improve the frontend service performance based on the CPU metrics, increase the task size or scale out the service. With the current load, the average RequestCountPerTarget value is around 8k per 5-minute interval. Update the frontend service to automatically scale tasks and keep RequestCountPerTarget close to 1000 requests per target.

Run the following command to update the frontend service, setting up Service Auto Scaling with a maximum of 25 tasks and a minimum of 3 tasks. For the scaling policy, use a target tracking scaling policy with a target value of 1000 for RequestCountPerTarget.

cd ~/environment/fargate-demo
export clustername=$(aws cloudformation describe-stacks --stack-name fargate-demo --query 'Stacks[0].Outputs[?OutputKey==`ClusterName`].OutputValue' --output text)
export alb_arn=$(aws cloudformation describe-stack-resources --stack-name fargate-demo-alb | jq -r '.[][] | select(.ResourceType=="AWS::ElasticLoadBalancingV2::LoadBalancer").PhysicalResourceId')
export target_group_arn=$(aws cloudformation describe-stack-resources --stack-name fargate-demo-alb | jq -r '.[][] | select(.ResourceType=="AWS::ElasticLoadBalancingV2::TargetGroup").PhysicalResourceId')
export target_group_label=$(echo $target_group_arn |grep -o 'targetgroup.*')
export alb_label=$(echo $alb_arn |grep -o 'farga-Publi.*')
export resource_label="app/$alb_label/$target_group_label"

aws application-autoscaling register-scalable-target \
    --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id service/${clustername}/ecsdemo-frontend \
    --min-capacity 3 \
    --max-capacity 25
 
envsubst <config.json.template >/tmp/config.json
 
aws application-autoscaling put-scaling-policy --cli-input-json file:///tmp/config.json
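The put-scaling-policy call reads /tmp/config.json, which envsubst renders from config.json.template in the workshop repository. The template itself is not shown in this post; if you need to recreate it, a minimal version consistent with the target tracking policy described above might look like the following sketch (create it before running envsubst).

# Assumed template contents; the real workshop file may differ
cat > config.json.template <<'EOF'
{
  "PolicyName": "frontend-request-count-per-target",
  "ServiceNamespace": "ecs",
  "ResourceId": "service/${clustername}/ecsdemo-frontend",
  "ScalableDimension": "ecs:service:DesiredCount",
  "PolicyType": "TargetTrackingScaling",
  "TargetTrackingScalingPolicyConfiguration": {
    "TargetValue": 1000.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ALBRequestCountPerTarget",
      "ResourceLabel": "${resource_label}"
    }
  }
}
EOF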

Now the frontend service starts to scale. CloudWatch Container Insights displays the number of tasks in each step of scaling.

 

From the request metric of your frontend service's load balancer, you can see that your cluster can handle even more requests: approximately 153k requests, or 6.1k requests per target, over 5 minutes. If you had not set a maximum of 25 tasks in the Auto Scaling policy, the frontend service would have scaled out even further, because the threshold is 1000 requests per target.

You see the same evidence from ApacheBench: the frontend service is able to process more requests and the time per request is much smaller.

From the query result in CloudWatch Logs Insights, the average CPU utilization is 158 out of the 512 CPU units configured earlier, which is relatively low.

You can use these metrics from Container Insights to help fine-tune your cluster. Just as you spotted a task that was struggling, you could spot an oversized task with the same techniques and reduce its configuration, saving money in turn.

To see the frontend service scale in, stop the load by pressing Ctrl+C twice to cancel the ab loop. For the curl loop, type fg to bring the process to the foreground, and then press Ctrl+C. Within a couple of minutes, the frontend service starts to scale in.

Conclusion

In the past, you had to implement custom metrics. Now, CloudWatch Container Insights for Amazon ECS helps you focus on monitoring and managing your application, so that you can respond quickly to operational issues. The service provides sharper insight into your Amazon ECS clusters, services, and tasks through added CPU and memory metrics. You can also use CloudWatch Logs Insights to query performance log metrics for right-sizing, alarms, and scaling decisions that drive more informed analysis.

In this post, I introduced these new CloudWatch container metrics. I walked you through the default, automatically generated dashboard, showing you how to use the CloudWatch Container Insights console to right-size tasks and scale services. I dived deep into a performance log event provided by CloudWatch Logs Insights. I showed you how to use query language to find a specific metric’s value and choose the best value for right-sizing purposes.

CloudWatch Container Insights is generally available for Amazon ECS, AWS Fargate, Amazon EKS, and Kubernetes. For more information, see the documentation on Using Container Insights. To provide feedback or subscribe to email updates for this feature, email us at [email protected].

 

About the Author

Sirirat Kongdee is a Sr. Solutions Architect at Amazon Web Services. She loves working with customers and helping them remove roadblocks from their cloud journey. She enjoys traveling (whether for work or not) as much as she enjoys hanging out with her pug in front of the TV.

 

from AWS Management Tools Blog

Managing Amazon WorkSpaces by integrating AWS Service Catalog with ServiceNow

Managing Amazon WorkSpaces by integrating AWS Service Catalog with ServiceNow

As enterprises adopt Amazon WorkSpaces as their virtual desktop solution, there is a need to implement an ITSM-based self-service offering for provisioning and operations.

In this post, you will learn how to integrate AWS Service Catalog with ServiceNow so users can request their own WorkSpace instances inclusive of all business-level approvals and auditing. You will then see how to use Self-Service Actions to add operations functions directly from ServiceNow to allow users to reboot, terminate, repair, or upgrade their WorkSpaces.

Overview

AWS Service Catalog allows you to manage commonly deployed AWS services and provisioned software products centrally. This service helps your organization achieve consistent governance and compliance requirements, while enabling users to deploy only the approved AWS services they need.

ServiceNow is an enterprise service-management platform that places a service-oriented lens on the activities, tasks, and processes needed for a modern work environment. Its service catalog is a self-service application through which end users can order IT services based on request-fulfillment approvals and workflows, enabling you to approve a specific request within ServiceNow (for example, a request for a WorkSpace to be provisioned).

Solution

This solution shows how AWS Service Catalog can be used to enable a self-service lifecycle-management offering for Amazon WorkSpaces from within ServiceNow. Using this solution:

  • Users can provision, upgrade, and terminate their WorkSpace instance from within the ServiceNow portal.
    • At the request stage, users can select the instance size, type, and configuration parameters when creating their order in the AWS Service Catalog.
    • After the instance is created, the user can follow the same process to request service actions such as reboot, terminate, rebuild, or upgrade.
  • ServiceNow admins can determine (based on IAM roles) which Amazon WorkSpaces software bundle each group of users receives by default.

The arrows in the following diagram depict the API flow between the services when users access Amazon WorkSpaces via ServiceNow and AWS Service Catalog.

 

Prerequisites

To get started, do the following:

  1. Install and configure the AWS Service Catalog connector for ServiceNow.
  2. Add an Amazon WorkSpaces product.

After installing the prerequisites, you have an AWS Service Catalog-provisioned product. Now, you can access the following Create WorkSpace Instance page to provision, upgrade, and terminate WorkSpace instances within ServiceNow.

 

 

Adding AWS Service Catalog operational actions

Next, you will add AWS Service Catalog self-service actions, enabling you to run an AWS API call or command on the WorkSpace instance, for example:

  • Install a software package.
  • Reboot a WorkSpace instance.
  • Change performance modes.
  • Repair a WorkSpace instance.

For each service action that you want to create, you need to add an AWS Systems Manager Automation document. In this example, you create an AWS Service Catalog service action to reboot a WorkSpace instance.

First, create a JSON file for the service action that you want to create.

Here is sample code for an API-driven Amazon WorkSpaces reboot:

{
  "description": "Reboot WorkSpaces instances",
  "schemaVersion": "0.3",
  "assumeRole": "",
  "parameters": {
    "WorkspaceId": {
      "type": "String",
      "description": "WorkspaceID- ws-xxxx"
    },
    "WPAction": {
      "type": "String",
      "description": "Action",
      "default": "Reboot"
    },
    "AutomationAssumeRole": {
      "type": "String",
      "description": "(Optional) The ARN of the role that allows Automation to perform the actions on your behalf.",
      "default": ""
    }
  },
  "mainSteps": [
    {
      "name": "wpreboot",
      "action": "aws:executeAwsApi",
      "inputs": {
        "Service": "workspaces",
        "Api": "RebootWorkspaces",
        "RebootWorkspaceRequests": [
          {
            "WorkspaceId": ""
          }
        ]
      },
      "isEnd": "True"
    }
  ]
}

After you create this file, execute the AWS CLI command to build the automation document and link it to Amazon WorkSpaces.

 

Note

You must complete this task in the AWS CLI to set the AWS::WorkSpaces::Workspace target type.

In this example, the file is named wpreboot.json to create an automation document called wpreboot. Run the following command:

C:\ssm>aws ssm create-document --content file://c:\ssm\wpreboot.json --name wpreboot --document-type Automation --target-type "/AWS::WorkSpaces::Workspace"

Test this action in Systems Manager to ensure that it’s working as expected.
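A quick way to test it from the CLI is shown below; the WorkSpace ID is a placeholder for one of your test WorkSpaces.

# WorkspaceId is a placeholder
aws ssm start-automation-execution \
    --document-name wpreboot \
    --parameters "WorkspaceId=ws-xxxxxxxxx"

# Check the result using the AutomationExecutionId returned by the previous command
aws ssm get-automation-execution --automation-execution-id <execution-id>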

Next, add the automation document to a new AWS Service Catalog self-service action. Instructions can be found at https://docs.aws.amazon.com/servicecatalog/latest/adminguide/using-service-actions.html. Once completed, you should have the service actions associated with your Amazon WorkSpaces product, similar to the following example.

 

In the ServiceNow portal, you should now have this “reboot” option associated with your product as shown in the following example.

 

Adding ServiceNow Workflows

As a final step, you will build ServiceNow Workflows to allow you to add approvals, notifications, open change records, and other organizational-based requirements before an order is approved.

The AWS Service Catalog connector for ServiceNow contains the following Workflows that you can use as a starting point. The workflows should be updated to meet the needs of your organization.

  • AWS Service Catalog – Approve Change Request
  • AWS Service Catalog – Execute Provisioned Product Action
  • AWS Service Catalog – Invoke Workflow Task
  • AWS Service Catalog – Provision Product Request
  • AWS Service Catalog – Track Product record

 

Summary

Integrating AWS Service Catalog with ServiceNow gives end users a self-service lifecycle-management solution for Amazon WorkSpaces through a familiar, secure, ITSM-aligned process. With the addition of service actions, enterprises can add operational capabilities such as upgrading, rebooting, or repairing a WorkSpace, or installing software on it, from within the ServiceNow portal.

 

About the author

Alan DeLucia is a New York based Business Development Manager with AWS Service Catalog and AWS Control Tower. Alan enjoys helping customers build management capabilities and governance into their AWS solutions. In his free time, Alan is an avid Mountain Biker and enjoys spending time and vacationing with his family.

from AWS Management Tools Blog

Using the AWS Config Auto Remediation feature for Amazon S3 bucket compliance

Using the AWS Config Auto Remediation feature for Amazon S3 bucket compliance

AWS Config keeps track of the configuration of your AWS resources and their relationships to your other resources. It can also evaluate those AWS resources for compliance. This service uses rules that can be configured to evaluate AWS resources against desired configurations.

For example, there are AWS Config rules that check whether your Amazon S3 buckets have logging enabled or whether your IAM users have an MFA device enabled. AWS Config rules use AWS Lambda functions to perform the compliance evaluations, and the Lambda functions return the compliance status of the evaluated resources as compliant or noncompliant. The noncompliant resources are remediated using the remediation action associated with the AWS Config rule. With the Auto Remediation feature of AWS Config rules, the remediation action can be executed automatically when a resource is found to be noncompliant.

Until now, remediation actions had to be executed manually for each noncompliant resource. This is not always feasible if you have many noncompliant resources for which you want to execute remediation actions. It can also pose risks if these resources remain without remediation for an extended amount of time.

In this post, you learn how to use the new AWS Config Auto Remediation feature on a noncompliant S3 bucket to ensure it is remediated automatically.

Overview

The AWS Config Auto Remediation feature automatically remediates non-compliant resources evaluated by AWS Config rules. You can associate remediation actions with AWS Config rules and choose to execute them automatically to address non-compliant resources without manual intervention.

You can:

  • Choose the remediation action you want to associate from a prepopulated list.
  • Create your own custom remediation actions using AWS Systems Manager Automation documents.

If a resource is still non-compliant after auto remediation, you can set the rule to try auto remediation again.

Solution

This post describes how to use the AWS Config Auto Remediation feature to auto remediate any non-compliant S3 buckets using the following AWS Config rules:

  • s3-bucket-logging-enabled
  • s3-bucket-server-side-encryption-enabled
  • s3-bucket-public-write-prohibited
  • s3-bucket-public-read-prohibited

These AWS Config rules act as controls to prevent any non-compliant S3 activities.

Prerequisites

Make sure you have the following prerequisites before following the solution in this post:

  • You must have AWS Config enabled in your AWS account. For more information, see Getting Started with AWS Config.
  • The AutomationAssumeRole in the remediation action parameters should be assumable by SSM. The user must have pass-role permissions for that role when they create the remediation action in AWS Config, and that role must have whatever permissions the SSM document requires. For example, it may need “s3:PutEncryptionConfiguration” or something else specific to the API call that SSM uses.
  • (Optional): While setting up remediation action, if you want to pass the resource ID of non-compliant resources to the remediation action, choose Resource ID parameter. If selected, at runtime that parameter is substituted with the ID of the resource to be remediated. Each parameter has either a static value or a dynamic value. If you do not choose a specific resource ID parameter from the drop-down list, you can enter values for each key. If you choose a resource ID parameter from the drop-down list, you can enter values for all the other keys except the selected resource ID parameter.

Steps

Use the following steps to set up Auto Remediation for each of the four AWS Config rules.

To set up Auto Remediation for s3-bucket-logging-enabled

The “s3-bucket-logging-enabled” AWS Config rule checks whether logging is enabled for your S3 buckets. Use the following steps to auto-remediate an S3 bucket whose logging is not enabled:

  1. Sign in to the AWS Management Console and open the AWS Config console.
  2. On the left pane, choose Rules
  3. On the Rules page, under Rule name, select s3-bucket-logging-enabled and then choose Add rule to add it to the rule list. (If the rule already exists, select it from the rule list and then choose Edit.) In this example, one bucket named "tests3loggingnotenabled" shows as a noncompliant resource under the "s3-bucket-logging-enabled" rule.
  4. Return to the Rules page and choose Edit.
  5. In the Choose remediation action section, from the Remediation action list, select AWS-ConfigureS3BucketLogging. (AWS-ConfigureS3BucketLogging is an AWS SSM Automation document that enables logging on an S3 bucket using SSM Automation.)
  6. In the Auto remediation section, select Yes to automatically remediate non-compliant resources.
  7. In the Parameters section, enter the values for the required parameters such as AutomationAssumeRole, Grantee details required to execute the remediation action, and the Target bucket to store logs.
  8. Choose Save. The "s3-bucket-logging-enabled" AWS Config rule can now auto-remediate noncompliant resources. A confirmation that it executed the remediation action shows in the Action status column. S3 bucket server access logging is now enabled automatically using the AWS Config Auto Remediation feature.
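You can also create the same remediation configuration from the CLI with put-remediation-configurations. The following is a rough sketch only: the role ARN and target bucket are placeholders, and the parameter names expected by AWS-ConfigureS3BucketLogging (including the grantee settings omitted here) should be confirmed against the document in the Systems Manager console.

# Placeholders: role ARN and target bucket; confirm the document's parameter names before use
aws configservice put-remediation-configurations --remediation-configurations '[
  {
    "ConfigRuleName": "s3-bucket-logging-enabled",
    "TargetType": "SSM_DOCUMENT",
    "TargetId": "AWS-ConfigureS3BucketLogging",
    "Automatic": true,
    "MaximumAutomaticAttempts": 5,
    "RetryAttemptSeconds": 60,
    "Parameters": {
      "AutomationAssumeRole": {"StaticValue": {"Values": ["arn:aws:iam::111122223333:role/ConfigRemediationRole"]}},
      "BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
      "TargetBucket": {"StaticValue": {"Values": ["my-example-logging-bucket"]}}
    }
  }
]'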

To set up Auto Remediation for s3-bucket-server-side-encryption-enable

The “s3-bucket-server-side-encryption-enabled” AWS Config rule checks that your S3 bucket either has S3 default encryption enabled or that the S3 bucket policy explicitly denies put-object requests without server side encryption.

  1. Sign in to the AWS Management Console and open the AWS Config console 
  2. On the left pane, choose Rules
  3. On the Rules page, under Rule name, select s3-bucket-server-side-encryption-enabled and then choose Add rule to add it to the rule list. (If the rule already exists, select it from the rule list and then choose Edit.) In this example, one S3 bucket named "s3notencrypted" shows as a noncompliant resource under the "s3-bucket-server-side-encryption-enabled" rule.
  4. Return to the Rules page and choose Edit.
  5. In the Choose remediation action section, from the Remediation action list, select AWS-EnableS3BucketEncryption. (AWS-EnableS3BucketEncryption is an AWS SSM Automation document that enables server-side encryption on an S3 bucket using SSM Automation. )
  6. In the Auto remediation section, select Yes to automatically remediate non-compliant resources.
  7. In the Parameters section, enter the values for AutomationAssumeRole and the SSE algorithm required to execute the remediation action.
  8. Choose Save. The "s3-bucket-server-side-encryption-enabled" AWS Config rule can now auto-remediate noncompliant resources. A confirmation that it executed the remediation action shows in the Action status column. S3 bucket server-side encryption is now enabled automatically using the AWS Config Auto Remediation feature.

To set up auto remediation for s3-bucket-public-read-prohibited and s3-bucket-public-write-prohibited

An Amazon S3 bucket can be protected from public reads and writes using the AWS Config rules "s3-bucket-public-read-prohibited" and "s3-bucket-public-write-prohibited", respectively. Enable these AWS Config rules as described in the previous two scenarios, and enable the Auto Remediation feature with the existing SSM document remediation action "AWS-DisableS3BucketPublicReadWrite". This remediation action disables an S3 bucket's public write and read access by applying a private ACL.

Conclusion

In this post, you saw how to auto-remediate non-compliant S3 resources using the AWS Config auto remediation feature for AWS Config rules. You can also use this feature to maintain compliance of other AWS resources using existing SSM documents or custom SSM documents. For more details, see Remediating Non-compliant AWS Resources by AWS Config Rules.

For pricing details on AWS Config rules, visit the AWS Config pricing page.

 

About the Author

Harshitha Putta is an Associate Consultant with AWS Professional Services in Seattle, WA. She is passionate about building innovative solutions using AWS services to help customers achieve their business objectives. She enjoys spending time with family and friends, playing board games and hiking.

from AWS Management Tools Blog