Learn about AWS Services & Solutions – September AWS Online Tech Talks

AWS Tech Talks

Join us this September to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Compute:

September 23, 2019 | 11:00 AM – 12:00 PM PT – Build Your Hybrid Cloud Architecture with AWS – Learn about the extensive range of services AWS offers to help you build a hybrid cloud architecture best suited for your use case.

September 26, 2019 | 1:00 PM – 2:00 PM PT – Self-Hosted WordPress: It’s Easier Than You Think – Learn how you can easily build a fault-tolerant WordPress site using Amazon Lightsail.

October 3, 2019 | 11:00 AM – 12:00 PM PT – Lower Costs by Right Sizing Your Instance with Amazon EC2 T3 General Purpose Burstable Instances – Get an overview of T3 instances, understand which workloads are ideal for them, and learn how the T3 credit system works so that you can lower your EC2 instance costs today.

Containers:

September 26, 2019 | 11:00 AM – 12:00 PM PT – Develop a Web App Using Amazon ECS and AWS Cloud Development Kit (CDK) – Learn how to build your first app using CDK and AWS container services.

Data Lakes & Analytics:

September 26, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Provisioning Amazon MSK Clusters and Using Popular Apache Kafka-Compatible Tooling – Learn best practices for running Apache Kafka production workloads at a lower cost on Amazon MSK.

Databases:

September 25, 2019 | 1:00 PM – 2:00 PM PT – What’s New in Amazon DocumentDB (with MongoDB compatibility) – Learn what’s new in Amazon DocumentDB, a fully managed MongoDB-compatible database service designed from the ground up to be fast, scalable, and highly available.

October 3, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Enterprise-Class Security, High-Availability, and Scalability with Amazon ElastiCache – Learn about new enterprise-friendly Amazon ElastiCache enhancements like customer-managed keys and online scaling up or down to make your critical workloads more secure, scalable, and available.

DevOps:

October 1, 2019 | 9:00 AM – 10:00 AM PT – CI/CD for Containers: A Way Forward for Your DevOps Pipeline – Learn how to build CI/CD pipelines using AWS services to get the most out of the agility afforded by containers.

Enterprise & Hybrid:

September 24, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: How to Monitor and Manage Your AWS Costs – Learn how to visualize and manage your AWS cost and usage in this virtual hands-on workshop.

October 2, 2019 | 1:00 PM – 2:00 PM PT – Accelerate Cloud Adoption and Reduce Operational Risk with AWS Managed Services – Learn how AMS accelerates your migration to AWS, reduces your operating costs, improves security and compliance, and enables you to focus on your differentiating business priorities.

IoT:

September 25, 2019 | 9:00 AM – 10:00 AM PT – Complex Monitoring for Industrial with AWS IoT Data Services – Learn how to solve your complex event monitoring challenges with AWS IoT Data Services.

Machine Learning:

September 23, 2019 | 9:00 AM – 10:00 AM PT – Training Machine Learning Models Faster – Learn how to train machine learning models quickly and with a single click using Amazon SageMaker.

September 30, 2019 | 11:00 AM – 12:00 PM PT – Using Containers for Deep Learning Workflows – Learn how containers can help address challenges in deploying deep learning environments.

October 3, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: Getting Hands-On with Machine Learning and Ready to Race in the AWS DeepRacer League – Join DeClercq Wentzel, Senior Product Manager for AWS DeepRacer, for a presentation on the basics of machine learning and how to build a reinforcement learning model that you can use to join the AWS DeepRacer League.

AWS Marketplace:

September 30, 2019 | 9:00 AM – 10:00 AM PT – Advancing Software Procurement in a Containerized World – Learn how to deploy applications faster with third-party container products.

Migration:

September 24, 2019 | 11:00 AM – 12:00 PM PT – Application Migrations Using AWS Server Migration Service (SMS) – Learn how to use AWS Server Migration Service (SMS) for automating application migration and scheduling continuous replication, from your on-premises data centers or Microsoft Azure to AWS.

Networking & Content Delivery:

September 25, 2019 | 11:00 AM – 12:00 PM PT – Building Highly Available and Performant Applications using AWS Global Accelerator – Learn how to build highly available and performant architectures for your applications with AWS Global Accelerator, now with source IP preservation.

September 30, 2019 | 1:00 PM – 2:00 PM PT – AWS Office Hours: Amazon CloudFront – Just getting started with Amazon CloudFront and Lambda@Edge? Get answers directly from our experts during AWS Office Hours.

Robotics:

October 1, 2019 | 11:00 AM – 12:00 PM PT – Robots and STEM: AWS RoboMaker and AWS Educate Unite! – Come join members of the AWS RoboMaker and AWS Educate teams as we provide an overview of our education initiatives and walk you through the newly launched RoboMaker Badge.

Security, Identity & Compliance:

October 1, 2019 | 1:00 PM – 2:00 PM PT – Deep Dive on Running Active Directory on AWS – Learn how to deploy Active Directory on AWS and start migrating your Windows workloads.

Serverless:

October 2, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive on Amazon EventBridge – Learn how to optimize event-driven applications, and use rules and policies to route, transform, and control access to these events that react to data from SaaS apps.

Storage:

September 24, 2019 | 9:00 AM – 10:00 AM PT – Optimize Your Amazon S3 Data Lake with S3 Storage Classes and Management Tools – Learn how to use the Amazon S3 Storage Classes and management tools to better manage your data lake at scale and to optimize storage costs and resources.

October 2, 2019 | 11:00 AM – 12:00 PM PT – The Great Migration to Cloud Storage: Choosing the Right Storage Solution for Your Workload – Learn more about AWS storage services and identify which service is the right fit for your business.

from AWS News Blog https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-september-aws-online-tech-talks/

Learn From Your VPC Flow Logs With Additional Meta-Data

Flow Logs for Amazon Virtual Private Cloud enables you to capture information about the IP traffic going to and from network interfaces in your VPC. Flow Logs data can be published to Amazon CloudWatch Logs or Amazon Simple Storage Service (S3).

Since we launched VPC Flow Logs in 2015, you have been using it for a variety of use cases, like troubleshooting connectivity issues across your VPCs, intrusion detection, anomaly detection, or archival for compliance purposes. Until today, VPC Flow Logs provided information that included source IP, source port, destination IP, destination port, action (accept, reject), and status. Once enabled, a VPC Flow Log entry looks like the one below.
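For reference, a default-format record has roughly this shape (illustrative values; the fields are version, account ID, interface ID, source and destination addresses and ports, protocol, packets, bytes, start, end, action, and log status):

```
2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK
```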

While this information was sufficient to understand most flows, it required additional computation and lookup to match IP addresses to instance IDs or to guess the directionality of the flow to come to meaningful conclusions.

Today we are announcing the availability of additional metadata that you can include in your Flow Logs records to better understand network flows. The enriched Flow Logs allow you to simplify your scripts or remove the need for post-processing altogether, by reducing the number of computations or lookups required to extract meaningful information from the log data.

When you create a new VPC Flow Log, in addition to the existing fields, you can now choose to add the following metadata:

  • vpc-id : the ID of the VPC containing the source Elastic Network Interface (ENI).
  • subnet-id : the ID of the subnet containing the source ENI.
  • instance-id : the Amazon Elastic Compute Cloud (EC2) instance ID of the instance associated with the source interface. When the ENI is placed by AWS services (for example, AWS PrivateLink, NAT Gateway, Network Load Balancer, etc.) this field will be “-”.
  • tcp-flags : the bitmask for TCP flags observed within the aggregation period. For example, FIN is 0x01 (1), SYN is 0x02 (2), ACK is 0x10 (16), SYN + ACK is 0x12 (18), etc. (the bits are specified in the “Control Bits” section of RFC 793, “Transmission Control Protocol Specification”).
    This allows you to understand who initiated or terminated the connection. TCP uses a three-way handshake to establish a connection: the connecting machine sends a SYN packet to the destination, the destination replies with a SYN + ACK and, finally, the connecting machine sends an ACK. In the Flow Logs, the handshake is shown as two lines, with tcp-flags values of 2 (SYN) and 18 (SYN + ACK). ACK is reported only when it is accompanied by SYN (otherwise it would be too much noise for you to filter out).
  • type : the type of traffic: IPv4, IPv6, or Elastic Fabric Adapter.
  • pkt-srcaddr : the packet-level source IP address. You typically use this field in conjunction with srcaddr to distinguish the IP address of an intermediate layer through which traffic flows, such as a NAT gateway.
  • pkt-dstaddr : the packet-level destination IP address; similar to the previous field, but for the destination.
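The tcp-flags aggregation described above is easy to decode in a consuming script. A minimal sketch (the function and table names are my own):

```python
# Decode the aggregated tcp-flags bitmask from a Flow Logs record.
# Bit values follow the "Control Bits" section of RFC 793.
TCP_FLAG_BITS = {
    0x01: "FIN",
    0x02: "SYN",
    0x10: "ACK",
}

def decode_tcp_flags(bitmask: int) -> list:
    """Return the names of the flags OR'ed into an aggregated tcp-flags value."""
    return [name for bit, name in sorted(TCP_FLAG_BITS.items()) if bitmask & bit]

print(decode_tcp_flags(2))   # ['SYN']: the first packet of the handshake
print(decode_tcp_flags(18))  # ['SYN', 'ACK']: the reply from the destination
print(decode_tcp_flags(19))  # ['FIN', 'SYN', 'ACK']: a short connection in one period
```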

To create a VPC Flow Log, you can use the AWS Management Console, the AWS Command Line Interface (CLI), or the CreateFlowLogs API, and select which additional fields to include and the order in which they appear, for example:

Or using the AWS Command Line Interface (CLI) as below:

$ aws ec2 create-flow-logs --resource-type VPC \
                            --region eu-west-1 \
                            --resource-ids vpc-12345678 \
                            --traffic-type ALL  \
                            --log-destination-type s3 \
                            --log-destination arn:aws:s3:::sst-vpc-demo \
                            --log-format '${version} ${vpc-id} ${subnet-id} ${instance-id} ${interface-id} ${account-id} ${type} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${pkt-srcaddr} ${pkt-dstaddr} ${protocol} ${bytes} ${packets} ${start} ${end} ${action} ${tcp-flags} ${log-status}'

# be sure to replace the bucket name and VPC ID !

{
    "ClientToken": "1A....HoP=",
    "FlowLogIds": [
        "fl-12345678123456789"
    ],
    "Unsuccessful": [] 
}

Enriched VPC Flow Logs are delivered to S3. We will automatically add the required S3 bucket policy to authorize VPC Flow Logs to write to your S3 bucket. VPC Flow Logs does not capture real-time log streams for your network interface, and it might take several minutes to begin collecting and publishing data to the chosen destinations. Your logs will eventually be available in S3 at s3://<bucket name>/AWSLogs/<account id>/vpcflowlogs/<region>/<year>/<month>/<day>/

An SSH connection from my laptop with IP address 90.90.0.200 to an EC2 instance would appear like this:

3 vpc-exxxxxx2 subnet-8xxxxf3 i-0bfxxxxxxaf eni-08xxxxxxa5 48xxxxxx93 IPv4 172.31.22.145 90.90.0.200 22 62897 172.31.22.145 90.90.0.200 6 5225 24 1566328660 1566328672 ACCEPT 18 OK
3 vpc-exxxxxx2 subnet-8xxxxf3 i-0bfxxxxxxaf eni-08xxxxxxa5 48xxxxxx93 IPv4 90.90.0.200 172.31.22.145 62897 22 90.90.0.200 172.31.22.145 6 4877 29 1566328660 1566328672 ACCEPT 2 OK

172.31.22.145 is the private IP address of the EC2 instance, the one you see when you type ifconfig on the instance. All flags are OR’ed during the aggregation period. When a connection is short, both SYN and FIN (3), as well as SYN + ACK and FIN (19), will probably be set on the same lines.
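With a custom --log-format, a downstream script only needs the field order that was chosen at creation time. A minimal parsing sketch, using the second sample record above (FIELDS mirrors the --log-format from the CLI example):

```python
# Parse one enriched Flow Logs record into named fields. The field order
# must match the ${...} order passed to --log-format when the Flow Log
# was created.
FIELDS = [
    "version", "vpc-id", "subnet-id", "instance-id", "interface-id",
    "account-id", "type", "srcaddr", "dstaddr", "srcport", "dstport",
    "pkt-srcaddr", "pkt-dstaddr", "protocol", "bytes", "packets",
    "start", "end", "action", "tcp-flags", "log-status",
]

def parse_record(line: str) -> dict:
    """Map a space-separated record onto the configured field names."""
    return dict(zip(FIELDS, line.split()))

record = parse_record(
    "3 vpc-exxxxxx2 subnet-8xxxxf3 i-0bfxxxxxxaf eni-08xxxxxxa5 48xxxxxx93 "
    "IPv4 90.90.0.200 172.31.22.145 62897 22 90.90.0.200 172.31.22.145 "
    "6 4877 29 1566328660 1566328672 ACCEPT 2 OK"
)
print(record["srcaddr"], record["dstport"], record["tcp-flags"])  # 90.90.0.200 22 2
```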

Once a Flow Log is created, you cannot add fields or modify the structure of the log, to ensure you will not accidentally break scripts consuming this data; any modification requires you to delete and recreate the VPC Flow Log. There is no additional cost to capture the extra information, and normal VPC Flow Logs pricing applies, but remember that enriched records may consume more storage when you select all fields. We recommend selecting only the fields relevant to your use cases.

Enriched VPC Flow Logs are available in all regions where VPC Flow Logs is available, and you can start using them today.

— seb

PS: I heard from the team that they are working on adding additional metadata to the logs. Stay tuned for updates.

from AWS News Blog https://aws.amazon.com/blogs/aws/learn-from-your-vpc-flow-logs-with-additional-meta-data/

Now Available – Amazon Quantum Ledger Database (QLDB)

Given the wide range of data types, query models, indexing options, scaling expectations, and performance requirements, databases are definitely not one-size-fits-all products. That’s why there are many different AWS database offerings, each one purpose-built to meet the needs of a different type of application.

Introducing QLDB
Today I would like to tell you about Amazon QLDB, the newest member of the AWS database family. First announced at AWS re:Invent 2018 and made available in preview form, it is now available in production form in five AWS regions.

As a ledger database, QLDB is designed to provide an authoritative data source (often known as a system of record) for stored data. It maintains a complete, immutable history of all committed changes to the data that cannot be updated, altered, or deleted. QLDB supports PartiQL SQL queries to the historical data, and also provides an API that allows you to cryptographically verify that the history is accurate and legitimate. These features make QLDB a great fit for banking & finance, ecommerce, transportation & logistics, HR & payroll, manufacturing, and government applications, as well as many other use cases that need to maintain the integrity and history of stored data.

Important QLDB Concepts
Let’s review the most important QLDB concepts before diving in:

Ledger – A QLDB ledger consists of a set of QLDB tables and a journal that maintains the complete, immutable history of changes to the tables. Ledgers are named and can be tagged.

Journal – A journal consists of a sequence of blocks, each cryptographically chained to the previous block so that changes can be verified. Blocks, in turn, contain the actual changes that were made to the tables, indexed for efficient retrieval. This append-only model ensures that previous data cannot be edited or deleted, and makes the ledgers immutable. QLDB allows you to export all or part of a journal to S3.

Table – Tables exist within a ledger, and contain a collection of document revisions. Tables support optional indexes on document fields; the indexes can improve performance for queries that make use of the equality (=) predicate.

Documents – Documents exist within tables, and must be in Amazon Ion form. Ion is a superset of JSON that adds additional data types, type annotations, and comments. QLDB supports documents that contain nested JSON elements, and gives you the ability to write queries that reference and include these elements. Documents need not conform to any particular schema, giving you the flexibility to build applications that can easily adapt to changes.

PartiQL – PartiQL is a new open standard query language that supports SQL-compatible access to relational, semi-structured, and nested data while remaining independent of any particular data source. To learn more, read Announcing PartiQL: One Query Language for All Your Data.

Serverless – You don’t have to worry about provisioning capacity or configuring read & write throughput. You create a ledger, define your tables, and QLDB will automatically scale to meet the needs of your application.

Using QLDB
You can create QLDB ledgers and tables from the AWS Management Console, AWS Command Line Interface (CLI), a CloudFormation template, or by making calls to the QLDB API. I’ll use the QLDB Console and I will follow the steps in Getting Started with Amazon QLDB. I open the console and click Start tutorial to get started:

The Getting Started page outlines the first three steps; I click Create ledger to proceed (this opens in a fresh browser tab):

I enter a name for my ledger (vehicle-registration), tag it, and (again) click Create ledger to proceed:

My ledger starts out in Creating status, and transitions to Active within a minute or two:

I return to the Getting Started page, refresh the list of ledgers, choose my new ledger, and click Load sample data:

This takes a second or so, and creates four tables & six indexes:

I could also use PartiQL statements such as CREATE TABLE, CREATE INDEX, and INSERT INTO to accomplish the same task.

With my tables, indexes, and sample data loaded, I click on Editor and run my first query (a single-table SELECT):

This returns a single row, and also benefits from the index on the VIN field. I can also run a more complex query that joins two tables:

I can obtain the ID of a document (using a query from here), and then update the document:

I can query the modification history of a table or a specific document in a table, with the ability to find modifications within a certain range and on a particular document (read Querying Revision History to learn more). Here’s a simple query that returns the history of modifications to all of the documents in the VehicleRegistration table that were made on the day that I wrote this post:

As you can see, each row is a structured JSON object. I can select any desired rows and click View JSON for further inspection:

Earlier, I mentioned that PartiQL can deal with nested data. The VehicleRegistration table contains ownership information that looks like this:

{
   "Owners":{
      "PrimaryOwner":{
         "PersonId":"6bs0SQs1QFx7qN1gL2SE5G"
      },
      "SecondaryOwners":[

      ]
   }
}

PartiQL lets me reference the nested data using “.” notation:
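A query along these lines shows the idea (the VehicleRegistration table and the Owners structure come from the sample data above; the VIN literal is a made-up placeholder):

```sql
-- Illustrative PartiQL: Owners.PrimaryOwner.PersonId reaches into the
-- nested Ion document with "." notation.
SELECT r.VIN, r.Owners.PrimaryOwner.PersonId
FROM VehicleRegistration AS r
WHERE r.VIN = '1N4AL11D75C109151'
```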

I can also verify the integrity of a document that is stored within my ledger’s journal. This is fully described in Verify a Document in a Ledger, and is a great example of the power (and value) of cryptographic verification. Each QLDB ledger has an associated digest. The digest is a 256-bit hash value that uniquely represents the ledger’s entire history of document revisions as of a point in time. To access the digest, I select a ledger and click Get digest:

When I click Save, the console provides me with a short file that contains all of the information needed to verify the ledger. I save this file in a safe place, for use when I want to verify a document in the ledger. When that time comes, I get the file, click on Verification in the left-navigation, and enter the values needed to perform the verification. This includes the block address of a document revision, and the ID of the document. I also choose the digest that I saved earlier, and click Verify:

QLDB recomputes the hashes to ensure that the document has not been surreptitiously changed, and displays the verification:

In a production environment, you would use the QLDB APIs to periodically download digests and to verify the integrity of your documents.

Building Applications with QLDB
You can use the Amazon QLDB Driver for Java to write code that accesses and manipulates your ledger database. This is a Java driver that allows you to create sessions, execute PartiQL commands within the scope of a transaction, and retrieve results. Drivers for other languages are in the works; stay tuned for more information.

Available Now
Amazon QLDB is available now in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and Asia Pacific (Tokyo) Regions. Pricing is based on the following factors, and is detailed on the Amazon QLDB Pricing page, including some real-world examples:

  • Write operations
  • Read operations
  • Journal storage
  • Indexed storage
  • Data transfer

Jeff;

from AWS News Blog https://aws.amazon.com/blogs/aws/now-available-amazon-quantum-ledger-database-qldb/

Introducing the Newest AWS Heroes – September 2019

Leaders within the AWS technical community educate others about the latest AWS services in a variety of ways: some share knowledge in person by speaking at events or running workshops and Meetups, while others prefer to share their insights online via social media, blogs, or open source contributions.

The most prominent AWS community leaders worldwide are recognized as AWS Heroes, and today we are excited to introduce to you the latest members of the AWS Hero program:

Alex Schultz – Fort Wayne, USA

Machine Learning Hero Alex Schultz works in the Innovation Labs at Advanced Solutions where he develops machine learning enabled products and solutions for the biomedical and product distribution industries. After receiving a DeepLens at re:Invent 2017, he dove headfirst into machine learning where he used the device to win the AWS DeepLens challenge by building a project which can read books to children. As an active advocate for AWS, he runs the Fort Wayne AWS User Group and loves to share his knowledge and experience with other developers. He also regularly contributes to the online DeepRacer community where he has helped many people who are new to machine learning get started.

Chase Douglas – Portland, USA

Serverless Hero Chase Douglas is the CTO and co-founder at Stackery, where he steers engineering and technical architecture of development tools that enable individuals and teams of developers to successfully build and manage serverless applications. He is a deeply experienced software architect and a long-time engineering leader focused on building products that increase development efficiency while delighting users. Chase is also a frequent conference speaker on the topics of serverless, instrumentation, and software patterns. Most recently, he discussed the serverless development-to-production pipeline at the Chicago and New York AWS Summits, and provided insight into how the future of serverless may be functionless in his blog series.

Chris Williams – Portsmouth, USA

Community Hero Chris Williams is an Enterprise Cloud Consultant for GreenPages Technology Solutions—a digital transformation and cloud enablement company. There he helps customers design and deploy the next generation of public, private, and hybrid cloud solutions, specializing in AWS and VMware. Chris blogs about virtualization, technology, and design at Mistwire. He is an active community leader, co-organizing the AWS Portsmouth User Group, and both hosts and presents on vBrownBag.

Dave Stauffacher – Milwaukee, USA

Community Hero Dave Stauffacher is a Principal Platform Engineer focused on cloud engineering at Direct Supply where he has helped navigate a 30,000% data growth over the last 15 years. In his current role, Dave is focused on helping drive Direct Supply’s cloud migration, combining his storage background with cloud automation and standardization practices. Dave has published his automation work for deploying AWS Storage Gateway for use with SQL Server data protection. He is a participant in the Milwaukee AWS User Group and the Milwaukee Docker User Group and has showcased his cloud experience in presentations at the AWS Midwest Community Day, AWS re:Invent, HashiConf, the Milwaukee Big Data User Group and other industry events.

Gojko Adzic – London, United Kingdom

Serverless Hero Gojko Adzic is a partner at Neuri Consulting LLP and a co-founder of MindMup, a collaborative mind mapping application that has been running on AWS Lambda since 2016. He is the author of the book Running Serverless and co-author of Serverless Computing: Economic and Architectural Impact, one of the first academic papers about AWS Lambda. He is also a key contributor to Claudia.js, an open-source tool that simplifies Lambda application deployment, and is a minor contributor to the AWS SAM CLI. Gojko frequently blogs about serverless application development on Serverless.Pub and his personal blog, and he has authored numerous other books.

Liz Rice – Enfield, United Kingdom

Container Hero Liz Rice is VP Open Source Engineering with cloud native security specialists Aqua Security, where she and her team look after several container-related open source projects. She is chair of the CNCF’s Technical Oversight Committee, and was Co-Chair of the KubeCon + CloudNativeCon 2018 events in Copenhagen, Shanghai and Seattle. She is a regular speaker at conferences including re:Invent, Velocity, DockerCon and many more. Her talks usually feature live coding or demos, and she is known for making complex technical concepts accessible.

Lyndon Leggate – London, United Kingdom

Machine Learning Hero Lyndon Leggate is a senior technology leader with extensive experience of defining and delivering complex technical solutions on large, business critical projects for consumer facing brands. Lyndon is a keen participant in the AWS DeepRacer league. Racing as Etaggel, he has regularly positioned in the top 10, features in DeepRacer TV and in May 2019 established the AWS DeepRacer Community. This vibrant and rapidly growing community provides a space for new and experienced racers to seek advice and share tips. The Community has gone on to expand the DeepRacer toolsets, making the platform more accessible and pushing the bounds of the technology. He also organises the AWS DeepRacer London Meetup series.

Maciej Lelusz – Cracow, Poland

Community Hero Maciej Lelusz is Co-Founder of Chaos Gears, a company concentrated on serverless, automation, IaC, and chaos engineering as ways to improve system resiliency. He is focused on community development, blogging, company management, and public/hybrid/private cloud design. He cares about enterprise technology, IT transformation, and the culture and people involved in it. Maciej is Co-Leader of the AWS User Group Poland – Cracow Chapter, and the Founder and Leader of the InfraXstructure conference and the Polish VMware User Group.

Nathan Glover – Perth, Australia

Community Hero Nathan Glover is a DevOps Consultant at Mechanical Rock in Perth, Western Australia. Prior to that, he worked as a Hardware Systems Designer and Embedded Developer in the IoT space. He is passionate about cloud-native architecture and loves sharing his successes and failures on his blog. A key focus for him is breaking down the learning barrier by building practical examples using cloud services. On top of these, he has a number of online courses teaching people how to get started building with Amazon Alexa Skills and AWS IoT. In his spare time, he loves to dabble in all areas of technology: building cloud-connected toasters, embedded-systems vehicle tracking, and competing in online capture-the-flag security events.

Prashanth HN – Bengaluru, India

Serverless Hero Prashanth HN is the Chief Technology Officer at WheelsBox and one of the community leaders of the AWS Users Group, Bengaluru. He mentors and consults other startups to embrace a serverless approach, frequently blogs about serverless topics for all skill levels including topics for beginners and advanced users on his personal blog and Amplify-related topics on the AWS Amplify Community Blog, and delivers talks about building using microservices and serverless. In a recent talk, he demonstrated how microservices patterns can be implemented using serverless. Prashanth maintains the open-source project Lanyard, a serverless agenda app for event organizers and attendees, which was well received at AWS Community Day India.

Ran Ribenzaft – Tel Aviv, Israel

Serverless Hero Ran Ribenzaft is the Chief Technology Officer at Epsagon, an AWS Advanced Technology Partner that specializes in monitoring and tracing for serverless applications. Ran is a passionate developer that loves sharing open-source tools to make everyone’s lives easier and writing technical blog posts on the topics of serverless, microservices, cloud, and AWS on Medium and the Epsagon blog. Ran is also dedicated to educating and growing the community around serverless, organizing Serverless meetups in SF and TLV, delivering online webinars and workshops, and frequently giving talks at conferences.

Rolf Koski – Tampere, Finland

Community Hero Rolf Koski works at Cybercom, an AWS Premier Partner from the Nordics headquartered in Sweden, where he is the CTO of the Cybercom AWS Business Group. In his role he is both technical and a thought leader in the Cloud. Rolf has been one of the leading figures in the Nordic AWS communities, as one of the community leads of the Helsinki and Stockholm user groups, and he founded and organized the first ever AWS Community Day Nordics. Rolf is professionally certified and additionally works as a Well-Architected Lead, doing Well-Architected Reviews for customer workloads.

Learn more about AWS Heroes and connect with a Hero near you by checking out the Hero website.

from AWS News Blog https://aws.amazon.com/blogs/aws/introducing-the-newest-aws-heroes-september-2019/

Optimize Storage Cost with Reduced Pricing for Amazon EFS Infrequent Access

Today we are announcing a new price reduction – one of the largest in AWS Cloud history to date – when using Infrequent Access (IA) with Lifecycle Management with Amazon Elastic File System. This price reduction makes it possible to optimize costs even further and automatically save up to 92% on file storage costs as your access patterns change. With this new reduced pricing you can now store and access your files natively in a file system for effectively $0.08/GB-month, as we’ll see in an example later in this post.

Amazon Elastic File System (EFS) is a low-cost, simple to use, fully managed, and cloud-native NFS file system for Linux-based workloads that can be used with AWS services and on-premises resources. EFS provides elastic storage, growing and shrinking automatically as files are created or deleted – even to petabyte scale – without disruption. Your applications always have the storage they need immediately available. EFS also includes, for free, multi-AZ availability and durability right out of the box, with strong file system consistency.

Easy Cost Optimization using Lifecycle Management
As storage grows, the likelihood that a given application needs access to all of the files all of the time lessens, and access patterns can also change over time. Industry analysts such as IDC, as well as our own analysis of usage patterns, confirm that around 80% of data is not accessed very often, while the remaining 20% is in active use. Two common drivers for moving applications to the cloud are to maximize operational efficiency and to reduce the total cost of ownership, and this applies equally to storage costs. Instead of keeping all of the data on hand on the fastest performing storage, it may make sense to move infrequently accessed data into a different storage class/tier, with an associated cost reduction. Identifying this data manually can be a burden, so it’s also ideal to have the system monitor access over time and move data between storage tiers automatically, again without disruption to your running applications.

EFS Infrequent Access (IA) with Lifecycle Management provides an easy to use, cost-optimized price and performance tier suitable for files that are not accessed regularly. With the new price reduction announced today builders can now save up to 92% on their file storage costs compared to EFS Standard. EFS Lifecycle Management is easy to enable and runs automatically behind the scenes. When enabled on a file system, files not accessed according to the lifecycle policy you choose will be moved automatically to the cost-optimized EFS IA storage class. This movement is transparent to your application.

Although the infrequently accessed data is held in a different storage class/tier, it’s still immediately accessible. This is one of the advantages of EFS IA – you don’t have to sacrifice any of EFS’s benefits to get the cost savings. Your files remain immediately accessible, all within the same file system namespace. The only tradeoff is slightly higher per-operation latency (double-digit ms vs single-digit ms – think magnetic vs SSD) for files in the IA storage class/tier.

As an example of the cost optimization EFS IA provides let’s look at storage costs for 100 terabytes (100TB) of data. The EFS Standard storage class is currently charged at $0.30/GB-month. When it was launched in July the EFS IA storage class was priced at $0.045/GB-month. It’s now been reduced to $0.025/GB-month. As I noted earlier, this is one of the largest price drops in the history of AWS to date!

Using the 20/80 access statistic mentioned earlier for EFS IA:

  • 20% of 100TB = 20TB at $0.30/GB-month = $0.30 x 20 x 1,000 = $6,000
  • 80% of 100TB = 80TB at $0.025/GB-month = $0.025 x 80 x 1,000 = $2,000
  • Total for 100TB = $8,000/month or $0.08/GB-month. Remember, this price also includes (for free) multi-AZ, full elasticity, and strong file system consistency.

Compare this to using only EFS Standard, storing 100% of the data in that storage class: $0.30 x 100 x 1,000 = $30,000/month. Saving $22,000/month is significant, and it’s easy to enable. Remember too that you have control over the lifecycle policy, specifying how frequently data is moved to the IA storage tier.
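To make the arithmetic easy to rerun with your own numbers, here is the same calculation as a small script (prices as quoted in this post):

```shell
# Blended monthly cost for 100 TB with a 20/80 Standard/IA split.
total_gb=$((100 * 1000))                                     # 100 TB expressed in GB
std_cost=$(awk "BEGIN { print 0.30 * $total_gb * 0.20 }")    # 20% on EFS Standard
ia_cost=$(awk "BEGIN { print 0.025 * $total_gb * 0.80 }")    # 80% on EFS IA
total=$(awk "BEGIN { print $std_cost + $ia_cost }")
echo "Monthly cost: \$$total vs \$30000 on EFS Standard alone"
```

Adjusting total_gb or the 20/80 split shows how the blended rate changes as your access patterns shift.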

Getting Started with Infrequent Access (IA) Lifecycle Management
From the EFS Console I can quickly get started creating a file system by choosing an Amazon Virtual Private Cloud and the subnets in that Virtual Private Cloud where I want to expose mount targets for my instances to connect to.

In the next step I can configure options for the new volume. This is where I select the Lifecycle policy I want to apply to enable use of the EFS IA storage class. Here I am going to enable files that have not been accessed for 14 days to be moved to the IA tier automatically.
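The same policy can also be set from the command line; here is a sketch using the AWS CLI (the file system ID is a placeholder for your own):

```shell
# Enable lifecycle management: move files not accessed for 14 days to EFS IA.
# fs-12345678 is a placeholder for your file system ID.
aws efs put-lifecycle-configuration \
    --file-system-id fs-12345678 \
    --lifecycle-policies TransitionToIA=AFTER_14_DAYS
```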

In the final step I simply review my settings and then click Create File System to create the volume. Easy!

A Lifecycle Management policy can also be enabled, or changed, for existing volumes. Navigating to the volume in the EFS Console I can view the applied policy, if any. Here I’ve selected an existing file system that has no policy attached and therefore is not benefiting from EFS IA.

Clicking the pencil icon to the right of the field takes me to a dialog box where I can select the appropriate Lifecycle policy, just as I did when creating a new volume.

Amazon Elastic File System IA with Lifecycle Management is available now in all regions where Amazon EFS is available.

— Steve

 

from AWS News Blog https://aws.amazon.com/blogs/aws/optimize-storage-cost-with-reduced-pricing-for-amazon-efs-infrequent-access/

Operational Insights for Containers and Containerized Applications

The increasing adoption of containerized applications and microservices also brings an increased burden for monitoring and management. Builders have an expectation of, and requirement for, the same level of monitoring as would be used with longer-lived infrastructure such as Amazon Elastic Compute Cloud (EC2) instances. By contrast, containers are relatively short-lived, and usually subject to continuous deployment. This can make it difficult to reliably collect monitoring data and to analyze performance or other issues, which in turn affects remediation time. In addition, builders have to resort to a disparate collection of tools to perform this analysis and inspection, manually correlating context across a set of infrastructure and application metrics, logs, and other traces.

Announcing general availability of Amazon CloudWatch Container Insights
At the AWS Summit in New York this past July, Amazon CloudWatch Container Insights support for Amazon ECS and AWS Fargate was announced as an open preview for new clusters. Starting today Container Insights is generally available, with the added ability to now also monitor existing clusters. Immediate insights into compute utilization and failures for both new and existing cluster infrastructure and containerized applications can be easily obtained from container management services including Kubernetes, Amazon Elastic Container Service for Kubernetes, Amazon ECS, and AWS Fargate.

Once enabled, Amazon CloudWatch discovers all of the running containers in a cluster and collects performance and operational data at every layer in the container stack. It also continuously monitors and updates as changes occur in the environment, reducing the number of tools required to collect, monitor, analyze, and act on container metrics and logs, and giving complete end-to-end visibility.

Being able to easily access this data means customers can shift focus to increased developer productivity and away from building mechanisms to curate and build dashboards.

Getting started with Amazon CloudWatch Container Insights
I can enable Container Insights by following the instructions in the documentation. Once it is enabled and new clusters are launched, the CloudWatch console for my region shows a new Container Insights option in the list of dashboards available to me.

Clicking this takes me to the relevant dashboard where I can select the container management service that hosts the clusters that I want to observe.

In the below image I have selected to view metrics for my ECS clusters that are hosting a sample application I have deployed in AWS Fargate. I can examine the metrics for standard time periods such as 1 hour, 3 hours, etc., but can also specify custom time periods. Here I am looking at the metrics for a custom time period of the past 15 minutes.

You can see that I can quickly gain operational oversight of the overall performance of the cluster. Clicking the cluster name takes me deeper to view the metrics for the tasks inside the cluster.

Selecting a container allows me to then dive into either AWS X-Ray traces or performance logs.

Selecting performance logs takes me to the Amazon CloudWatch Logs Insights page where I can run queries against the performance events collected for my container ecosystem (e.g., Container, Task/Pod, Cluster, etc.) that I can then use to troubleshoot and dive deeper.
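As an illustration, here is a hypothetical query run from the AWS CLI against the ECS performance log group (the cluster name is a placeholder, and the field names assume the ECS performance event schema):

```shell
# Top 10 ECS tasks by average CPU over the last 15 minutes,
# from Container Insights performance logs. "my-cluster" is a placeholder.
aws logs start-query \
    --log-group-name "/aws/ecs/containerinsights/my-cluster/performance" \
    --start-time $(($(date +%s) - 900)) \
    --end-time $(date +%s) \
    --query-string 'stats avg(CpuUtilized) as avg_cpu by TaskId | sort avg_cpu desc | limit 10'
```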

Container Insights makes it easy for me to get started monitoring my containers and enables me to quickly drill down into performance metrics and log analytics without the need to build custom dashboards to curate data from multiple tools. Beyond monitoring and troubleshooting I can also use the data and dashboards Container Insights provides me to support other use cases such as capacity requirements and planning, by helping me understand compute utilization by Pod/Task, Container, and Service for example.

Availability
Amazon CloudWatch Container Insights is generally available today to customers in all public AWS regions where Amazon Elastic Container Service for Kubernetes, Kubernetes, Amazon ECS, and AWS Fargate are present.

— Steve

from AWS News Blog https://aws.amazon.com/blogs/aws/operational-insights-for-containers-and-containerized-applications/

New – Port Forwarding Using AWS Systems Manager Session Manager

I increasingly see customers adopting the immutable infrastructure architecture pattern: they rebuild and redeploy an entire infrastructure for each update. They very rarely connect to servers over SSH or RDP to update configuration or to deploy software updates. However, when migrating existing applications to the cloud, it is common to connect to your Amazon Elastic Compute Cloud (EC2) instances to perform a variety of management or operational tasks. To reduce the attack surface, AWS recommends using a bastion host, also known as a jump host. This special-purpose EC2 instance is designed to be the primary access point from the Internet and acts as a proxy to your other EC2 instances. To connect to your EC2 instance, you first SSH / RDP into the bastion host and, from there, connect to the destination EC2 instance.

To further reduce the attack surface, as well as the operational burden and additional costs of managing bastion hosts, AWS Systems Manager Session Manager allows you to securely connect to your EC2 instances without the need to run and operate your own bastion hosts, and without the need to run SSH on your EC2 instances. When the Systems Manager Agent is installed on your instances and you have IAM permissions to call the Systems Manager API, you can use the AWS Management Console or the AWS Command Line Interface (CLI) to securely connect to your Linux or Windows EC2 instances.

An interactive shell on EC2 instances is not the only use case for SSH. Many customers also use SSH tunnels to remotely access services not exposed to the public internet. SSH tunneling is a powerful but lesser-known feature of SSH that allows you to create a secure tunnel between a local host and a remote service. Let’s imagine I am running a web server for easy private file transfer between an EC2 instance and my laptop. These files are private and I do not want anybody else to access that web server, so I configure my web server to bind only on 127.0.0.1 and I do not add port 80 to the instance’s security group. Only local processes can access the web server. To access the web server from my laptop, I create an SSH tunnel between my laptop and the web server, as shown below.
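The tunnel command looks like the following (instance is a placeholder for the instance’s hostname or IP address):

```shell
# Open local port 9999 and forward it to port 80 on the instance.
# -N tells SSH not to run a remote command, just forward traffic.
ssh -N -L 9999:localhost:80 ec2-user@instance
```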

This command tells SSH to connect to instance as user ec2-user, open port 9999 on my local laptop, and forward everything from there to localhost:80 on instance. When the tunnel is established, I can point my browser at http://localhost:9999 to connect to my private web server on port 80.

Today, we are announcing Port Forwarding for AWS Systems Manager Session Manager. Port Forwarding allows you to securely create tunnels to your instances deployed in private subnets, without the need to start an SSH service on the server, to open the SSH port in the security group, or to use a bastion host.

Similar to SSH tunnels, Port Forwarding allows you to forward traffic from your laptop to open ports on your instance. Once port forwarding is configured, you can connect to the local port and access the server application running inside the instance. Use of Systems Manager Session Manager’s Port Forwarding is controlled through IAM policies on API access and on the Port Forwarding SSM Document. These are two different places where you can control who in your organisation is authorised to create tunnels.
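For example, an identity-based policy along the following lines (a simplified sketch, not a complete least-privilege policy) allows sessions only when they are started through the port forwarding document:

```shell
# Write a sketch of an IAM policy restricting ssm:StartSession to the
# AWS-StartPortForwardingSession document (instance scope left broad here).
cat > allow-port-forwarding.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "ssm:StartSession",
    "Resource": [
      "arn:aws:ec2:*:*:instance/*",
      "arn:aws:ssm:*:*:document/AWS-StartPortForwardingSession"
    ]
  }]
}
EOF
```

In a real policy you would typically scope the instance ARNs down, for example with resource tags.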

To experiment with Port Forwarding today, you can use this CDK script to deploy a VPC with private and public subnets, and a single instance running a web server in the private subnet. The drawing below illustrates the infrastructure that I am using for this blog post.

The instance is private: it has neither a public IP address nor a DNS name. The VPC default security group does not authorise connection over SSH. The Systems Manager Agent, running on your EC2 instance, must be able to communicate with the Systems Manager service endpoint. The private subnet must therefore have a route to a NAT Gateway, or you must configure AWS PrivateLink to allow this.

Let’s use Systems Manager Session Manager Port Forwarding to access the web server running on this private instance.

Before doing so, you must ensure the following prerequisites are met on the EC2 instance:

  • The Systems Manager Agent must be installed and running (version 2.3.672.0 or more recent; see the instructions for Linux or Windows). The agent is installed and started by default on the Amazon Linux 1 & 2, Windows, and Ubuntu AMIs provided by Amazon (see this page for the exact versions), so no action is required when you are using these AMIs.
  • The EC2 instance must have an IAM role with permission to invoke the Systems Manager API. For this example, I am using AmazonSSMManagedInstanceCore.

On your laptop, you must have the AWS Command Line Interface (CLI) installed, as well as the Session Manager plugin for the AWS CLI.

Once the prerequisites are met, you use the AWS Command Line Interface (CLI) to create the tunnel (assuming you started the instance using the CDK script):

# find the instance ID based on Tag Name
INSTANCE_ID=$(aws ec2 describe-instances \
               --filter "Name=tag:Name,Values=CodeStack/NewsBlogInstance" \
               --query "Reservations[].Instances[?State.Name == 'running'].InstanceId[]" \
               --output text)
# create the port forwarding tunnel
aws ssm start-session --target $INSTANCE_ID \
                       --document-name AWS-StartPortForwardingSession \
                       --parameters '{"portNumber":["80"],"localPortNumber":["9999"]}'

Starting session with SessionId: sst-00xxx63
Port 9999 opened for sessionId sst-00xxx63
Connection accepted for session sst-00xxx63.

You can now point your browser to port 9999 and access your private web server. Type ctrl-c to terminate the port forwarding session.

The Session Manager Port Forwarding creates a tunnel similar to SSH tunneling, as illustrated below.

Port Forwarding works for Windows and Linux instances. It is available in every public AWS region today, at no additional cost when connecting to EC2 instances; you will, however, be charged for the outgoing bandwidth from the NAT Gateway or your VPC PrivateLink endpoint.

— seb

from AWS News Blog https://aws.amazon.com/blogs/aws/new-port-forwarding-using-aws-system-manager-sessions-manager/

New – Client IP Address Preservation for AWS Global Accelerator

AWS Global Accelerator is a network service that routes incoming network traffic to multiple AWS regions in order to improve performance and availability for your global applications. It makes use of our collection of edge locations and our congestion-free global network to direct traffic based on application health, network health, and the geographic locations of your users, and provides a set of static Anycast IP addresses that are announced from multiple AWS locations (read New – AWS Global Accelerator for Availability and Performance to learn a lot more). The incoming TCP or UDP traffic can be routed to an Application Load Balancer, Network Load Balancer, or to an Elastic IP Address.

Client IP Address Preservation
Today we are announcing an important new feature for AWS Global Accelerator. If you are routing traffic to an Application Load Balancer, the IP address of the user’s client is now available to code running on the endpoint. This allows you to apply logic that is specific to a particular IP address. For example, you can use security groups that filter based on IP address, and you can serve custom content to users based on their IP address or geographic location. You can also use the IP addresses to collect more accurate statistics on the geographical distribution of your user base.

Using Client IP Address Preservation
If you are already using AWS Global Accelerator, we recommend that you phase in your use of Client IP Address Preservation by using weights on the endpoints. This will allow you to verify that any rules or systems that make use of IP addresses continue to function as expected.

In order to test this new feature, I launched some EC2 instances, set up an Application Load Balancer, put the instances into a target group, and created an accelerator in front of my ALB:

I checked the IP address of my browser:

I installed a simple Python program (courtesy of the Global Accelerator team), sent an HTTP request to one of the Global Accelerator’s IP addresses, and captured the output:

The Source (99.82.172.36) is an internal address used by my accelerator. With my baseline established and everything working as expected, I am now ready to enable Client IP Address Preservation!

I open the AWS Global Accelerator Console, locate my accelerator, and review the current configuration, as shown above. I click the listener for port 80, and click the existing endpoint group:

From there I click Add endpoint, add a new endpoint to the group, use a Weight of 255, and select Preserve client IP address:

My endpoint group now has two endpoints (one with client IP preserved, and one without), both of which point to the same ALB:

In a production environment I would start with a low weight and test to make sure that any security groups or other logic that was dependent on IP addresses continue to work as expected (I can also use the weights to manage traffic during blue/green deployments and software updates). Since I’m simply testing, I can throw caution to the wind and delete the old (non-IP-preserving) endpoint. Either way, the endpoint change becomes effective within a couple of minutes, and I can refresh my test window:

Now I can see that my code has access to the IP address of the browser (via the X-Forwarded-For header) and I can use it as desired. I can also use this IP address in security group rules.

To learn more about best practices for switching over, read Transitioning Your ALB Endpoints to Use Client IP Address Preservation.

Things to Know
Here are a couple of important things to know about client IP preservation:

Elastic Network Interface (ENI) Usage – The Global Accelerator creates one ENI for each subnet that contains IP-preserving endpoints, and will delete them when they are no longer required. Don’t edit or delete them.

Security Groups – The Global Accelerator creates and manages a security group named GlobalAccelerator. Again, you should not edit or delete it.

Available Now
You can enable this new feature for Application Load Balancers in the US East (N. Virginia), US East (Ohio), US West (Oregon), US West (N. California), Europe (Ireland), Europe (Frankfurt), Europe (London), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Seoul), Asia Pacific (Mumbai), and Asia Pacific (Sydney) Regions.

Jeff;

from AWS News Blog https://aws.amazon.com/blogs/aws/new-client-ip-address-preservation-for-aws-global-accelerator/

Managed Spot Training: Save Up to 90% On Your Amazon SageMaker Training Jobs

Amazon SageMaker is a fully-managed, modular machine learning (ML) service that enables developers and data scientists to easily build, train, and deploy models at any scale. With a choice of using built-in algorithms, bringing your own, or choosing from algorithms available in AWS Marketplace, it’s never been easier and faster to get ML models from experimentation to scale-out production.

One of the key benefits of Amazon SageMaker is that it frees you of any infrastructure management, no matter the scale you’re working at. For instance, instead of having to set up and manage complex training clusters, you simply tell Amazon SageMaker which Amazon Elastic Compute Cloud (EC2) instance type to use, and how many you need: the appropriate instances are then created on-demand, configured, and terminated automatically once the training job is complete. As customers have quickly understood, this means that they will never pay for idle training instances, a simple way to keep costs under control.

Introducing Managed Spot Training
Going one step further, we’re extremely happy to announce Managed Spot Training for Amazon SageMaker, a new feature based on Amazon EC2 Spot Instances that will help you lower ML training costs by up to 90% compared to using on-demand instances in Amazon SageMaker. Launched almost 10 years ago, Spot Instances have since been one of the cornerstones of building scalable and cost-optimized IT platforms on AWS. Starting today, not only will your Amazon SageMaker training jobs run on fully-managed infrastructure, they will also benefit from fully-managed cost optimization, letting you achieve much more with the same budget. Let’s dive in!

Managed Spot Training is available in all training configurations.

Setting it up is extremely simple, as it should be when working with a fully-managed service:

  • If you’re using the console, just switch the feature on.
  • If you’re working with the Amazon SageMaker SDK, just set the train_use_spot_instances parameter to True in the Estimator constructor.

That’s all it takes: do this, and you’ll save up to 90%. Pretty cool, don’t you think?

Interruptions and Checkpointing
There’s an important difference when working with Managed Spot Training. Unlike on-demand training instances that are expected to be available until a training job completes, Managed Spot Training instances may be reclaimed at any time if we need more capacity.

With Amazon Elastic Compute Cloud (EC2) Spot Instances, you would receive a termination notification 2 minutes in advance, and would have to take appropriate action yourself. Don’t worry, though: as Amazon SageMaker is a fully-managed service, it will handle this process automatically, interrupting the training job, obtaining adequate spot capacity again, and either restarting or resuming the training job. This makes Managed Spot Training particularly interesting when you’re flexible on job starting time and job duration. You can also use the MaxWaitTimeInSeconds parameter to control the total duration of your training job (actual training time plus waiting time).

To avoid restarting a training job from scratch should it be interrupted, we strongly recommend that you implement checkpointing, a technique that saves the model in training at periodic intervals. Thanks to this, you can resume a training job from a well-defined point in time, continuing from the most recent partially trained model:

  • Built-in frameworks and custom models: you have full control over the training code. Just make sure that you use the appropriate APIs to save model checkpoints to Amazon Simple Storage Service (S3) regularly, using the location you defined in the CheckpointConfig parameter and passed to the SageMaker Estimator. Please note that TensorFlow uses checkpoints by default. For other frameworks, you’ll find examples in our sample notebooks and in the documentation.
  • Built-in algorithms: computer vision algorithms support checkpointing (Object Detection, Semantic Segmentation, and very soon Image Classification). As they tend to train on large data sets and run for longer than other algorithms, they have a higher likelihood of being interrupted. Other built-in algorithms do not support checkpointing for now.

Alright, enough talk, time for a quick demo!

Training a Built-in Object Detection Model with Managed Spot Training
Reading from this sample notebook, let’s use the AWS console to train the same job with Managed Spot Training instead of on-demand training. As explained before, I only need to take care of two things:

  • Enable Managed Spot Training (obviously).
  • Set MaxWaitTimeInSeconds.

First, let’s name our training job, and make sure it has appropriate AWS Identity and Access Management (IAM) permissions (no change).

Then, I select the built-in algorithm for object detection.

Then, I select the instance count and instance type for my training job, making sure I have enough storage for the checkpoints.

The next step is to set hyperparameters, and I’ll use the same ones as in the notebook. I then define the location and properties of the training data set.

I do the same for the validation data set.

I also define where model checkpoints should be saved. This is where Amazon SageMaker will pick them up to resume my training job should it be interrupted.

This is where the final model artifact should be saved.

Good things come to those who wait! This is where I enable Managed Spot Training, configuring a very relaxed 48 hours of maximum wait time.
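For reference, the console steps above correspond to a single CreateTrainingJob API call; the sketch below shows a CLI equivalent (the role ARN, training image, and S3 paths are placeholders, and the input channel definitions are assumed to live in a channels.json file):

```shell
# The two settings that matter for Managed Spot Training are the last two flags:
# --enable-managed-spot-training and MaxWaitTimeInSeconds (48 hours = 172800 seconds).
aws sagemaker create-training-job \
    --training-job-name object-detection-spot-demo \
    --role-arn arn:aws:iam::123456789012:role/SageMakerRole \
    --algorithm-specification TrainingImage=<object-detection-image-uri>,TrainingInputMode=File \
    --input-data-config file://channels.json \
    --output-data-config S3OutputPath=s3://my-bucket/output/ \
    --resource-config InstanceType=ml.p3.2xlarge,InstanceCount=1,VolumeSizeInGB=50 \
    --checkpoint-config S3Uri=s3://my-bucket/checkpoints/ \
    --enable-managed-spot-training \
    --stopping-condition MaxRuntimeInSeconds=86400,MaxWaitTimeInSeconds=172800
```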

I’m done, let’s train this model. Once training is complete, cost savings are clearly visible in the console.

As you can see, my training job ran for 2423 seconds, but I’m only billed for 837 seconds, saving 65% thanks to Managed Spot Training! While we’re on the topic, let me explain how pricing works.
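The savings figure is easy to verify from the two durations reported by the console:

```shell
# Recompute the savings shown in the console for this training job.
actual_seconds=2423
billable_seconds=837
savings=$(awk "BEGIN { printf \"%.0f\", (1 - $billable_seconds / $actual_seconds) * 100 }")
echo "Managed Spot Training savings: ${savings}%"
```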

Pricing
A Managed Spot training job is priced for the duration for which it ran before it completed, or before it was terminated.

For built-in algorithms and AWS Marketplace algorithms that don’t use checkpointing, we’re enforcing a maximum training time of 60 minutes (MaxWaitTimeInSeconds parameter).

Last but not least, no matter how many times the training job restarts or resumes, you only get charged for data download time once.

Now Available!
This new feature is available in all regions where Amazon SageMaker is available, so don’t wait and start saving now!

As always, we’d love to hear your feedback: please post it to the AWS forum for Amazon SageMaker, or send it through your usual AWS contacts.

Julien;

from AWS News Blog https://aws.amazon.com/blogs/aws/managed-spot-training-save-up-to-90-on-your-amazon-sagemaker-training-jobs/

Amazon Transcribe Now Supports Mandarin and Russian

As speech is central to human interaction, artificial intelligence research has long focused on speech recognition, the first step in designing and building systems allowing humans to interact intuitively with machines. The diversity in languages, accents and voices makes this an incredibly difficult problem, requiring expert skills, extremely large data sets, and vast amounts of computing power to train efficient models.

In order to help organizations and developers use speech recognition in their applications, we launched Amazon Transcribe at AWS re:Invent 2017, an automatic speech recognition service. Thanks to Amazon Transcribe, customers such as VideoPeel, Echo360, or GE Appliances have been able to quickly and easily add speech recognition capabilities to their applications and devices.

A single API call is all that it takes… and you don’t need to know the first thing about machine learning. You can analyze audio files stored in Amazon Simple Storage Service (S3) and have the service return a text file of the transcribed speech. You can also send a live audio stream to Amazon Transcribe and receive a stream of transcripts in real time.

Since launch, the team has constantly added new languages, and today we are happy to announce support for Mandarin and Russian, bringing the total number of supported languages to 16.

Introducing Mandarin
Working with Amazon Transcribe is extremely simple: let me show you how to get started in just a few minutes.

Let’s try Mandarin first. Starting from this Little Red Riding Hood video, I extracted the audio track, saved it in MP3 format, and uploaded it to one of my Amazon Simple Storage Service (S3) buckets. Here’s the actual file.

Then, I started a transcription job using the AWS CLI:

$ aws transcribe start-transcription-job --media MediaFileUri=https://s3-us-west-2.amazonaws.com/jsimon-transcribe-demo/little_red_riding_hood-mandarin.mp3 --media-format mp3 --language-code zh-CN --transcription-job-name little_red_riding_hood-mandarin

After a few minutes, the job is complete. Looking at the AWS console, I can either download it using the URL provided by Amazon Transcribe, or read it directly.
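From the AWS CLI, the same transcript URL can be retrieved once the job completes; for example:

```shell
# Fetch the S3 URL of the finished transcript for the Mandarin job.
aws transcribe get-transcription-job \
    --transcription-job-name little_red_riding_hood-mandarin \
    --query "TranscriptionJob.Transcript.TranscriptFileUri" \
    --output text
```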

Unfortunately, I don’t speak Mandarin, but using Amazon Translate, this text is about a sick grandmother and a big bad wolf, so it looks like Amazon Transcribe did its job!

Introducing Russian
Let’s try Russian now, using the dialogue in this short video.

Здравствуйте! Greetings!
Добрый день! Good day!
Давайте познакомимся. Меня зовут Слава. Let’s introduce ourselves. My name is Slava.
Очень приятно, а меня – Наташа. Nice to meet you, and mine – Natasha.
Наташа, кто вы по профессии? Natasha, what is your profession?
Я врач. А вы? I (am a) doctor. And you?
Я инженер. I (am an) engineer.

This time, I will ask Amazon Transcribe to perform speaker identification too.

$ aws transcribe start-transcription-job --media MediaFileUri=https://s3-us-west-2.amazonaws.com/jsimon-transcribe-demo/russian-dialogue.mp3 --media-format mp3 --language-code ru-RU --transcription-job-name russian_dialogue --settings ShowSpeakerLabels=true,MaxSpeakerLabels=2

Here is the result.

As you can see, not only has Amazon Transcribe faithfully converted speech to text, it has also correctly assigned each sentence to the correct speaker.

Now Available!
You can start using these two new languages today in the following regions:

  • Americas: US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), AWS GovCloud (US-West), Canada (Central), South America (Sao Paulo).
  • Europe: EU (Frankfurt), EU (Ireland), EU (London), EU (Paris).
  • Asia Pacific: Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney).

The free tier covers 60 minutes per month for the first 12 months, starting from your first transcription request.

As always, we’d love to hear your feedback: please post it to the AWS forum for Amazon Transcribe, or send it through your usual AWS contacts.

Julien;

from AWS News Blog https://aws.amazon.com/blogs/aws/amazon-transcribe-now-supports-mandarin-and-russian/