Demystifying Apple Low-Latency HTTP Live Streaming

The latest entrant to the world of low latency over-the-top (OTT) streaming is Apple’s draft specification: Low-Latency HLS. This blog post explores some of the features and nuances surrounding the new HTTP live streaming format and is purely informational in nature.

Background

OTT streaming, that is, video delivered over the internet, has dramatically improved in quality over the past decade. In fact, OTT is now seen as the dominant format for UltraHD experiences compared to traditional broadcast distribution such as satellite, cable and terrestrial. However, one area where traditional broadcast still trumps OTT is latency. Latency is the amount of time that passes between some action occurring in front of a camera lens and that action being displayed to a viewer. Traditional broadcast has this number pegged down to between 8 and 10 seconds. Historically, however, OTT streams have been between 40 and 60 seconds behind — sometimes even higher.

The vast difference really comes down to three components: intrinsic latency, network latency and forward buffer latency.

Intrinsic latency comes from the production of the video streams themselves. Things like contribution from location to studio, graphic compositing, commentary and more all add up to delay in the presentation before it’s even distributed to consumers. This is usually in the order of mid single digit seconds and is equal to the latency experienced in legacy broadcast means such as FM radio. Getting to this latency for OTT is seen as the holy grail.

Network latency is determined by the least efficient component in the network stack between the client and the origin. This includes things like the protocols themselves (TCP, HTTP, HLS/DASH, etc.), the round trip times, the bandwidth available and the CDN caching behaviors.

Forward buffer latency is how much latency your player chooses to start playback with. From a quality perspective this is good latency, as it acts as a buffer against adverse network conditions. The more content that exists in the forward buffer, the less likely a consumer is to experience a re-buffering event.

Reducing OTT latency is a complex problem with many compromises required to make it work at scale. Sacrificing forward buffer means that OTT operators must be willing to also sacrifice at least some quality; improving network latency can compromise scalability and robustness; and reducing intrinsic latency means a lot of complexity and cost in the production workflow.

Compared to MPEG-DASH, HLS has had significant drawbacks when it comes to both network latency and forward buffer latency.

Because HLS clients poll a server to discover new segments, an inherent time-cost is unavoidable. If the poll request to the CDN is cached (which it almost universally is), it translates to a latency equal to the TTL (time to live) value configured for this type of request in the CDN. What’s more, because of the back and forth between client and server in this polling behavior, a lot of valuable time is consumed simply in the chattiness of the protocol.

On the forward buffer side, clients that are following the HLS RFC should start their playback at a point that is 3 segments behind the last available segment. If your segment duration is 6 seconds, this equates to a very healthy forward buffer of 18 seconds but in turn a massive amount of latency.

Low Latency to Date

There are already a few solutions available to reduce latency. In fact, one such open source community solution is an adaptation of the existing HLS specification simply named LHLS.

Another comes from the DASH Industry Forum which leverages the SegmentTemplate mode of MPEG-DASH along with MP4 Fragments with HTTP chunked transfer encoding.

Underpinning both solutions is the ability for the client-side player to request a segment that is still being encoded. In other words, the player is able to decode the top of the segment while the tail of the segment is still being generated. This is especially useful because even if your segment duration is 6 seconds, it allows a latency of, say, 2 seconds or less.

Perhaps the most important part of the Apple-defined Low-Latency HLS is that incomplete segments should never be made available to the client. Instead, the parts of the segment should be individually signaled in the playlist — I will detail this later.

Apple defined Low-Latency HLS

Low-Latency HLS, a draft specification, is effectively a suite of changes that predominantly address the network latency part of the problem. It introduces four broad changes.

Importantly, the specification is fully backwards compatible with previous versions of HLS; however, clients will need to be running iOS 13 and above to leverage the new features.

First, the server signals to the client that it has been configured to support low latency features with EXT-X-SERVER-CONTROL along with some attributes to define the exact functions available.

Next, the client can choose to send any configuration of query string parameters to modify the resulting playlist at the server side. This is a huge departure for HLS and will have long-lasting ramifications in the industry if adopted, and here’s why:

Historically, HLS origins and CDNs were immune to query string parameters. That is, a client could send anything in the query string of the HTTP request and it would not affect the response from the server. This has led to a fairly prolific live streaming architecture whereby the encoder and/or packager pre-publishes and pushes m3u8 files directly to the CDN, which in turn is configured to simply ignore any query strings. This meant that the m3u8 files were the same for all viewers and very easily cacheable. This is no longer the case.

Apple have reserved any query string keys beginning with _HLS_. This means that in order to support Apple’s Low-Latency HLS, your CDN will need to include any query strings that match this pattern in the object’s cache key.

You will also need to run server-side software to modify the HLS manifest files as instructed by the query string parameters from the client. Apple has made a reference PHP library available for this purpose (note: you need to be a part of the Apple Developer Program to access this library).

Now on to the actual features:

1. Reduce Publishing Latency

If your encoder has the ability to publish sub-segments (either CMAF chunks or partial TS files, which Apple calls partial segments) to an origin, HLS now has the ability to address these partial segments discretely. This means that you may keep your full segment duration at the recommended 6 seconds but signal the availability of the parts that constitute the segment currently being created by the encoder.

This is signaled in the playlist like so:

fileSequence272.ts
#EXT-X-PART:DURATION=0.33334,URI="filePart273.0.ts",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.33334,URI="filePart273.1.ts"
#EXT-X-PART:DURATION=0.33334,URI="filePart273.2.ts"
#EXT-X-PART:DURATION=0.33334,URI="filePart273.3.ts"

#EXT-X-RENDITION-REPORT:URI="../1M/waitForMSN.php",LAST-MSN=273,LAST-PART=3
#EXT-X-RENDITION-REPORT:URI="../4M/waitForMSN.php",LAST-MSN=273,LAST-PART=3

In the above example, the full segment fileSequence273.ts is not yet available in its entirety; however, every 333ms a new partial segment at the bleeding edge of the current full segment is signaled. This requires a packager that is capable of generating m3u8 updates and media files at least as frequently as the duration of the partial segments.

One approach might be to simply stream each segment as it is being encoded, using HTTP/1.1 chunked transfer encoding, to an origin and write some server-side AWS Lambda functions to generate the m3u8s referencing the #EXT-X-PARTs with byte-range offsets as the locations rather than discrete files as in the Apple example above.
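For illustration, such a playlist might address each part of the in-progress segment by byte range within a single growing file rather than as discrete part files (a hypothetical sketch; the byte counts are made up):

#EXT-X-PART:DURATION=0.33334,URI="fileSequence273.ts",BYTERANGE="20000@0",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.33334,URI="fileSequence273.ts",BYTERANGE="21500@20000"
#EXT-X-PART:DURATION=0.33334,URI="fileSequence273.ts",BYTERANGE="19800@41500"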

There still exists a very big problem: the client will still poll for segment availability. If the CDN caches this request for, say, 2 seconds, then having a sub-segment or partial segment duration of anything less than 2 seconds is wasted. To this end, the client can now send two sets of query strings in order to achieve two important things:

  • bust the cache to get a new m3u8 from origin and;
  • instruct the origin not to respond with a 404 if the partial segment is unavailable, but instead hold the connection open for up to 3x segment durations until the requested manifest sequence is available.

This is done with a combination of:

  • _HLS_msn=<N> which instructs the origin that the client is only interested in a playlist that contains media sequence number N and;
  • _HLS_part=<M> which instructs the origin that the client is only interested in a playlist that contains part M of media sequence N.

For example: …/playlist.m3u8?_HLS_msn=100&_HLS_part=4 instructs the CDN and origin not to respond to this request until a playlist is available that contains part 4 of media sequence 100. Only when the playlist contains media sequence 100 and at least part 4 of that in-progress segment, then and only then, should the server respond.
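As a rough sketch of what this blocking behavior could look like at the origin (a minimal Node.js/Express illustration, not Apple’s reference implementation; getCurrentPlaylist and playlistContains are hypothetical helpers backed by your packager’s state):

const express = require('express');

// Hypothetical helpers: return the latest playlist text for this rendition and
// check whether it already contains a given media sequence number and part.
const { getCurrentPlaylist, playlistContains } = require('./playlistStore');

const app = express();
const TARGET_DURATION_MS = 6000;
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

app.get('/playlist.m3u8', async (req, res) => {
  const msn = parseInt(req.query._HLS_msn, 10);
  const part = parseInt(req.query._HLS_part, 10);

  // No delivery directives: behave like a classic HLS origin and answer immediately.
  if (Number.isNaN(msn)) {
    return res.type('application/vnd.apple.mpegurl').send(getCurrentPlaylist());
  }

  // Block the response until the playlist contains the requested MSN (and part),
  // holding the connection open for at most three target durations.
  const deadline = Date.now() + 3 * TARGET_DURATION_MS;
  while (Date.now() < deadline) {
    if (playlistContains(msn, Number.isNaN(part) ? 0 : part)) {
      return res.type('application/vnd.apple.mpegurl').send(getCurrentPlaylist());
    }
    await sleep(100); // re-check local playlist state, not the network
  }
  res.status(503).send('Requested media sequence is not yet available');
});

app.listen(8080);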

This is a fascinating concept and quite flexible. I can see encoders, packagers, CDNs and more racing to implement these features. Having a connection held open for multiple seconds, however… that is something that remains to be seen at large scale. This is not expected behavior for many lower-level networking stacks, and some may close the connection earlier than intended. In fact, most CDNs will assume that the origin is too slow and give up if the connection is held open with little or no data transfer. This is something that will need to be comprehensively tested.

2. Eliminate Segment Round Trip

Apple have now also introduced HTTP/2 server push into the HLS specification. Essentially, the client passes another query string, this time the boolean _HLS_push=1/0, indicating whether or not the most recent partial segment at the bottom of the m3u8 should be pushed in parallel with the m3u8 response.

Historically, the HLS client would need to download the full m3u8 response, enumerate all the sequences and then only after that make a separate HTTP request for the segment, often wasting several hundred milliseconds in the process.

CDNs will need to support HTTP/2 push and be able to intrinsically understand which object to push alongside a cached m3u8 response for this feature to be of significant value.

3. Reduce Playlist Transfer Overhead

Another long-standing issue with HLS live streams is that of perpetually growing manifest files. If you are, say, publishing a Test Match cricket game that lasts on average 8 hours with 2-second segments, that equates to roughly 14,400 segments, each with 3-4 attributes, by the 8th hour. The resulting multi-hundred-kilobyte file, even with gzip, can take significant time to download, which is compounded by the update frequency of every 2 seconds.

Imagine how this issue would balloon with the introduction of partial segments every 300ms.

That is clearly why you can now signal via the query string _HLS_skip=YES, which instructs the server to only send the delta from the last playlist to now. The resulting manifest will insert #EXT-X-SKIP:SKIPPED-SEGMENTS=3 in lieu of the actual segments (in this case 3). The EXT-X-SERVER-CONTROL attribute CAN-SKIP-UNTIL= should also be set to a horizon of no less than 6 segments from the live edge.
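To illustrate, the top of a skipped (delta) playlist response could look something like the following sketch; the sequence numbers, durations and CAN-SKIP-UNTIL horizon (six 6-second segments) are illustrative only:

#EXTM3U
#EXT-X-TARGETDURATION:6
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,CAN-SKIP-UNTIL=36.0
#EXT-X-MEDIA-SEQUENCE:266
#EXT-X-SKIP:SKIPPED-SEGMENTS=3
#EXTINF:6.00000,
fileSequence269.ts
#EXTINF:6.00000,
fileSequence270.ts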

4. Switch Tiers Quickly

Lastly, Apple have now also introduced a new method for HLS clients to switch between representations more quickly. Previously, the representation switch process required a very healthy forward buffer; the Apple HLS client would, once comfortable, start to seek out higher quality representations. If the forward buffer dropped below a certain number of seconds, the client would fall back to a lower quality representation in the hope of growing this forward buffer.

This logic breaks when the forward buffer is only a few hundred milliseconds, as there is no time to run through the above logic before a playback stall (colloquially known as a rebuffer).

To this end, you can now include ?_HLS_report=/other/manifest.m3u8 in the request. This can be used to include the segment availability hints of representations adjacent to the one currently being requested. The server should then include these hints inline with the requested playlist via the #EXT-X-RENDITION-REPORT tag at the bottom of the playlist. The attributes should include at least the currently available segment media sequence number and the part number.

Switching boundaries are defined via the INDEPENDENT=YES attribute.

Additional Details

iOS Clients

Apple have provided some client-side APIs: one to control the forward buffer latency (and therefore the estimated latency) and another to control the behavior when recovering from a stall (i.e. a re-buffer event). For ultimate low-latency applications, you should now instruct your iOS clients to catch up to the live edge and discard the missed frames after a re-buffer event.

The specification is at a preliminary stage, meaning that you will need to test it thoroughly before going to production. During the presentation, Roger Pantos indicated that the specification should be rolled into the wider HLS RFC later this year.

CDN requirements

Almost no CDN that I am aware of will include query strings in the cache key when configured for HLS delivery. This means that, at least for testing, the CDN should be taken out of the architecture and its configuration worked on in parallel to ensure compliance. As the specification is still in a beta period with Apple anyway, your app will not make it through App Store validation yet.

In Amazon CloudFront, changing the query string caching behavior simply requires two rules to be updated:
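For example, using the legacy cache settings, the relevant portion of a cache behavior might look something like the following sketch (values are illustrative; adjust for your own distribution):

"ForwardedValues": {
    "QueryString": true,
    "QueryStringCacheKeys": {
        "Quantity": 5,
        "Items": ["_HLS_msn", "_HLS_part", "_HLS_skip", "_HLS_push", "_HLS_report"]
    }
}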

Origin Requirements

Apple have made available a server-side PHP script that encapsulates the server-side features required for HLS Low-Latency. This is available through the Apple Developer Program.

from AWS Media Blog

AWS Media Tech Demo: Media Services Application Mapping Visualization and Monitoring

Whether you’re managing a live event or a 24/7 channel, you want to figure out which part of your workflow needs attention as fast as possible. A highly-available, resilient live streaming workflow should keep providing viewers a working, high-quality experience, even if some of the elements that make up the workflow are degraded or failing.

This is achieved with a well-architected cloud solution which has distributed services, redundancy, and auto-scaling.

This means there is a need for a top-down view of the entire workflow and how it’s connected. Ideally, a single error should not impact the live streaming output. This makes it important to ensure there is robust monitoring and alerting across the workflow, so operational teams can be made aware of problems and action can be taken to fix issues before they get worse, or multiply to the point where your audience experience is impacted.

Without such a view, time is wasted in the console matching URLs, ARNs, and IDs to figure out where an issue stemmed from.

AWS created the Media Services Application Mapper (MSAM) solution, which displays all the elements making up a live channel and the logical connections between AWS services, media services, and other components. It provides a holistic view of a live channel, visualizes error messages and alert counts, and produces a list of confidence-ranked root causes for problematic workflows. MSAM reduces the time it takes to diagnose and troubleshoot errors by providing flexible alarm configurations and is customizable to fit the requirements of operational support teams and network operation centers.

Watch the video below to see the MSAM solution demonstration, then head over to the Solutions page to find out more.

Additional Resources:

  1. MSAM GitHub
  2. Introduction to Top-Down Monitoring of Live Video Workflows | AWS Media Blog

from AWS Media Blog

AWS for Visual Effects Explained

Visual effects (VFX) help drive storytelling, whether through eye-catching intergalactic showdowns or subtle CGI that transports modern viewers to bygone eras and locations. VFX artists pour countless hours into crafting and perfecting shots, often iterating on data-heavy files up until the last minute before final export. Cloud-based technology removes many of the traditional physical and logistical limitations facing VFX studios, and enables creative visionaries to instead focus on content creation.

The high availability, spiky resource demands and timebound nature of VFX make AWS the perfect solution for creating, storing, and rendering projects in the cloud. Multiple VFX studios of varied sizes use the combination of AWS render farm management software and compute resources to increase their capacity and agility at scale, and to support the creative vision of their clients – while paying only for the compute resources they use.

For example, Milk Visual Effects used the cloud to scale its render farm 10x, largely with Spot Instances, to create compute-heavy simulations that resulted in stormy ocean sequences for the feature film “Adrift.” FuseFX scaled its render farm 10x using AWS Spot Instances for “The Orville” to hit tight deadlines and deliver complex VFX sequences. Untold Studios built its entire pipeline on the AWS cloud, embracing the “Studio In The Cloud” concept by leveraging AWS infrastructure. Tangent Animation rendered more than 65 percent of the animated Netflix feature “Next Gen” on the AWS cloud, with more than three million render hours completed in just 30 days. Barnstorm VFX has used AWS to scale the compute power of its on-premises render farm 77x for shows like “The Man in the High Castle.” Every day, more VFX facilities request information and commence tests using AWS cloud services such as virtual workstations, cloud rendering, and tiered storage; from immediate hot data all the way to cold S3 Glacier long-term archive storage.

With any new technology adoption, especially in VFX, the proof is in the results. AWS Thinkbox is on the front lines of the shift to the cloud, helping studios architect new, more efficient and flexible ways of creating VFX with the near-infinite resources of AWS. This enables studios to scale their compute and talent, and ultimately deliver better results. Using the AWS cloud enables VFX facilities to speed up their workflows and present more iterations to a client faster. This accelerated workflow also allows facilities to fully embrace the creative process and to take on more work, even with overlapping deadlines. VFX has always been a tightly budgeted, thin-margin business, and with AWS cloud support, facilities can successfully take on more work and become more profitable, since talent will be able to spend more time creating and less time waiting for results to render.

It has long been standard practice for studios to call a VFX company and check if it has “capacity” for a certain project, and it’s becoming increasingly common for studios to also ask if that company has a facility in an “incentivized location.” It is only a matter of time until studios start to inquire if the company is “cloud-enabled” and can therefore utilize near-unlimited compute resources, should a show, specific shot, or overall schedule change require it.

Will McDonald, Head of Business Development for AWS Thinkbox, recently spoke to VFX Voice about some of the most common questions he’s encountered in this transformative time. Building on his insight, here’s a closer look at how AWS is empowering studios to rethink traditional VFX workflows in favor of a more elastic alternative in the cloud.

Here is a brief overview of relevant terminology:

  • Amazon EC2 (Amazon Elastic Compute Cloud) – A scalable computing capacity in the Amazon Web Services (AWS) cloud, Amazon EC2 can be used to launch as many or as few virtual servers as needed, configure security and networking, and manage storage.
  • Instances – These are virtual computing environments that can be configured and customized with various CPU, memory, storage, and networking capacity, depending on application needs. Preconfigured templates for instances are known as Amazon Machine Images (AMIs) and they package bits needed for the server (including operating system and software). There are three types of instances: Spot Instances, On-Demand instances, or Reserved instances.
    • On-Demand instances – Instances that are paid for when needed, by the second, and can be utilized through capacity reservations. These are typically used for virtual workstations, and continuous workloads, like a license server.
    • Reserved instances – Instances that have been purchased for a certain duration, usually one to three years, and at a significant discount.
    • Spot Instances – Amazon EC2 Spot Instances let users take advantage of unused EC2 capacity in the AWS cloud. Spot Instances are available at up to a 90 percent discount compared to On-Demand prices. These are ideal for rendering due to the spiky nature of VFX rendering demands.

Unmatched Resources

The ability to scale, quickly and substantially, is a key benefit of using the cloud, and AWS capacity is more than the next ten competitors combined. AWS currently supports 66 Availability Zones within 21 Geographic Regions around the world, with announced plans for four more regions. This means AWS offers data durability and significant spare capacity that can be used via Spot Instances, resources that cost up to 90 percent less than On-Demand instances, and due to high availability, nearly all jobs rendered using Spot Instances are completed without interruption. If a frame fails for any reason, AWS Thinkbox’s Deadline compute management software will automatically reassign that frame to another render node.

The benefit of having access to the depth and computing power of the AWS cloud within the VFX marketplace is clear. Studios heavily consider the capacity of VFX facilities when determining how to award work, especially if the bid requires particularly challenging or ambitious visuals. When AWS cloud utilization has been pre-wired, those resources can be leveraged within minutes; this is in stark contrast to the weeks or even months it can take to procure and set up physical workstations and render nodes.

Security

The AWS infrastructure is built to satisfy the requirements of the most security-sensitive organizations, including the United States government. This VFX industry-specific document, “Studio Security Controls for VFX/Rendering,” was generated by Independent Security Evaluators (ISE) and outlines secure workflow requirements on AWS. Separately, AWS Thinkbox has also developed a ‘Quick Start for VFX Burst Rendering’ template that can spawn a VFX studio infrastructure on AWS that achieves the security requirements of MPAA and Disney/Marvel, and an accompanying document listing the best practices as a checklist.

Security in the AWS cloud is recognized as better than on premises. Broad security certification and accreditation, data encryption at rest and in transit, hardware security modules, and strong physical security all contribute to a more secure way for VFX studios to manage IT infrastructure.

Cost Efficiency

When the types of instances used in an Amazon EC2 fleet are varied, resource pools are spread out, and that makes them stronger as a whole and less likely to be disrupted. Keeping track of resource usage can be done via custom tagging and the AWS Billing Console. With custom tagging, VFX facilities are able to track specific cataloged data, such as all shots for a particular sequence, episode, series, etc. Usage thresholds can be set to trigger notifications or actions, and by reviewing historical usage data, facilities can better determine activity trends and estimate future resource demands and budget projected cloud usage at the bidding stage.

For more info on leveraging the AWS cloud for VFX, drop the AWS Thinkbox team a line at: [email protected].

from AWS Media Blog

Improving mission video system efficiency for government operations

The Enterprise Challenge exercise is an annual joint and coalition intelligence, surveillance and reconnaissance interoperability demonstration. You can read our preview of the event here.

In May at the annual Under Secretary of Defense for Intelligence USD(I)-sponsored Enterprise Challenge exercise, Amazon Web Services (AWS) demonstrated how government organizations can combine AWS and AWS Elemental services with traditional video infrastructure to improve video processing at the tactical edge.

During the event, attendees shared insights about the challenges associated with their mission video systems. To address those concerns, we’ve detailed a three-point strategy for implementing a hybrid workflow to enhance video performance, quality and resiliency in disconnected, intermittent, and limited-bandwidth environments.

1. Embrace Hybrid Full Motion Video (FMV) and Streaming

Until recently, most government video systems largely mirrored legacy commercial broadcast and IPTV designs – an HD camera “sensor system” connected to an AVC or HEVC encoder, compressing video, encapsulating using traditional MPEG-2 transport streams, and transmitting over a managed IP network. These flows involved laborious IP multicasting protocols that required comprehensive levels of management to strategically distribute content from source to destination.

Evolutions in media and entertainment (M&E) streaming technologies and the cloud offer government organizations new tools to implement hybrid cloud and on-premises architectures that support broadcast-quality video for enterprise and tactical operations. The architecture pulls from familiar commercial contribution techniques using MPEG-2 based transport streams, combined with ground or cloud-based Adaptive Bit Rate (ABR) video processing, aimed to improve stream reliability, quality, and accessibility across the enterprise. The best design practices for implementing a hybrid FMV streaming architecture are:

Contribution Techniques

Currently, MPEG-2 transport streams are still preferable; they provide predictable real-time performance from existing sensor systems for contribution quality from Air-to-Ground, Air-to-Cloud, or Ground-to-Cloud scenarios. To improve picture quality and reduce bitrate, HEVC/H.265 video compression should be considered as the video contribution format. Recommended by the Motion Imagery Standards Profile (MISP), HEVC is the successor to AVC/H.264 and provides a 2-to-1 compression advantage, enabling higher image quality for the same bitrate, or similar image quality at half the bitrate.

Figure 1 – EC19 experimental aircraft equipped with an AWS Elemental Live HEVC/H.265 contribution encoder.

Distribution Techniques

OTT streaming formats such as HTTP Live Streaming (HLS) or Dynamic Adaptive Streaming over HTTP (DASH) are superior for FMV stream distribution. They offer ABR profiles allowing the same live stream to be encoded at single or multiple renditions, ranging from high-quality, higher bitrate to low-quality, lower bitrate, depending upon the user’s device capabilities and network bandwidth. ABR in particular is a form of web streaming that allows users to stream live or video-on-demand content within a native browser and without dependencies on external thick client applications. Further, web environments consuming ABR content can automatically and dynamically adjust to the best stream quality/rendition for playback based on the device capabilities and available network bandwidth.

Figure 2 – EC19 Ground Operations Center generating live ABR FMV streaming.

2. Improve the Quality of Experience

Live video contribution over RF transmission systems is vulnerable to packet loss if not provisioned for error recovery. Dropped packets equate to data loss, which ultimately degrades the quality of experience (QoE) at the client-side decoder. A poor QoE can be noticeable in the form of video pixelization, choppy playback, artifacts, or freeze frames. While wireless RF transmission systems can be rapidly deployed in tactical environments, they can exhibit unpredictable behaviors when pushed to their limits due to bandwidth limitations and uncontrollable environmental conditions. Furthermore, within an RF mesh network, where multiple transmission paths exist, late and/or out-of-order IP packets may occur, compounding issues with live streams. Even wired networks can produce some degree of error, though these tend to be more predictable. Here are a few considerations to protect FMV traffic when up against unpredictable and predictable network conditions:

Wireless contribution from Air-to-Ground or Air-to-Cloud

By nature of the protocol, UDP/IP flows are unreliable, yet heavily deployed in mission systems today. Toggling these flows to Real-time Transport Protocol (RTP) can greatly improve stream resiliency. RTP was designed for real-time transfer of audio/video information over IP networks. It includes several techniques that can better facilitate media-based delivery systems including timestamps, sequence numbers, and payload types, which can be helpful in addressing common issues such as late or out-of-order packets. For more information about how the protocol operates, reference IETF RFC 3550, a transport protocol for real-time applications or MISB ST 0804.4, a real-time protocol for motion imagery and metadata.

In addition to the general packet structure enhancements found in RTP, the protocol also allows the use of optional Forward Error Correction (FEC), sending parity packets alongside the payload that allow media decoders to reassemble lost or dropped packets on the fly. Some common RTP FEC schemes use a two-dimensional matrix organized into columns and rows based on Exclusive-OR (XOR) operations. While Column FEC alone is a single dimension (1D), the use of columns and rows together is referred to as two dimensions (2D). Generally, Column FEC provides good correction for consecutive or burst packet loss, while Row FEC provides good correction of non-consecutive, random packet loss. When combined, Column and Row FEC provide a robust error protection scheme capable of correcting both types of packet loss experienced in a network.

There are several well-documented schemes used by M&E experts, typically ranging from 10%-20% total payload overhead, yet offering different levels of protection. These schemes may be further reviewed in SMPTE ST 2022-1-2007, Forward Error Correction for Real-Time Video/Audio Transport Over IP Networks. Note that schemes differ in the overhead required and the latency incurred to achieve the recovery capability they provide. For the FEC schemes to operate properly, it is crucial that the packet error rate is equal to or less than what the implemented protection scheme can correct. For application firewalls and network access lists, remember that FEC payloads are commonly carried on Media Port n+2 and Media Port n+4 for columns and rows, respectively.

Figure 3. Example of an AWS Elemental Live Encoder using RTP with FEC with contribution settings applied.
Media = Port 50010
Column FEC = Port 50012
Row FEC = Port 50014

There are many ways to measure the general health and performance of a network to calculate for things like adding Forward Error Correction to your stream. One practice used by many solution architects and network engineers is the basic “ping” test, which uses a combination of ICMP control messages (Echo Request and Echo Reply) to measure specific conditions such as packet loss, latency, and throughput between a source and destination. Ping is a common command line utility found on most operating systems and when used appropriately can provide good metrics about network conditions. Ping is considered a network diagnostic tool, so be sure to consult with a network administrator before use.
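For example, a quick baseline measurement might look like the following (the host, packet count and payload size are placeholder values):

# 100 echo requests with a payload size close to a typical video packet
ping -c 100 -s 1400 192.0.2.10

The summary that ping prints includes the packet loss percentage and round-trip times, which can then be weighed against the FEC overhead you plan to provision.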

Other forms of contribution encoding, such as ground-to-cloud, may be better off using more advanced packet protection methods that can guarantee media delivery. During Enterprise Challenge, for instance, the ground-based system terminated the aircraft RTP-FEC flow, then re-encapsulated it as a ground-to-cloud flow using reliable transport technology from Zixi.

The Zixi transport protocol provides added network resiliency through a combination of applied FEC and ARQ (Automatic Repeat Request) for advanced error-recovery. It is offered on the AWS Elemental MediaConnect service using ground and cloud-based link components for secure, reliable live video transport. Under this mode of operation, a provisioned latency value drives performance to build a sufficient error-recovery window. The AWS Elemental MediaConnect link should be configured to utilize approximately three times the round trip time between Ground and Cloud, although any relative provisioned latency may be set by the user. This is an excellent way to trade off minimal link latency in exchange for a more resilient contribution delivery system.

Distribution from Ground-to-User or Cloud-to-User

We’ve explored several ways to get live FMV from air-to-ground and ground-to-cloud, but what about serving the end user reliably? Unlike legacy architectures that used UDP multicast to disseminate flows to users, streaming technologies leverage the TCP/IP stack, providing a reliable, connection-oriented flow and guaranteed packet delivery all the way to the user. This occurs by segmenting the UDP/IP flow into a series of smaller files, generally only a few seconds in length, and distributing them to client-side decoders based on broadly adopted HTTP techniques. Essentially, segments are downloaded by the client-side player and precisely stitched together one after another, creating a consistent, smooth playback experience. If the client player takes too long to download a given segment or file, it automatically seeks a lower bandwidth rendition based on a master playlist or manifest file. In turn, this file tells the client player which other renditions of the content exist and where to download them for faster playback. This is fundamentally how ABR streaming protocols work.
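As a simple illustration, an HLS master playlist advertising three renditions of the same live stream might look like the following sketch (the bitrates, resolutions and paths are examples only):

#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2"
1080p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720,CODECS="avc1.4d401f,mp4a.40.2"
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.42c01e,mp4a.40.2"
360p/index.m3u8

A client that cannot sustain the top rendition simply starts requesting segments from one of the lower ones listed here.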

3. Consider New Ways to Carry Key-Length-Value (KLV) Metadata over ABR

While content may be king, in tactical missions metadata is queen. Video provides situational awareness, but it doesn’t offer insight as to where activity took place, which is where metadata comes into play. Full Motion Video streams generally contain associated geospatial metadata found in a data encoding standard known as KLV (Key-Length-Value) specified in SMPTE ST 336M Data Encoding Protocol Using Key-Length-Value. The underlying metadata, outlined in MISB ST 0601 UAS Datalink Local Set, contains specific location information about the video from the camera sensor or aircraft perspective (e.g. latitude, longitude, altitude, etc.).

FMV streams are generally constructed as programs within an MPEG-2 transport stream, with each program containing all of the information needed by the decoder application to make up a channel. In any given FMV channel, the program would consist of an elementary stream for video and an elementary stream for KLV. Transport streams have proven to be well suited for carrying real-time programs or channels over IP networks; they also provide important timing information, allowing the decoder applications to reassemble the elementary streams in a time-synchronized fashion for viewer playback.

In a hybrid FMV streaming approach, the same techniques would be applied on the contribution side (e.g. air-to-ground, ground-to-cloud). However, from a distribution standpoint FMV streams are transformed from traditional MPEG-2 transport stream encapsulation to newer adaptive streaming formats such as MPEG-DASH and HLS, utilizing the ISOBMFF or fragmented mp4 (fMP4) variant as the preferred transport container type. Since these streaming protocols were originally designed to serve M&E content over the internet (such as movies and TV), there was not a well-documented methodology to carry KLV information in the consumer market like there was with MPEG-2 transport streams. However, if KLV metadata were filtered out or lost in translation, FMV content would be less useful to an analyst, providing the most basic situational awareness without the descriptive geospatial data required for processing, exploitation, and dissemination workflows.

To overcome this obstacle and preserve the mission-oriented KLV metadata in an FMV streaming environment, AWS Elemental implemented carriage of KLV in the fMP4 containers that underpin many of the mainstream ABR formats in use today.

Currently under evaluation by the Motion Imagery Standards Board and greater Intelligence Community, an implementation of KLV using MPEG-DASH was demonstrated and proposed to drive new standardization practices for FMV streaming within the DoD/IC. Several industry partners have already integrated the proposed method within Processing, Exploitation, and Dissemination (PED) applications, decoder clients, and test and measurement tools with demonstrated success. KLV in DASH is enabling a new era of PED processing entirely from a web-browser by leveraging the latest ABR streaming protocols combined with broadly adopted KLV metadata standards used by the community.

Mission video systems are undergoing a dramatic evolution, as OTT and streaming technology developments open up new opportunities for government agencies to access, review, and analyze video from the field like never before. Taking the above considerations into account, government organizations can greatly improve FMV services feeding the enterprise and serving disadvantaged users, while maximizing network efficiencies across their Areas of Operation (AOR).

Figure 4 – Shows an OV-1 of a hybrid FMV streaming architecture with reliable contribution and distribution techniques.

 

from AWS Media Blog

Search inside videos using AWS Media and AI/ML Services

In today’s content-driven world, it’s difficult to search for relevant content without losing productivity. Content always requires metadata to give it context and make it searchable. Tagging is a means to classify content to make it structured, indexed, and—ultimately—useful. Much time is spent on managing content and tagging it.

Manually assigning content tags is possible when you produce little new content or only need coarse searchability. However, manual tagging becomes impractical when you are creating lots of new content.

Content gets generated in various formats, such as text, audio, and video. A text-based search engine ranks relevancy based on tags or text inside the content. However, searches on audiovisual (AV) files are based only on the tags or the associated text, and not on what is being said during playback.

To unfold the true power of AV content, we must use audio dialogue to provide a more granular level of metadata for the content—and make it searchable.

With an AV search solution, you should be able to perform the following tasks:

  1. Tag the dialogue.
  2. Make your AV contents searchable based on the content tagging.
  3. Jump directly to the position in the content where the searched keyword was used.

In this post, I describe how to search within AV files using the following AWS media and AI/ML services:

  • Amazon Transcribe: An automatic speech recognition (ASR) service that makes it easy for you to add speech-to-text capability to your applications.
  • Amazon Elastic Transcoder: Media transcoding in the cloud, designed to be a highly scalable, easy-to-use, and cost-effective way for you to convert (or “transcode”) media files from a source format into versions that play back on devices like smartphones, tablets, and PCs.
  • Amazon CloudSearch: A managed service in the AWS Cloud that makes it simple and cost-effective to set up, manage, and scale a search solution for your website or application.

Other AWS Services:

  • AWS Lambda: Lets you run code without provisioning or managing servers. You pay only for the compute time that you consume.
  • Amazon API Gateway: A fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale.
  • Amazon S3: An object storage service that offers industry-leading scalability, data availability, security, and performance.

Solution overview

This solution gets created in two parts:

  • Ingesting the AV file
    • Uploading the MP4 file to an S3 bucket.
    • Transcoding the media file into audio format.
    • Transcribing the audio file into text.
    • Indexing the contents into Amazon CloudSearch.
    • Testing the index in CloudSearch.
  • Searching for content
    • Creating a simple HTML user interface for querying content.
    • Listing the results and rendering them from an S3 bucket using CloudFront.

Prerequisites

To walk through this solution, you need an S3 bucket fronted by an Amazon CloudFront distribution, with a bucket policy like the following that grants the CloudFront origin access identity read access to the bucket contents:

{
    "Version": "2008-10-17",
    "Id": "PolicyForCloudFrontPrivateContent",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity xxxxxxxxxxx"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<<bucketname>>/*"
        }
    ]
}

Step 1: Uploading a video to the S3 bucket

Upload a video file in MP4 format into the /inputvideo folder.
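For example, with the AWS CLI (the bucket name and file are placeholders):

aws s3 cp ./sample-video.mp4 s3://<<bucketname>>/inputvideo/sample-video.mp4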

Step 2: Transcoding the video into audio format

  1. Create a Lambda function (VideoSearchDemo-Transcoder) with the runtime of your choice.
  2. Associate the Lambda execution role with the function to access the S3 bucket and Amazon CloudWatch Logs.
  3. Add an S3 trigger to the Lambda function on all ObjectCreated events in the /inputvideo folder. For more information, see Event Source Mapping for AWS Services.

Use the following Node.js 6.10 Lambda code to initiate the Elastic Transcoder job and put the transcoded audio file to the /inputaudio folder.

'use strict';
var aws = require("aws-sdk");
var s3 = new aws.S3();

var eltr = new aws.ElasticTranscoder({
 region: "us-east-1"
});

exports.handler = (event, context, callback) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    
    // Bucket and key of the uploaded MP4 that triggered this invocation
    var bucket = event.Records[0].s3.bucket.name;
    var key = event.Records[0].s3.object.key;

    var pipelineId = "ElasticTranscoder Pipeline ID";
    var audioLocation = "inputaudio";

    // Strip the folder prefix and file extension so the output key matches the input file name
    var newKey = key.split('.')[0];
    var str = newKey.lastIndexOf("/");
    newKey = newKey.substring(str+1);

     var params = {
      PipelineId: pipelineId,
      Input: {
       Key: key,
       FrameRate: "auto",
       Resolution: "auto",
       AspectRatio: "auto",
       Interlaced: "auto",
       Container: "auto"
      },
      Outputs: [
       {
        Key:  audioLocation+'/'+ newKey +".mp3",
        PresetId: "1351620000001-300010"  //mp3 320
       }
      ]
     };
    
     eltr.createJob(params, function(err, data){
      if (err){
       console.log('Received event:Error = ',err);
      } else {
       console.log('Received event:Success =',data);
      }
    });
};

Step 3: Transcribing the audio file into JSON format

Add another S3 trigger to the Lambda function on all ObjectCreated events for the /inputaudio folder, which invokes the transcription job. Fill in values for the inputaudiolocation and outputbucket.

'use strict';
var aws = require('aws-sdk');
var s3 = new aws.S3();
var transcribeservice = new aws.TranscribeService({apiVersion: '2017-10-26'});

exports.handler = (event, context, callback) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    var bucket = event.Records[0].s3.bucket.name;
    var key = event.Records[0].s3.object.key;    
    var newKey = key.split('.')[0];
    var str = newKey.lastIndexOf("/");
    newKey = newKey.substring(str+1);
    
    var inputaudiolocation = "https://s3.amazonaws.com/<<bucket name>>/<<input file location>>/";
    var mp3URL = inputaudiolocation+newKey+".mp3";
    var outputbucket = "<<bucket name>>";
    var params = {
        LanguageCode: "en-US", /* required */
        Media: { /* required */
          MediaFileUri: mp3URL
        },
        MediaFormat: "mp3", /* required */
        TranscriptionJobName: newKey, /* required */
        MediaSampleRateHertz: 44100,
        OutputBucketName: outputbucket
      };
      transcribeservice.startTranscriptionJob(params, function(err, data){
      if (err){
       console.log('Received event:Error = ',err);
      } else {
       console.log('Received event:Success = ',data);
      }
     });
};

The following is a transcribed JSON file example:

{
   "jobName":"JOB ID",
   "accountId":" AWS account Id",
   "results":{
      "transcripts":[
         {
            "transcript":"In nineteen ninety four, young Wall Street hotshot named Jeff Bezos was at a crossroads in his career"
         }
      ],
      "items":[
         {
            "start_time":"0.04",
            "end_time":"0.17",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"In"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"0.17",
            "end_time":"0.5",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"nineteen"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"0.5",
            "end_time":"0.71",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"ninety"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"0.71",
            "end_time":"1.11",
            "alternatives":[
               {
                  "confidence":"0.7977",
                  "content":"four"
               }
            ],
            "type":"pronunciation"
         },
         {
            "alternatives":[
               {
                  "confidence":null,
                  "content":","
               }
            ],
            "type":"punctuation"
         },
         {
            "start_time":"1.18",
            "end_time":"1.59",
            "alternatives":[
               {
                  "confidence":"0.9891",
                  "content":"young"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"1.59",
            "end_time":"1.78",
            "alternatives":[
               {
                  "confidence":"0.8882",
                  "content":"Wall"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"1.78",
            "end_time":"2.01",
            "alternatives":[
               {
                  "confidence":"0.8725",
                  "content":"Street"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"2.01",
            "end_time":"2.51",
            "alternatives":[
               {
                  "confidence":"0.9756",
                  "content":"hotshot"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"2.51",
            "end_time":"2.75",
            "alternatives":[
               {
                  "confidence":"0.9972",
                  "content":"named"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"2.76",
            "end_time":"3.07",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"Jeff"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"3.08",
            "end_time":"3.56",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"Bezos"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"3.57",
            "end_time":"3.75",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"was"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"3.75",
            "end_time":"3.83",
            "alternatives":[
               {
                  "confidence":"0.9926",
                  "content":"at"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"3.83",
            "end_time":"3.88",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"a"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"3.88",
            "end_time":"4.53",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"crossroads"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"4.53",
            "end_time":"4.6",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"in"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"4.6",
            "end_time":"4.75",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"his"
               }
            ],
            "type":"pronunciation"
         },
         {
            "start_time":"4.75",
            "end_time":"5.35",
            "alternatives":[
               {
                  "confidence":"1.0000",
                  "content":"career"
               }
            ],
            "type":"pronunciation"
         }
         "status":"COMPLETED"
      }

Step 4: Indexing the contents into Amazon CloudSearch

The outputbucket value from the previous step is the S3 location where the transcription job writes its JSON output. Use that value to ingest the JSON output into the Amazon CloudSearch domain. Add another S3 trigger for the Lambda function on all ObjectCreated events for that folder.

'use strict';
var aws = require("aws-sdk");
const https = require("https");
var s3 = new aws.S3();

exports.handler = (event, context, callback) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    var cloudsearchdomain = new aws.CloudSearchDomain({endpoint: '<<Mention CloudSearch End Point>>'});

    var bucket = event.Records[0].s3.bucket.name;
    var key = event.Records[0].s3.object.key;
    var newKey = key.split('.')[0];
    var str = newKey.lastIndexOf("/");
    newKey = newKey.substring(str+1);

    var params = {
        Bucket: bucket,
        Key: key
    }
    
    s3.getObject(params, function (err, data) {
    if (err) {
        console.log(err);
    } else {
        console.log(data.Body.toString()); //this will log data to console
        var body = JSON.parse(data.Body.toString());
        
        var indexerDataStart = '[';
        var indexerData = '';
        var indexerDataEnd = ']';
        var undef;
        var fileEndPoint = "http://<<CloudFront End Point URL>>/inputvideo/"+newKey+".mp4";

        for(var i = 0; i < body.results.items.length; i++) {

            if (body.results.items[i].start_time != undef &&
                body.results.items[i].end_time != undef &&
                body.results.items[i].alternatives[0].confidence != undef &&
                body.results.items[i].alternatives[0].content != undef &&
                body.results.items[i].type != undef &&
                fileEndPoint != undef
            ) {
                if (i !=0){
                    indexerData = indexerData + ',';
                }
                indexerData = indexerData + '{\"type\": \"add\",';
                indexerData = indexerData + '\"id\":\"'+i+'\",';
                indexerData = indexerData + '\"fields\": {';
                    indexerData = indexerData + '\"start_time\":'+'\"'+body.results.items[i].start_time+'\"'+',';
                    indexerData = indexerData + '\"end_time\":'+'\"'+body.results.items[i].end_time+'\"'+',';
                    indexerData = indexerData + '\"confidence\":'+'\"'+body.results.items[i].alternatives[0].confidence+'\"'+',';
                    indexerData = indexerData + '\"content\":'+'\"'+body.results.items[i].alternatives[0].content+'\"'+',';
                    indexerData = indexerData + '\"type\":'+'\"'+body.results.items[i].type+'\"'+',';
                    indexerData = indexerData + '\"url\":'+'\"'+fileEndPoint+'\"';
                indexerData = indexerData + '}}';
            }
          }
        var csparams = {contentType: 'application/json', documents : (indexerDataStart+indexerData+indexerDataEnd) };

        cloudsearchdomain.uploadDocuments(csparams, function(err, data) {
            if(err) {
                console.log('Error uploading documents to cloudsearch', err, err.stack);
            } else {
                console.log("Uploaded Documents to cloud search successfully!");
            }
        });
    }
})
};

Step 5: Testing the index in CloudSearch

In the CloudSearch console, on your domain dashboard, validate that the contents are indexed in the CloudSearch domain. The Searchable documents field should show a non-zero number.

Run a test search on the CloudSearch domain, using a keyword that you know is in the video. You should see the intended results, as shown in the test search screenshot.
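You can also issue a query directly against the domain’s search endpoint; the endpoint below is a placeholder, and “bezos” is a word known to be in the sample transcript:

https://search-<<your-domain>>-xxxxxxxxxx.us-east-1.cloudsearch.amazonaws.com/2013-01-01/search?q=bezos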

Step 6: Querying contents with a simple HTML user interface

Now it’s time to search for the content keywords. In API Gateway, create an API to query on the CloudSearch domain using the following Lambda code:

var aws = require('aws-sdk');

var csd = new aws.CloudSearchDomain({
  endpoint: '<<CloudSearch Endpoint>>',
  apiVersion: '2013-01-01'
});

exports.handler = (event, context, callback) => {
    var query;
    if (event.queryStringParameters !== null && event.queryStringParameters !== undefined) {
        if (event.queryStringParameters.query !== undefined &&
            event.queryStringParameters.query !== null &&
            event.queryStringParameters.query !== "") {
            console.log("Received name: " + event.queryStringParameters.query);
            query = event.queryStringParameters.query;
        }
    }
    console.log('Received event: EventName= ', query);

    var params = {
          query: query /* required */
    };

     csd.search(params, function (err, data) {
        if (err) {console.log(err, err.stack);callback(null, null);} // an error occurred
        else     {
            var response = {
                "statusCode": 200,
                "body": JSON.stringify(data['hits']['hit']),
                "headers": { "Access-Control-Allow-Origin": "*", "Content-Type": "application/json" }
            };
            callback(null, response);
        }          // successful response
    });
};

Deploy and test the API in API Gateway. The following screenshot shows an example execution diagram.

Host a static HTML user interface in S3 with public access under the /static folder. Use the following code to build the HTML file, replacing the value for api_gateway_url with your API Gateway endpoint URL.

When you test the page, enter a keyword in the search box and choose Search. The CloudSearch API is called.

<!DOCTYPE html>
<html lang='en'>
  <head>
    <meta charset='utf-8'>
    <meta http-equiv='X-UA-Compatible' content='IE=edge'>
    <meta name='viewport' content='width=device-width, initial-scale=1'>
    <title>CloudSearch - Contents</title>
    <link rel='stylesheet' href='https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css' integrity='sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u' crossorigin='anonymous'>
  </head>
  <body>
    <div class='container'>
        <h1>Video Search - Contents</h1>
        <p>This is a video search capability built with CloudSearch, Lambda, API Gateway, and static web hosting.</p>

        <div class="form-group">
          <label for="query">Search String:</label>
          <input type="text" class="form-control" id="query">
          <button id="search">Search</button>

        </div>

        <div class='table-responsive'>
            <table class='table table-striped' style='display: none'>
                <tr>
                    <th>Content</th>
                    <th>Confidence</th>
                    <th>Start Time</th>
                    <th>Video</th>
                </tr>
            </table>
        </div>
    </div>
    <script src='https://ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js'></script>
    <script>

      $('#search').click(function() {
      $('table.table').empty();
        var query = $( "#query" ).val();
        if (!query) {
          alert('Please enter search string');
          return false;
        }
        var api_gateway_url = 'https://<<Mention API Endpoint>>/prod/?query='+query;
        var rows = [];
        $.get( api_gateway_url, function( data ) {
          if (data.length>0){
            rows.push(`                <tr> \
                                  <th>Content</th> \
                                  <th>Confidence</th> \
                                  <th>Start Time</th> \
                                  <th>Video</th> \
                              </tr>`);

            data.forEach(function(item) {
                console.log('Search hit:', item);
                // Build the media-fragment URL so the video starts at the keyword's timestamp
                var start = item['fields']['start_time'];
                var source = item['fields']['url']+"#t="+start;
                rows.push(`<tr> \
                    <td>${item['fields']['content']}</td> \
                    <td>${item['fields']['confidence']}</td> \
                    <td>${item['fields']['start_time']}</td> \
                    <td><video controls width="320" height="176"><source src="${source}" type="video/mp4"></video></td> \
                </tr>`);
            });
            // Show the now-filled results table (join with an empty string to avoid stray commas)
            $('table.table').append(rows.join('')).show();
         }

        });

    });
    </script>
  </body>
</html>

Step 7: Listing the resulting videos and rendering them from the S3 location using CloudFront

The results page should look something like the following:

The query response JSON contains a link to the CloudFront-hosted video and the position in the video where the keyword was mentioned. The following snippet, already included in the static HTML code earlier, builds a media-fragment URL and uses an HTML video tag to start playback from the point where the keyword was mentioned.

var source = item['fields']['url']+"#t="+start;
<td><video controls width="320" height="176"><source src="${source}" type="video/mp4"></video></td> \
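
For example, if a keyword occurs at a start_time of 35.2 seconds, the constructed source URL would look something like the following (the CloudFront domain and file name are placeholders); browsers that support media fragments begin playback at that offset:

https://d111111abcdef8.cloudfront.net/transcoded/video.mp4#t=35.2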

Cleanup

To avoid incurring future costs after you’re done with this walkthrough, delete the resources that you created:

For AWS services and resources that invoke your Lambda function directly, first delete the trigger in the service where you originally configured it. For this example, delete the Lambda function trigger in API Gateway. Then, delete the Lambda function in the Lambda console. Also delete the other resources created for this walkthrough, such as the CloudSearch domain, the Elastic Transcoder pipeline, the CloudFront distribution, and the S3 buckets.

Conclusion

This post showed you how to build an audio/video (AV) search solution using various AWS media and AI/ML services such as Amazon Transcribe, Amazon Elastic Transcoder, Amazon CloudSearch, AWS Lambda, Amazon S3, and Amazon CloudFront.

You can easily integrate the solution with any existing search functionality, giving your viewers the ability to search within AV files and quickly find the relevant audio or video content.

Some representative use cases for this solution could include the following:

  • EdTech content search. The education tech industry publishes the majority of its content in video format for schools, colleges, and competitive exams. Tagging each file is a time-consuming task and may not even be comprehensive. This video search functionality could improve the productivity of both content authors and end users.
  • Customer service and sentiment analysis. The most common use of media analytics is to mine customer sentiment to support marketing and customer service activities. Customer service call recordings give great insight into customer sentiment. This solution can improve the productivity of call center agents by giving them the ability to search historical audio content and learn how to provide a better customer experience.

If you have any comments or questions about this post, please let us know in the comments.

from AWS Media Blog

Broadcasting TV channels using the Veset Nimbus playout platform on AWS

Broadcasting TV channels using the Veset Nimbus playout platform on AWS

Co-Authored by Zavisa Bjelorgrlic, AWS M&E Partner SA, Igor Kroll, CEO and Gatis Gailis, CTO & Founder from Veset. The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

Many video content owners use the Veset Nimbus playout platform to originate linear TV channels from AWS. This solution requires no upfront investment or long-term commitment. It is a fast and economically viable option to create and manage linear TV channels.

With Veset Nimbus, a channel can be launched, modified, or shut down more quickly than with a traditional broadcasting platform. It can also be delivered to different distribution platforms—both OTT and traditional. This post describes the benefits of using this broadcasting platform.

Overview

Veset Nimbus is a self-service SaaS platform, accessed through a monthly subscription and managed through a web interface. To sign up, you must have an AWS account. You pay a fixed monthly fee per channel to Veset and are charged for the actual use of the AWS resources consumed.

The platform automatically provisions AWS resources, such as Amazon S3, Amazon S3 Glacier, and Amazon EC2 instances, and does not require any additional engineering involvement on your part.

Over 65 linear TV channels operate on the platform. The majority of the channels are in high definition (HD), some are in standard definition (SD), and a few are in 4K Ultra HD (UHD). There are 21 customers in 15 countries, including Fashion TV, Ginx TV, XITE, In The Box, ProSiebenSat.1 Puls 4’s Zappn TV, A1 Telekom Austria, Siminn, GospelTruth TV, Levira, and YuppTV.

Channels are distributed through the following methods:

  • Traditional broadcast distribution using a multichannel video programming distributor (MVPD) such as satellite, terrestrial, cable, and IPTV headends.
  • Over the top (OTT) directly to end users, including YouTube, PlayStation Vue, Samsung, FuboTV, and ZAPPN TV.

Solution architecture

Veset Nimbus is a multi-tenant SaaS platform that is highly available and scales automatically to sustain the processing load. The system uses microservices running in Docker containers deployed in a Spotinst Ocean cluster on AWS.

The Spotinst platform allows the seamless use of EC2 Spot Instances, On-Demand Instances, and Reserved Instances, leading to significant cost savings. You can also use the platform to manage automatic scaling for different ingestion loads and to provide resilience by spreading the containers over different Availability Zones.

Veset Nimbus allows you to manage channels on two tiers:

  • An initial “broadcaster-level” tier with one content library stored in an Amazon S3 bucket and multiple channels created from this library.
  • A higher “service provider-level” tier, which can group multiple broadcaster-level organizations.

Media assets (both source files and transcoded versions) can optionally be moved from S3 to Amazon S3 Glacier for long-term archiving using lifecycle policies. The following diagram shows the Veset Nimbus solution architecture in the 1+1 Hot Standby configuration.
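
As a rough illustration of the archiving step mentioned above, the following sketch applies an S3 lifecycle rule that transitions objects to S3 Glacier after 90 days using the AWS SDK for JavaScript. The bucket name, prefix, and 90-day threshold are placeholders, not Veset's actual settings.

// Illustrative only: transition objects under a placeholder "library/" prefix
// to S3 Glacier after 90 days. All names and thresholds are assumptions.
var aws = require('aws-sdk');
var s3 = new aws.S3();

var params = {
    Bucket: 'example-content-library-bucket',
    LifecycleConfiguration: {
        Rules: [{
            ID: 'ArchiveToGlacier',
            Filter: { Prefix: 'library/' },
            Status: 'Enabled',
            Transitions: [{ Days: 90, StorageClass: 'GLACIER' }]
        }]
    }
};

s3.putBucketLifecycleConfiguration(params, function (err, data) {
    if (err) console.log('Error applying lifecycle configuration', err, err.stack);
    else console.log('Lifecycle configuration applied');
});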

The steps to create a channel are as follows:

  • Ingest content: Content is ingested from S3 buckets pre-loaded by content owners or fetched from customer-designated repositories (external S3 buckets, FTP, SFTP, and public clouds). Aspera and Signiant are used for accelerated delivery of large video contributions. The ingest process extracts metadata and copies the content to the library S3 bucket. Lifecycle policies can be activated to save copies of the original and processed inputs to Amazon S3 Glacier vaults as a long-term archive.
  • Input live content: Live stream inputs are connected directly to playout in several modes: UDP/RTP streams arriving through AWS Elemental MediaConnect, or Real-Time Messaging Protocol (RTMP) and HTTP Live Streaming (HLS) streams connected directly to playout. If the feeds contain SCTE-35 markers, those can be used for live switching or ad insertion.
  • Create a playlist: A playlist is a frame-precise schedule that defines primary events (video assets from S3 or live streams) and secondary events (broadcast graphics, SCTE-35 markers, EPG data) to be played out in the channel. Veset Nimbus incorporates an add-on module (Veset Altos) for long-term planning and scheduling. Veset Altos manages customer workflows to create daily playlists. You can also use playlists imported from third-party software.
  • Create a channel: The channel is created using the content library or live events and the playlist created in the previous steps. Different options can be configured, with SD, HD, or UHD resolution and outputs encoded in MPEG-2, AVC, or HEVC.
  • Add redundancy: The channel configuration offers different redundancy levels to match the availability and cost requirements:
    • Basic option with a single playout configuration.
    • 1+1 Warm Backup configuration with a backup playout in standby mode in a different AWS Availability Zone. It’s activated automatically by a monitoring alert, or switched manually during maintenance or software-update periods.
    • 1+1 Hot Backup with two synchronized active playouts running on separate EC2 instances in different Availability Zones.
    • (Optional) Cross-Region backup channel configured in a different AWS Region.

Veset Nimbus playout has two deployment options:

  • Full cloud—All components of the system reside in AWS.
  • Hybrid playout—Offers the flexibility to deploy main, backup, or both of the playout channels outside AWS.

Veset recommends using the full cloud deployment whenever possible, due to a higher level of redundancy and agility of cloud architecture. In some cases, however, you might opt for the hybrid playout deployment as a backup to cloud playout.

The channel’s playout components are deployed on EC2 instances or on-premises servers when you use hybrid playout deployment. The configuration depends on launch options for redundancy, resolution, and output profiles.

The content to play out is cached from S3 on an Amazon EBS volume or local storage (in the case of the hybrid playout deployment). Caching improves reliability and prevents playout delays if retrieving content from S3 takes longer than expected. Cached content is encoded and multiplexed with incoming live streams.

The playout enables the live application of graphic overlays, subtitling, and management of up to 8 audio tracks. The module inserts ad breaks with SCTE-35 triggers, and it generates EPG data. For more traditional and regulated linear distribution, the playout creates as-run logs and runs a compliance recording for each playout.

For increased redundancy, the channel’s primary and backup playouts can be distributed between different Availability Zones of the AWS Region. Every instance is monitored by Amazon CloudWatch and restarted automatically in case of failure. EC2 instances are selected on the basis of the video quality needed for optimizing the costs of operation. A typical setting uses different instance types with the Windows operating system:

  • SD video: c5.xlarge
  • HD video: c5.2xlarge
  • UHD video (25 fps), HEVC output: g3.4xlarge
  • UHD video (50 fps), AVC output: p3.2xlarge

Instance types are regularly reviewed for performance using CloudWatch to gather AWS and application logs. They are also re-evaluated against new instance types and families as they become available on AWS to improve operating costs whenever possible.

You deploy the playout automatically using SaltStack. The automatic deployment allows easy reconfiguration of the EC2 instance types to match your resolution and quality needs. It also allows deployment in multiple Availability Zones, simple activation on a different AWS Region, and configuration of all other details like subnets, security groups, network ACL, and CloudWatch monitoring.

In addition to CloudWatch, Veset Nimbus uses internally built monitoring systems to ensure that EC2 instances are available, and that the outputs from each of the playouts are aligned with predefined parameters. You can configure alerts of variable level of sensitivity for incidents, such as black or frozen video, jitter, or no audio.

Distribution

Veset Nimbus has multiple ways to deliver feeds originated by its playout to various designated distribution points:

  • The video stream created is sent over the Real-time Transport Protocol (RTP) to AWS Elemental MediaLive for real-time encoding. MediaLive is a broadcast-grade live video-processing service with built-in self-healing and high availability. The output of MediaLive can be sent to a CDN, such as Amazon CloudFront, for OTT distribution. If you require delivery to multichannel video programming distributor (MVPD) head-ends, then use AWS Elemental MediaConnect.
  • For just-in-time packaging and where you require MPEG dynamic adaptive streaming over HTTP (DASH) output, insert AWS Elemental MediaPackage in the flow immediately after the output from MediaLive.
  • All Veset Nimbus playouts have Zixi Broadcaster preinstalled and integrated into the management portal. Use Zixi Broadcaster to manage multiple incoming live feeds, create multiple outputs, do live recording, and transport outputs to one or several distribution destinations.

Conclusion

Running the Veset Nimbus playout platform on AWS provides you with an extremely fast and economically viable tool to originate linear TV channels. A channel can be launched, modified, or shut down in minutes—compared to months to set up a traditional broadcasting platform. It can be delivered in various flavors to different distribution platforms—both OTT and traditional. The combination of Veset Nimbus and AWS Managed Services like MediaLive, MediaPackage, and MediaConnect lets you solve distribution and monetization challenges in one seamlessly managed and transparently priced environment.

The high uptime and reliability demands of linear television, combined with increased cost efficiency and flexibility, are addressed through a robust solution architecture. The use of different EC2 instance types for different levels of video quality allows for optimization of operational costs. Using tools such as Spotinst to leverage a mix of EC2 instances brings further efficiency and resilience. Furthermore, by running playouts on instances in multiple Availability Zones, or even separate AWS Regions, Veset Nimbus offers a solution with the high availability required in broadcasting.

from AWS Media Blog

AWS Media Services Demo: Live Streaming with Automated Multi-Language Subtitling

AWS Media Services Demo: Live Streaming with Automated Multi-Language Subtitling

Until now, creating captions for a live event required stenographers to listen closely and convert speech to text. To make it more complex, for live content this has to be done in as close to real time as possible. If the captions were needed in additional languages, the text then had to be sent to a translator, which is almost impossible to do for live content. These steps are labor-intensive and expensive. To streamline this process, AWS introduced the Live Streaming with Automated Multi-Language Subtitling solution, which simplifies the labor-intensive processes of real-time transcription for caption creation and translation for multi-language subtitling.

Live Streaming with Automated Multi-Language Subtitling allows you to capture and transcribe your content to make it more accessible at a lower cost, and to translate those captions to regionalize content into languages you couldn’t offer before, making it available across the globe.

Watch the five-minute video below to see the Live Streaming with Automated Multi-Language Subtitling solution in action. Once you’ve done that, head over to the Solutions page to use the solution out of the box, customize it to meet your specific use case, or work with AWS Partner Network (APN) Partners to implement an end-to-end subtitling workflow.

 

from AWS Media Blog

In the News: Fox Sports Uses AWS S3 to Enable Remote Production for FIFA Women’s World Cup

In the News: Fox Sports Uses AWS S3 to Enable Remote Production for FIFA Women’s World Cup

From Sports Video Group News, read how Fox Sports is using AWS along with APN Partners to power its coverage of the 2019 FIFA Women’s World Cup in France. With Amazon S3 at the core, Fox Sports’ Paris-based studio show handles the live production in conjunction with operations, graphics, and playout based out of Los Angeles.

You can learn more about how they used a similar workflow for the 2018 Russia World Cup in the video below:

from AWS Media Blog

AWS Thinkbox Partners Help Take Your Studio to the Cloud

AWS Thinkbox Partners Help Take Your Studio to the Cloud

Written by Chris Bober, Sr Partner Manager – Thinkbox


AWS Thinkbox Partners: Autodesk, Chaos Group, Houdini, Isotropix, OTOY, ftrack, NIM, BeBop, Teradici, Sohonet, WekaIO, Qumulo

Today we’re announcing the new AWS Thinkbox partner site which highlights AWS Thinkbox and APN partners that can help accelerate your journey to a hybrid or even a full content production pipeline in the cloud. The partners on this page are uniquely positioned to help your studio meet the demands of rapid and efficient content production by leveraging the power of AWS.

These partners are dedicated to helping creative studios focused on computer graphics (CG) or content editing work in a secure, scalable, and repeatable way that not only helps scale studio infrastructure, but creativity as well. Each partner offers solutions for elastic rendering, virtual workstations, studio management, and asset management to power all aspects of your cloud studio, enabling you to quickly adapt to changing requirements and to scale your studio based on increased or decreased demand. With AWS Thinkbox and our partner solutions, you can optimize your costs in ways not possible with dated, monolithic, on-premises infrastructure.

To learn more about our partner solutions begin by exploring https://www.thinkboxsoftware.com/partners.

To learn how AWS Thinkbox is enabling the Studio in the Cloud and how that fits into other Media & Entertainment services at AWS, check out the Content Production section of the Media Value Chain eBook.

from AWS Media Blog

Vroom! NASCAR Puts Fans in the Race with AWS Media Services

Vroom! NASCAR Puts Fans in the Race with AWS Media Services

Just about every weekend between February and November, NASCAR thrills U.S. motorsports enthusiasts with some of the highest-horsepower racing action around. The sights, sounds, smells, and experiences of U.S. stock car racing are unlike any other: race cars reaching nearly 200 miles per hour (322 km/h); engines revving past 9,000 rpm; exceeding three G-forces in the turns. Elite drivers’ skills and emotions—and the crew chief geniuses making late-pit-stop strategy decisions to keep the professional drivers on course to win. Pit crews who move with the speed and precision of professional athletes. And the millions of passionate race fans cheering on their favorite drivers and crews at the track and on an ever-increasing variety of smart devices and mobile phones.

Over the past 70-plus years, NASCAR has become the most popular motorsport in the U.S. NASCAR also counts racing enthusiasts abroad among its fan base, and its events are broadcast in more than 185 countries and territories. Before 1979, fans watching race day action on television had to content themselves with 15- to 30-minute video highlight packages on sports programs from the big U.S. networks. In 1979, the entire Daytona 500 was broadcast live for the first time ever – and post-race fisticuffs attracted passionate fans across the country. Continually finding new ways for an increasingly large audience of fans to engage with NASCAR action online and on mobile devices, the racing association introduced a new digital platform in 2015. Once again, NASCAR pulled ahead: in 2018, the racing circuit reported race-day digital content consumption up 30% year-over-year, with digital video views increasing nearly 50%. With fans demanding more immersive race experiences than ever, NASCAR is putting viewers inside the race cars. The NASCAR Drive mobile app lets subscribers slip into the race car alongside the driver to experience every curve of the track, easily access real-time race stats and predictions, and even gain deeper insight into NASCAR history.

DAYTONA BEACH, FL – FEBRUARY 21: Denny Hamlin, driver of the #11 FedEx Express Toyota, takes the checkered flag ahead of Martin Truex Jr., driver of the #78 Bass Pro Shops/Tracker Boats Toyota, to win the NASCAR Sprint Cup Series DAYTONA 500 at Daytona International Speedway on February 21, 2016 in Daytona Beach, Florida. (Photo by Chris Trotman/NASCAR via Getty Images)

Second-screen experiences are a critical component of consuming live NASCAR content for race fans. For example, the number of unique viewers live streaming FOX coverage of the Daytona 500 to their laptops, tablets, phones, and connected devices increased by more than 50% from 2018, while both FOX and NBC’s streaming numbers for NASCAR coverage have gone up four years in a row. “We give our fans a lot of ways to consume live NASCAR content, such as NASCAR Drive, a 360-camera feed, and communications between drivers and their crew chiefs. We’re also providing on-demand content 24/7 so that NASCAR is at fans’ fingertips when and where they want it,” said Steve Stum, Vice President of Operations and Technical Production, NASCAR.

NASCAR Digital uses AWS Media Services to power its NASCAR Drive mobile app, and to deliver broadcast-quality content for more than 80 million fans worldwide across a wide range of delivery formats and device platforms. AWS Media Services, including AWS Elemental MediaLive and AWS Elemental MediaStore, help NASCAR provide fans instant access to the driver’s view of the race track during races, augmented by audio and a continually updated leaderboard. To ensure the highest-quality video stream and lowest latency for fans who choose to experience race day on a second screen, NASCAR Digital uses these tools to package, process and store the video for delivery via Amazon CloudFront. NASCAR will also leverage AWS for cloud-based machine learning and artificial intelligence services to help streamline formerly labor-intensive processes, including Amazon Rekognition, Amazon Transcribe, and Amazon SageMaker.  This combination of AWS services helps NASCAR deliver more content, innovate new services more efficiently, and scale without compromising time and capital.

“Moving into the cloud has been our best decision yet, eliminating a lot of the friction and overhead associated with traditional production and delivery,” said Stum. “We look forward to continue using the technology to push the limits of what’s possible for our second-screen viewing experience and beyond.”

——-

Feature Image: DAYTONA BEACH, FL – FEBRUARY 17: William Byron, driver of the #24 Axalta Chevrolet, and Alex Bowman, driver of the #88 Nationwide Chevrolet, lead the field to the green flag to start the Monster Energy NASCAR Cup Series 61st Annual Daytona 500 at Daytona International Speedway on February 17, 2019 in Daytona Beach, Florida. (Photo by Jared C. Tilton/Getty Images)

from AWS Media Blog