
Major Wholesaler Grows Uptime by Refactoring eComm Apps for AWS DevOps


A recent IDC survey of the Fortune 1000 found that the average cost of an infrastructure failure is $100,000 per hour and the average total cost of unplanned application downtime per year is between $1.25 billion and $2.5 billion. Our most recent customer relies heavily on its eCommerce site for business and, knowing the extreme cost of infrastructure failure, turned to the benefits of cloud-based DevOps. The firm sought to increase uptime, scalability, and security for its eCommerce applications by refactoring them for AWS DevOps.

What is Refactoring?

Refactoring involves an advanced process of re-architecting and often re-coding some portion of an existing application to take advantage of cloud-native frameworks and functionality. While this approach can be time-consuming and resource-intensive, it offers low monthly cloud spend: organizations that refactor are able to modify their applications and infrastructure to take full advantage of cloud-native features and thereby maximize operational cost efficiencies in the cloud.

AWS DevOps Refactoring

The firm engaged the DevOps consulting team at Flux7 to help architect and build a DevOps platform solution. The team’s first goal was to ensure that the applications were architected for high availability at all levels in order to meet the company’s aggressive SLA goals. The first step was to build a common DevOps platform for the company’s eCommerce applications and migrate the underlying technology to a common stack consisting of ECS, CloudFormation, and GoCD, an open source build and release tool from ThoughtWorks. (In the process, the team migrated one of the two applications from Kubernetes and Terraform to the new technology stack.)

As business-critical applications for the future of the retailer, the eCommerce applications needed to provide greater uptime, scalability, and data security than the legacy, on-premises applications from which they were refactored. As a result, the AWS experts at Flux7 built a CI/CD platform using AWS DevOps best practices, effectively reducing manual tasks and thereby increasing the team’s ability to focus on strategic work.

Further, the Flux7 DevOps team worked alongside the retailer’s team to:

  • Migrate the refactored applications to new AWS Accounts using the new CI/CD platform;
  • Automate remediation, recovering from failures faster;
  • Create AWS Identity and Access Management (IAM) resources as infrastructure as code (IaC);
  • Deliver the new applications in a Docker container-based microservices environment;
  • Deploy CloudWatch and Splunk for security and log management; and
  • Create DR procedures for the new applications to further ensure uptime and availability.

Moving forward, application updates will be rolled out via a blue-green deployment process that Flux7 helped the firm establish in order to achieve its zero downtime goals.

Business Benefits

While the customer team is a very advanced developer team, they were able to further their skills, learning through Flux7 knowledge transfer sessions how to enable DevOps best practices and continue to accelerate the new AWS DevOps platform adoption. At an estimated downtime cost of 6x the industry average, this firm couldn’t withstand the financial or reputational impact of a downtime event. As a result, the team is happy to report that it is meeting its zero downtime SLA objectives, enabling continuous system availability and with it growing customer satisfaction.


from Flux7 DevOps Blog

AWS Case Study: Energy Leader Digitizes Library for Analytics, Compliance


The oil and gas industry has a rich history and one that is deeply intertwined with regulation — with Federal and State rules that regulate everything from exploration to production and transportation to workplace safety. As a result, our latest customer had amassed millions of paper documents to ensure its ability to prove compliance. It also maintained files with vast amounts of geological data that served as the backbone of its intellectual property.

With over seven million physical documents saved and filed in deep storage, this oil and gas industry leader called the AWS consulting services team at Flux7 for help digitizing its vast document library. In the process, it also wanted to make it easy to archive documents moving forward and ensure that its operators could easily search for and find data.

Read the full AWS Case Study here.

Working with AWS Consulting Partner Flux7, the company created a working plan to digitize and catalog its vast document library. AWS had recently announced a new tool at re:Invent, Amazon Textract, which, although still in preview, was the ideal tool for the task.

What is Textract?

For those of you unfamiliar with Amazon Textract, it is a new service that uses machine learning to automatically extract text and data from scanned documents. Unlike Optical Character Recognition (OCR) solutions, it also identifies the contents of fields in forms and information stored in tables, which allows users to conduct full data analytics on documents once they are digitized.

The Textract Proof of Concept

The proof of concept included several dozen physical documents that were scanned and uploaded to S3. From there, Lambda functions were triggered that launched Textract. In addition to the extracted data being presented in Kibana, URLs for the specific source documents are presented to users.

As Amazon Textract automatically detects the key elements in a document or data relationships in forms and tables, it is able to extract data within the context it was originally created. With a core set of key parameters, such as revision date, extracted by Textract, operators will be able to search by key business parameters.
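The case study doesn’t include code, but a minimal Python sketch of the S3-triggered flow might look like the following. The synchronous analyze_document call and the index_document helper are illustrative assumptions (a production pipeline for large, multi-page documents would use Textract’s asynchronous start_document_analysis API):

import boto3

textract = boto3.client("textract")

def handler(event, context):
    """Triggered by an S3 ObjectCreated event for a newly scanned document."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Ask Textract for raw text plus form and table structure.
        response = textract.analyze_document(
            Document={"S3Object": {"Bucket": bucket, "Name": key}},
            FeatureTypes=["FORMS", "TABLES"],
        )

        # Collect detected lines of text; a fuller implementation would also
        # walk the KEY_VALUE_SET blocks to pull out fields such as revision date.
        lines = [
            block["Text"]
            for block in response["Blocks"]
            if block["BlockType"] == "LINE"
        ]

        # Hypothetical downstream step: index the text and document URL so
        # Kibana can search it.
        index_document(bucket, key, "\n".join(lines))

def index_document(bucket, key, text):
    # Placeholder for the Elasticsearch/Kibana indexing described in the article.
    print(f"indexed s3://{bucket}/{key}: {len(text)} characters")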

Analytics and Compliance

Interfacing with the data via Kibana, end users can now create smart search indexes which allow them to quickly and easily find key business data. Operators can also build automated approval workflows and better meet document archival rules for regulatory compliance. And no longer does the company need to send an employee to the warehouse to retrieve files, eliminating a labor-intensive task.

At Flux7, we relish the ability to help organizations apply automation and free their employees from manual tasks, replacing them with time to focus on strategic, business-impacting work. Read more Energy industry AWS case studies for best practices in cloud-based DevOps automation for enterprise agility.

For five tips on how to apply DevOps in your Oil, Gas or Energy enterprise, check out this article our CEO, Dr. Suleman, recently wrote for Oilman magazine. (Note that a free subscription is required.) Or, download the full case study here today.


from Flux7 DevOps Blog

Improving and securing your game-binaries distribution at scale


This post is contributed by Yahav Biran | Sr. Solutions Architect, AWS and Scott Selinger | Associate Solutions Architect, AWS 

Continuous integration and continuous deployment (CI/CD) processes enable game publishers to improve games throughout their lifecycle. One of the challenges that game publishers face when employing CI/CD processes is distributing updated game binaries in a scalable, secure, and cost-effective way.

Often, CI/CD jobs contain minor changes that cause the CI/CD processes to push a full set of game binaries over the internet. This is a suboptimal approach. It negatively affects the cost of development network resources, customer network resources (output and input bandwidth), and the time it takes for a game update to propagate.

This post proposes a method of optimizing the game integration and deployments. Specifically, this method improves the distribution of updated game binaries to various targets, such as game-server farms. The proposed mechanism also adds to the security model designed to include progressive layers, starting from the Amazon EC2 instance that runs the game server. It also improves security of the game binaries, the game assets, and the monitoring of the game server deployments across several AWS Regions.

Why CI/CD in gaming is hard today

Game server binaries are usually a native application that includes graphics, sound, network, and physics assets, as well as scripts and media files. Game servers are usually developed with game engines like Unreal, Amazon Lumberyard, and Unity. Game binaries typically take up tens of gigabytes. However, because game developer teams modify only a few tens of kilobytes every day, frequent distribution of a full set of binaries is wasteful.

For a standard global game deployment, distributing game binaries requires compressing the entire binaries set and transferring the compressed version to destinations, then decompressing it upon arrival. You can optimize the process by decoupling the various layers, pushing and deploying them individually.

In both cases, the continuous deployment process might be slow due to the compression and transfer durations. Also, distributing the image binaries incurs unnecessary data transfer costs, since data is duplicated. Other game-binary distribution methods may require the game publisher’s DevOps teams to install and maintain custom caching mechanisms.

This post demonstrates an optimal method for distributing game server updates. The solution uses containerized images stored in Amazon ECR and deployed using Amazon ECS or Amazon EKS to shorten the distribution duration and reduce network usage.

How can containers help?

Dockerized game binaries enable standard caching with no implementation from the game publisher. Dockerized game binaries allow game publishers to stage their continuous build process in two ways:

  • To rebuild only the layer that was updated in a particular build process and reuse the other cached layers.
  • To reassemble both packages into a deployable game server.

The use of ECR with either ECS or EKS takes care of the last mile deployment to the Docker container host.

Larger application binaries mean longer application loading times. To reduce the overall application initialization time, I decouple the deployment of the binaries and media files to allow the application to update faster. For example, updates in the application media files do not require the replication of the engine binaries or media files. This is achievable if the application binaries can be deployed in a separate directory structure. For example:

/opt/local/engine

/opt/local/engine-media

/opt/local/app

/opt/local/app-media

Containerized game servers deployment on EKS

The application server can be deployed as a single Kubernetes pod with multiple containers. The engine media (/opt/local/engine-media), the application (/opt/local/app), and the application media (/opt/local/app-media) spawn as Kubernetes initContainers and the engine binary (/opt/local/engine) runs as the main container.

apiVersion: v1
kind: Pod
metadata:
  name: my-game-app-pod
  labels:
    app: my-game-app
spec:
  volumes:
    - name: engine-media-volume
      emptyDir: {}
    - name: app-volume
      emptyDir: {}
    - name: app-media-volume
      emptyDir: {}
  # Each initContainer copies its image's content into the matching emptyDir volume.
  initContainers:
    - name: engine-media
      image: the-engine-media-image
      imagePullPolicy: Always
      command:
        - "sh"
        - "-c"
        - "cp /* /opt/local/engine-media"
      volumeMounts:
        - name: engine-media-volume
          mountPath: /opt/local/engine-media
    - name: app
      image: the-app-image
      imagePullPolicy: Always
      command:
        - "sh"
        - "-c"
        - "cp /* /opt/local/app"
      volumeMounts:
        - name: app-volume
          mountPath: /opt/local/app
    - name: app-media
      image: the-app-media-image
      imagePullPolicy: Always
      command:
        - "sh"
        - "-c"
        - "cp /* /opt/local/app-media"
      volumeMounts:
        - name: app-media-volume
          mountPath: /opt/local/app-media
  containers:
    # The engine binary runs as the main container and mounts all three volumes.
    - name: the-engine
      image: the-engine-image
      imagePullPolicy: Always
      volumeMounts:
        - name: engine-media-volume
          mountPath: /opt/local/engine-media
        - name: app-volume
          mountPath: /opt/local/app
        - name: app-media-volume
          mountPath: /opt/local/app-media
      command: ['sh', '-c', '/opt/local/engine/start.sh']

Applying multi-stage game binaries builds

In this post, I use Docker multi-stage builds for containerizing the game asset builds. I use AWS CodeBuild to manage the build and to deploy the updates of game engines like Amazon Lumberyard as ready-to-play dedicated game servers.

Using this method, frequent changes in the game binaries require less than 1% of the data transfer typically required by full image replication to the nodes that run the game-server instances. This results in significant improvements in build and integration time.

I provide a deployment example for the Amazon Lumberyard Multiplayer Sample that is deployed to an EKS cluster, but this can also be done using a different container orchestration technology or game engine. I also show that the image deployed as a game-server instance is always the latest image, which allows centralized control over the code that is scheduled for distribution.

This example shows an update of only 50 MB of game assets, whereas the full game-server binary is 3.1 GB. With only 1.5% of the content being updated, that speeds up the build process by 90% compared to non-containerized game binaries.

For security with EKS, apply the imagePullPolicy: Always option, a Kubernetes best practice for container image deployment. This option ensures that the latest image is pulled every time the pod is started, so images are always deployed from a single source, in this case ECR.

Example setup

  • Read through the following sample, a multiplayer game sample, and see how to build and structure multiplayer games to employ the various features of the GridMate networking library.
  • Create an AWS CodeCommit or GitHub repository (multiplayersample-lmbr) that includes the game engine binaries, the game assets (.pak, .cfg and more), AWS CodeBuild specs, and EKS deployment specs.
  • Create a CodeBuild project that points to the CodeCommit repo. The build image uses aws/codebuild/docker:18.09.0: the built-in image maintained by CodeBuild configured with 3 GB of memory and two vCPUs. The compute allocated for build capacity can be modified for cost and build time tradeoff.
  • Create an EKS cluster designated as a staging or an integration environment for the game title. In this case, it’s multiplayersample.

The binaries build Git repository

The Git repository is composed of five core components ordered by their size:

  • The game engine binaries (for example, BinLinux64.Dedicated.tar.gz). This is the compressed version of the game engine artifacts that are not updated regularly, hence they are deployed as a compressed file. The maintenance of this file is usually done by a different team than the developers working on the game title.
  • The game binaries (for example, MultiplayerSample_pc_Paks_Dedicated). This directory is maintained by the game development team and managed as a standard multi-branch repository. The artifacts under this directory get updated on a daily or weekly basis, depending on the game development plan.
  • The build-related specifications (for example, buildspec.yml and Dockerfile). These files specify the build process. For simplicity, I only included the Docker build process to convey the speed of continuous integration. The process can easily be extended to include the game compilation and link process as well.
  • The Docker artifacts for containerizing the game engine and the game binaries (for example, start.sh and start.py). These scripts are usually maintained by the game DevOps teams and updated outside of the regular game development plan. More details about these scripts can be found in a sample that describes how to deploy a game server in Amazon EKS.
  • The deployment specifications (for example, eks-spec) specify the Kubernetes game-server deployment specs. This is for reference only, since the CD process usually runs in a separate set of resources like staging EKS clusters, which are owned and maintained by a different team.

The game build process

The build process starts with any Git push event on the Git repository. The build process includes three core phases, denoted pre_build, build, and post_build, in multiplayersample-lmbr/buildspec.yml.

  1. The pre_build phase unzips the game-engine binaries and logs in to the container registry (Amazon ECR) to prepare.
  2. The build phase executes the docker build command that includes the multi-stage build.
    • The Dockerfile spec file describes the multi-stage image build process. It starts by adding the game-engine binaries to the Linux OS, ubuntu:18.04 in this example.
    • FROM ubuntu:18.04
    • ADD BinLinux64.Dedicated.tar /
    • It continues by adding the necessary packages to the game server (for example, ec2-metadata, boto3, libc, and Python) and the necessary scripts for controlling the game server runtime in EKS. These packages are only required for the CI/CD process. Therefore, they are only added in the CI/CD process. This enables a clean decoupling between the necessary packages for development, integration, and deployment, and simplifies the process for both teams.
    • RUN apt-get install -y python python-pip
    • RUN apt-get install -y net-tools vim
    • RUN apt-get install -y libc++-dev
    • RUN pip install mcstatus ec2-metadata boto3
    • ADD start.sh /start.sh
    • ADD start.py /start.py
    • The second part is to copy the game engine from the previous stage --from=0 to the next build stage. In this case, you copy the game engine binaries with the two COPY Docker directives.
    • COPY --from=0 /BinLinux64.Dedicated/* /BinLinux64.Dedicated/
    • COPY --from=0 /BinLinux64.Dedicated/qtlibs /BinLinux64.Dedicated/qtlibs/
    • Finally, the game binaries are added as a separate layer on top of the game-engine layers, which concludes the build. It’s expected that constant daily changes are made to this layer, which is why it is packaged separately. If your game includes other abstractions, you can break this step into several discrete Docker image layers.
    • ADD MultiplayerSample_pc_Paks_Dedicated /BinLinux64.Dedicated/
  3. The post_build phase pushes the game Docker image to the centralized container registry for further deployment to the various regional EKS clusters. In this phase, tag and push the new image to the designated container registry in ECR.

- docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

- docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

The game deployment process in EKS

At this point, you’ve pushed the updated image to the designated container registry in ECR (/$IMAGE_REPO_NAME:$IMAGE_TAG). This image is scheduled as a game server in an EKS cluster as a game-server Kubernetes deployment, as described in the sample.

In this example, I use  imagePullPolicy: Always.


containers:
…
        image: /$IMAGE_REPO_NAME:$IMAGE_TAG/multiplayersample-build
        imagePullPolicy: Always
        name: multiplayersample
…

By using imagePullPolicy: Always, you ensure that no one can circumvent Amazon ECR security, and you can securely make ECR the single source of truth with regards to scheduled binaries. However, every pod start then pulls the image from ECR down to the worker nodes via kubelet, the node agent. Given the size of a whole image combined with the frequency with which it is pulled, that would amount to a significant additional cost to your project.

Fortunately, Docker layers allow you to update only the layers that were modified, preventing a whole-image transfer. They also enable secure image distribution. In this example, only the layer MultiplayerSample_pc_Paks_Dedicated is updated.

Proposed CI/CD process

The following diagram shows an example end-to-end architecture of a full-scale game-server deployment using EKS as the orchestration system, ECR as the container registry, and CodeBuild as the build engine.

Game developers merge changes to the Git repository that include both the preconfigured game-engine binaries and the game artifacts. Upon merge events, CodeBuild builds a multistage game-server image that is pushed to a centralized container registry hosted by ECR. At this point, DevOps teams in different Regions continuously schedule the image as a game server, pulling only the updated layer in the game server image. This keeps the entire game-server fleet running the same game binaries set, making for a secure deployment.

 

Try it out

I published two examples to guide you through the process of building an Amazon EKS cluster and deploying a containerized game server with large binaries.

Conclusion

Adopting CI/CD in game development improves the software development lifecycle by continuously deploying updated, quality-checked game binaries. CI/CD in game development is usually hindered by the cost of distributing large binaries, particularly in cross-regional deployments.

Non-containerized paradigms require deployment of the full set of binaries, which is an expensive and time-consuming task. Containerizing game-server binaries with AWS build tools and Amazon EKS-based regional clusters of game servers enables secure, cost-effective distribution of large binary sets and increases agility in today’s game development.

In this post, I demonstrated a reduction of more than 90% of the network traffic required by implementing an effective CI/CD system in a large-scale deployment of multiplayer game servers.

from AWS Compute Blog

AWS CodePipeline Approval Gate Tracking


With the pursuit of DevOps automation and CI/CD (Continuous Integration/Continuous Delivery), many companies are now migrating their applications onto the AWS cloud to take advantage of the service capabilities AWS has to offer. AWS provides native tools to help achieve CI/CD, and one of the core services it provides for this is AWS CodePipeline. CodePipeline is a service that allows a user to build a CI/CD pipeline for the automated build, test, and deployment of applications.

A common practice in using CodePipeline for CI/CD is to be able to automatically deploy applications into multiple lower environments before reaching production. These lower environments for deployed applications could be used for development, testing, business validation, and other use cases. As a CodePipeline progresses through its stages, it is often required by businesses that there are manual approval gates in between the deployments to further environments.

Each time a CodePipeline reaches one of these manual approval gates, a human is required to log into the console and manually either approve (allow the pipeline to continue) or reject (stop the pipeline from continuing) the gate. Oftentimes, different teams or divisions of a business are responsible for their own application environments and, as a result, are also responsible for either allowing or rejecting a pipeline to continue deployment into their environment via the corresponding manual approval gate.

A problem that a business may run into is trying to figure out a way to easily keep track of who is approving/rejecting which approval gates in which pipelines. With potentially hundreds of pipelines deployed in an account, it may be very difficult to keep track of and record approval gate actions through manual processes. For audits, this creates a cumbersome problem, as there may eventually be a need to provide evidence of who approved/rejected a specific pipeline on a certain date and the reasoning behind that decision.

So how can we keep a long term record of CodePipeline manual approval gate actions in an automated, scalable, and organized fashion? Through the use of AWS CloudTrail, AWS Lambda, AWS CloudWatch Events, AWS S3, and AWS SNS we can create a solution that provides this type of record keeping.

Each time someone approves/rejects an approval gate within a CodePipeline, that API call is logged in CloudTrail under the event name of “PutApprovalResult”. Through the use of an AWS CloudWatch event rule, we can configure that rule to listen for that specific CloudTrail API action and trigger a Lambda function to perform a multitude of tasks. This is what that CloudTrail event looks like inside the AWS console.


{
    "eventVersion": "1.05",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AAAABBBCCC111222333:newuser",
        "arn": "arn:aws:sts::12345678912:assumed-role/IamOrg/newuser",
        "accountId": "12345678912",
        "accessKeyId": "1111122222333334444455555",
        "sessionContext": {
            "attributes": {
                "mfaAuthenticated": "true",
                "creationDate": "2019-05-23T15:02:42Z"
            },
            "sessionIssuer": {
                "type": "Role",
                "principalId": "1234567093756383847",
                "arn": "arn:aws:iam::12345678912:role/OrganizationAccountAccessRole",
                "accountId": "12345678912",
                "userName": "newuser"
            }
        }
    },
    "eventTime": "2019-05-23T16:01:25Z",
    "eventSource": "codepipeline.amazonaws.com",
    "eventName": "PutApprovalResult",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "1.1.1.1",
    "userAgent": "aws-internal/3 aws-sdk-java/1.11.550 Linux/4.9.137-0.1.ac.218.74.329.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.212-b03 java/1.8.0_212 vendor/Oracle_Corporation",
    "requestParameters": {
        "pipelineName": "testing-pipeline",
        "stageName": "qa-approval",
        "actionName": "qa-approval",
        "result": {
            "summary": "I approve",
            "status": "Approved"
        },
        "token": "123123123-abcabcabc-123123123-abcabc"
    },
    "responseElements": {
        "approvedAt": "May 23, 2019 4:01:25 PM"
    },
    "requestID": "12345678-123a-123b-123c-123456789abc",
    "eventID": "12345678-123a-123b-123c-123456789abc",
    "eventType": "AwsApiCall",
    "recipientAccountId": "12345678912"
}

When that CloudWatch event rule is triggered, the Lambda function that it executes can be configured to perform multiple tasks (a minimal sketch follows the list below), including:

  • Capture the CloudTrail event log data from that “PutApprovalResult” API call and log it into the Lambda function’s CloudWatch log group.
  • Create a dated text file entry in an S3 bucket containing useful and unique information about the pipeline manual approval gate action.
  • Send out an email notification containing unique information about the pipeline manual approval gate action.
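A minimal sketch of such a Lambda handler is shown below. The environment variable names, key layout, and message format are illustrative assumptions; the full implementation is in the repository linked at the end of this post:

import json
import os
from datetime import datetime

import boto3

s3 = boto3.client("s3")
sns = boto3.client("sns")

BUCKET = os.environ["TRACKING_BUCKET"]        # hypothetical bucket name
TOPIC_ARN = os.environ["NOTIFICATION_TOPIC"]  # hypothetical SNS topic ARN

def handler(event, context):
    """Invoked by the CloudWatch Event Rule for PutApprovalResult calls."""
    detail = event["detail"]
    params = detail["requestParameters"]
    pipeline = params["pipelineName"]
    gate = params["actionName"]
    status = params["result"]["status"]      # "Approved" or "Rejected"
    summary = params["result"]["summary"]
    approver = detail["userIdentity"]["arn"]

    # 1. Log the full CloudTrail event into this function's CloudWatch log group.
    print(json.dumps(detail))

    # 2. Write a dated text entry, organized by pipeline/year/month/day.
    now = datetime.utcnow()
    key = (
        f"PipelineApprovalGateActions/{pipeline}/{now:%Y/%m/%d}/"
        f"{gate}-{status.upper()}-{now:%I:%M:%S-%p}.txt"
    )
    body = f"{approver} {status.lower()} gate {gate} in {pipeline}: {summary}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode("utf-8"))

    # 3. Send an email notification through the SNS topic.
    sns.publish(
        TopicArn=TOPIC_ARN,
        Subject=f"{pipeline} gate {gate} {status.lower()}",
        Message=body,
    )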

The CloudWatch Event Rule provides a way to narrow down and capture the specific CloudTrail event named “PutApprovalResult”. Below is a snippet of this event rule defined in AWS CloudFormation.

  ApprovalGateEventRule:
    Type: AWS::Events::Rule
    Properties: 
      Description: Event Rule that tracks whenever someone approves/rejects an approval gate in a pipeline
      EventPattern: 
        {
          "source": [
            "aws.codepipeline"
          ],
          "detail-type": [
            "AWS API Call via CloudTrail"
          ],
          "detail": {
            "eventSource": [
              "codepipeline.amazonaws.com"
            ],
            "eventName": [
              "PutApprovalResult"
            ]
          }
        }

The Lambda Function provides the automation and scalability needed to perform this type of approval gate tracking at any scale. The SNS topic provides the ability to send out email alerts whenever someone approves or rejects a manual approval gate in any pipeline.

The recorded text file entries in the S3 bucket provide the long term and durable storage solution to keeping track of CodePipeline manual approval gate results. To ensure an easy way to go back and discover those results, it is best to organize those entries in an appropriate manner such as by “pipeline_name/year/month/day/gate_name_timed_entry.txt“. An example of a recording could look like this:

PipelineApprovalGateActions/testing-pipeline/2019/05/23/dev-approval-APPROVED-11:50:45-AM.txt

Below is a diagram of a solution that can provide the features described above.

The source code and CloudFormation template for a fully built-out implementation of this solution can be found here: codepipeline-approval-gate-tracking.



from Blog – Stelligent

IT Modernization and DevOps News Week in Review


The Uptime Institute announced findings of its ninth annual Data Center Survey, unveiling several interesting — and important — data points. Underscoring what many in the industry are feeling about the skill gap, the survey found that 61% of respondents said they had difficulty retaining or recruiting staff — up from 55% a year earlier. And, according to the synopsis, “while the lack of women working in data centers is well-known, the extent of the imbalance is notable” with one-quarter of respondents saying they had no women at all on their design, build or operations teams.


When it comes to downtime, outages continue to cause significant problems. Showing little improvement over the past year, 34% of respondents said they had an outage or severe IT service degradation in the past year, and 10% said their most significant outage cost more than $1 million. When it comes to public cloud, 20% of operators reported that they would be more likely to put workloads in a public cloud if there were more visibility, while 50% of respondents already using public cloud for mission-critical applications said that they do not have adequate visibility.

DevOps News

  • Atlassian has announced Status Embed, a service designed to boost customer experience and communication by displaying the current state of services where customers are most likely to see it, such as your homepage, app or help center.
  • GitHub has brought to market repository templates to make boilerplate code management and distribution a “first-class citizen” on GitHub, according to the company.
  • HashiCorp announced the availability of HashiCorp Nomad 0.9.2, a workload orchestrator for deploying containerized and legacy apps across multiple regions or cloud providers. Nomad 0.9.2 includes preemption capabilities for service and batch jobs.
  • SDXCentral reports that, “VMware is developing a multi-cloud management tool that Joe Kinsella, chief technology officer of CloudHealth at VMware, describes as ‘Google docs for IT management, which is the ability to collaborate and share across an organization.’”

AWS News

  • Amazon announced that AWS Organizations now supports tagging and untagging of AWS accounts, allowing operators to assign custom attributes, or tags, to the AWS accounts they manage with AWS Organizations. According to AWS, the ability to attach tags such as owner name, project, business group, cost center, and environment directly to an AWS account makes it easier for people in the organization to get information on particular AWS accounts without having to refer to a separate spreadsheet or other out-of-band tracking method. (A minimal tagging sketch follows this list.)
  • Also introduced this week is AWS Systems Manager OpsCenter, which is designed to help operators view, investigate, and resolve operational issues related to their environment from a central location.
  • Amazon has launched a new service to enhance recovery. Host Recovery for Amazon EC2 will now automatically restart instances on a new host in the event of an unexpected hardware failure on a Dedicated Host. Host Recovery will reduce the need for manual intervention, minimize recovery time and lower the operational burden for instances running on Dedicated Hosts. As a bonus, it has built-in integration with AWS License Manager to automatically track and manage licenses. There are no additional EC2 charges for using Host Recovery.
  • Last, our AWS Consulting team thought this foundational blog on Getting started with serverless was a good read for those of you looking to build serverless applications to take advantage of its agility and reduced TCO.
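As referenced in the first item above, a minimal sketch of tagging an account with the AWS SDK for Python looks like this; the account ID and tag values are made up, and the calls must run from the organization’s management account:

import boto3

org = boto3.client("organizations")

# Hypothetical member account ID and tags.
org.tag_resource(
    ResourceId="123456789012",
    Tags=[
        {"Key": "owner", "Value": "ecommerce-team"},
        {"Key": "environment", "Value": "staging"},
        {"Key": "cost-center", "Value": "4711"},
    ],
)

# Read the tags back to confirm.
for tag in org.list_tags_for_resource(ResourceId="123456789012")["Tags"]:
    print(tag["Key"], "=", tag["Value"])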

Flux7 News

  • Join AWS and Flux7 as they present a one day workshop on how Serverless Technology is impacting business now (and what you need to get started). Serverless technology on AWS is enabling companies by building modern applications with increased agility and lower total cost of ownership. Find additional information and register here.
  • Read CEO Dr. Suleman’s InformationWeek article, Five-Step Action Plan for DevOps at Scale in which he discusses how DevOps is achievable at enterprise scale if you start small, create a dedicated team and effectively use technology patterns and platforms.
  • Also published this week is Dr. Suleman’s take on servant leadership in Forbes. In Why CIOs Should Have A Servant-Leadership Approach, he shares why CIOs shouldn’t be in a position where they end up needing to justify their efforts. Read the article for the reason why. (No, it isn’t the brash conclusion you might think it is.)


Written by Flux7 Labs

Flux7 is the only Sherpa on the DevOps journey that assesses, designs, and teaches while implementing a holistic solution for its enterprise customers, thus giving its clients the skills needed to manage and expand on the technology moving forward. Not a reseller or an MSP, Flux7 recommendations are 100% focused on customer requirements and creating the most efficient infrastructure possible that automates operations, streamlines and enhances development, and supports specific business goals.

from Flux7 DevOps Blog

Backup and Restore in Same SQL Server RDS


Written by SelvaKumar K, Sr. Database Administrator at Powerupcloud Technologies.

Problem Scenario :

One of our customers reported that a production database had been corrupted and needed to be backed up and restored under a different name in the same RDS instance. This isn’t possible in AWS RDS; if we try to restore, we get the error below.

Limitations :

Database <database_name> cannot be restored because there is already an existing database with the same family_guid on the instance

You can’t restore a backup file to the same DB instance that was used to create the backup file. Instead, restore the backup file to a new DB instance.

Approaches to Backup and Restore :

Option 1:

1. Import and export into the same RDS instance

The database is corrupted, so we can’t proceed with this option.

Option 2:

2. Backup and restore into a different RDS instance using S3

2.1. Backup from the production RDS instance

exec msdb.dbo.rds_backup_database
    @source_db_name='selva',
    @s3_arn_to_backup_to='arn:aws:s3:::mmano/selva.bak',
    @overwrite_S3_backup_file=1,
    @type='FULL';

Check the status with the command below:

exec msdb.dbo.rds_task_status @db_name='selva_selva';

2.2. Restore into a different RDS instance, or download from S3 and restore into a local SQL Server instance

exec msdb.dbo.rds_restore_database
    @restore_db_name='selva',
    @s3_arn_to_restore_from='arn:aws:s3:::mmano/selva.bak';

2.3. In another RDS instance or a local instance

Restore the database into a local dev or staging instance.

a. Create a new database as selva_selva

b. Using the Generate Scripts wizard, generate scripts and execute them in the newly created database

Click Database → Tasks → Generate Scripts

Click Next → Select Specific Database Objects → Select required objects

Click Next → Save to a new query window

Click Advanced → set Script Indexes to True (the only change required)

Click Next → Next → once the script is generated, close this window

The scripts are generated in a query window; select the required database and execute the scripts

A direct export and import will not work due to the foreign key relationships, so we need to run the scripts below and save their output in Notepad

2.4. Prepare the create and drop foreign key constraint scripts for the data load using the scripts below

-- SCRIPT TO GENERATE THE CREATION SCRIPT OF ALL FOREIGN KEY CONSTRAINTS
-- Written by Percy Reyes (www.percyreyes.com)

declare @ForeignKeyID int
declare @ForeignKeyName varchar(4000)
declare @ParentTableName varchar(4000)
declare @ParentColumn varchar(4000)
declare @ReferencedTable varchar(4000)
declare @ReferencedColumn varchar(4000)
declare @StrParentColumn varchar(max)
declare @StrReferencedColumn varchar(max)
declare @ParentTableSchema varchar(4000)
declare @ReferencedTableSchema varchar(4000)
declare @TSQLCreationFK varchar(max)

declare CursorFK cursor for select object_id from sys.foreign_keys
open CursorFK
fetch next from CursorFK into @ForeignKeyID
while (@@FETCH_STATUS=0)
begin
    set @StrParentColumn=''
    set @StrReferencedColumn=''

    declare CursorFKDetails cursor for
        select fk.name ForeignKeyName, schema_name(t1.schema_id) ParentTableSchema,
               object_name(fkc.parent_object_id) ParentTable, c1.name ParentColumn,
               schema_name(t2.schema_id) ReferencedTableSchema,
               object_name(fkc.referenced_object_id) ReferencedTable, c2.name ReferencedColumn
        from sys.foreign_keys fk
        inner join sys.foreign_key_columns fkc on fk.object_id=fkc.constraint_object_id
        inner join sys.columns c1 on c1.object_id=fkc.parent_object_id and c1.column_id=fkc.parent_column_id
        inner join sys.columns c2 on c2.object_id=fkc.referenced_object_id and c2.column_id=fkc.referenced_column_id
        inner join sys.tables t1 on t1.object_id=fkc.parent_object_id
        inner join sys.tables t2 on t2.object_id=fkc.referenced_object_id
        where fk.object_id=@ForeignKeyID
    open CursorFKDetails
    fetch next from CursorFKDetails into @ForeignKeyName, @ParentTableSchema, @ParentTableName, @ParentColumn, @ReferencedTableSchema, @ReferencedTable, @ReferencedColumn
    while (@@FETCH_STATUS=0)
    begin
        set @StrParentColumn=@StrParentColumn + ', ' + quotename(@ParentColumn)
        set @StrReferencedColumn=@StrReferencedColumn + ', ' + quotename(@ReferencedColumn)
        fetch next from CursorFKDetails into @ForeignKeyName, @ParentTableSchema, @ParentTableName, @ParentColumn, @ReferencedTableSchema, @ReferencedTable, @ReferencedColumn
    end
    close CursorFKDetails
    deallocate CursorFKDetails

    set @StrParentColumn=substring(@StrParentColumn,2,len(@StrParentColumn)-1)
    set @StrReferencedColumn=substring(@StrReferencedColumn,2,len(@StrReferencedColumn)-1)

    set @TSQLCreationFK='ALTER TABLE '+quotename(@ParentTableSchema)+'.'+quotename(@ParentTableName)
        +' WITH CHECK ADD CONSTRAINT '+quotename(@ForeignKeyName)
        +' FOREIGN KEY('+ltrim(@StrParentColumn)+') '+char(13)
        +'REFERENCES '+quotename(@ReferencedTableSchema)+'.'+quotename(@ReferencedTable)+' ('+ltrim(@StrReferencedColumn)+')'+';'
    print @TSQLCreationFK

    fetch next from CursorFK into @ForeignKeyID
end
close CursorFK
deallocate CursorFK

-- SCRIPT TO GENERATE THE DROP SCRIPT OF ALL FOREIGN KEY CONSTRAINTS

declare @ForeignKeyName varchar(4000)
declare @ParentTableName varchar(4000)
declare @ParentTableSchema varchar(4000)
declare @TSQLDropFK varchar(max)

declare CursorFK cursor for
    select fk.name ForeignKeyName, schema_name(t.schema_id) ParentTableSchema, t.name ParentTableName
    from sys.foreign_keys fk inner join sys.tables t on fk.parent_object_id=t.object_id
open CursorFK
fetch next from CursorFK into @ForeignKeyName, @ParentTableSchema, @ParentTableName
while (@@FETCH_STATUS=0)
begin
    set @TSQLDropFK='ALTER TABLE '+quotename(@ParentTableSchema)+'.'+quotename(@ParentTableName)+' DROP CONSTRAINT '+quotename(@ForeignKeyName)+';'
    print @TSQLDropFK
    fetch next from CursorFK into @ForeignKeyName, @ParentTableSchema, @ParentTableName
end
close CursorFK
deallocate CursorFK

Save the output of the scripts above, then proceed with the steps below.

2.5. Execute the drop foreign key scripts in the newly created database

2.6. Using the Import and Export Wizard, transfer the data from the old database to the new database

Select Data Source for Data Pull

Select the Destination Server to Data Push

Click Next → Copy data from one or more tables or views

Click Next → Select the required tables to copy the data

Click Next and Verify the Source and Destination

2.7. Once the data load is complete, execute the create foreign key constraint scripts

Final Step:

3. Back up the database and restore it into the production RDS instance with a different name

from Powerupcloud Tech Blog – Medium

Digital Transformation & The Agile Enterprise in Oil and Gas


According to the World Economic Forum, digital transformation could unlock approximately $1.6 trillion of value for the Oil and Gas industry, its customers and society. This value is derived from greater productivity, better system efficiency, savings from reduced resource usage, and fewer spills and emissions. Yet, the journey to these digital transformation benefits begins with a proverbial first step which can be elusive for large oil and gas enterprises who have vast legacy technologies and complicated organizational structures to navigate.

At Flux7, we are proponents of the Agile Enterprise. While much work has been put into defining what makes an enterprise agile, we are fans of the research by McKinsey, which found a set of five disciplines that agile enterprises share. Defined by their practices more than anything else, these agile organizations deploy an agile culture and agile technology to effectively support their digital transformation initiatives.

Becoming an Agile Enterprise is critically important within the oil and gas industries where unparalleled transformation is happening in rapid fashion. From new extraction methods to IoT and changing customer expectations, the industry is evolving quickly. For long-term, scalable success, digital efforts must be a cornerstone as organizations transition to becoming an Agile Enterprise.

DevOps for Oil and Gas

Equal parts people, process and technology, DevOps is a key component of marrying digital and agile. With a solid cloud-based DevOps platform, automation to streamline processes and ensure they are followed, and a Center of Excellence in place to help train teams, oil and gas enterprises have a roadmap to digital transformation success with DevOps.

For a more detailed road map to DevOps success across the enterprise, please download our white paper:

5 Steps to Enterprise DevOps at Scale

Let’s explore a few examples of organizations in the energy industry that have applied DevOps best practices to facilitate digital transformation and reach greater enterprise agility:

TechnipFMC, a world leader in project management, engineering, and construction for the energy industry, was looking to ensure compliance and security for cloud computing for its global sites and the perimeter networks that support its client-facing applications. To help accomplish this goal, TechnipFMC wanted to create a consistent, self-service solution to enable its global IT employees to easily provision cloud infrastructure and migrate externally facing Microsoft SharePoint sites to the cloud. With templates and automation, TechnipFMC can now enforce security and compliance standards in every deployment, which enhances overall perimeter network security. In addition, TechnipFMC is expecting to reduce operational costs while growing operational effectiveness. Listen as TechnipFMC’s John Hutchinson shares the experience at re:Invent or read the full Technip story.

A renewable energy leader had two parallel goals: It wanted to use an AWS cloud migration strategy as an opportunity to overhaul its business systems and in the process, the company wanted to build standardization. Moreover, it aimed to increase developer agility, grow global access for its workers and decrease capital expenses. Based on its application portfolio TCO analysis, a lift and shift migration approach was pursued. With 80% of its applications now defined by a small number of templates, the company has standardized its software builds, ensuring security best practices are followed by default. The enterprise has accelerated its time to innovation and speed to market while growing operational efficiencies. Preview their story here.

Fugro, which collects and provides highly specialized interpretation of oceanic geological data, is able to keep skilled staff onshore using an Internet of Things (IoT) platform model. Called OARS, its cloud-based project provides faster interpretation of data and decisions. With continuous delivery of code, its vessels are sure to always have the newest software features at their fingertips. And, new environments which previously took weeks to build, now launch in a matter of hours, providing better access to information across global regions. Read the full Fugro case study here.

A global oil field services company was looking to embrace digitalization with a SaaS model solution that sought to integrate data and business process management and in the process address operational workflows that would lead to greater scalability and more efficient delivery. The firm implemented a pipeline for delivering AMIs that are provisioned using Ansible and Docker containers, thereby streamlining complex workflows, allowing the firm to reap efficiencies of scale from automation, meet tight deadlines and ensure SOC2 compliance. Now the firm has pipelines for delivering resources and processes to build and deploy current and future solutions — ensuring digital transformation in the short- and long- term.

We are living in an uncertain, complex and constantly changing world. To stay competitive, oil and gas enterprises are expected to react to changes at unprecedented speed, which has ushered in a strong focus on becoming an agile enterprise. Effectively balance stability with ever-evolving customer needs, technologies, and overall market conditions with DevOps best practices as your foundation to scalable digital transformation.

For five tips on how to apply DevOps in your Oil, Gas or Energy enterprise, check out this article our CEO, Dr. Suleman, recently wrote for Oilman magazine. (Note that a free subscription is required.)  Or, you can find additional resources on our Energy resource page.


from Flux7 DevOps Blog

WSFC and AlwaysOn Availability Groups on AWS Cloud


Written by SelvaKumar K, Sr. Database Administrator at Powerupcloud Technologies.

What is Failover Clustering?

A failover cluster is a group of independent computers that work together to increase the availability and scalability of clustered roles (formerly called clustered applications and services). The clustered servers (called nodes) are connected by physical cables and by software. If one or more of the cluster nodes fail, other nodes begin to provide service (a process known as failover). In addition, the clustered roles are proactively monitored to verify that they are working properly. If they are not working, they are restarted or moved to another node

What is AlwaysOn Availability Group?

An availability group supports a replicated environment for a discrete set of user databases, known as availability databases. You can create an availability group for high availability (HA) or for read-scale. An HA availability group is a group of databases that fail over together. A read-scale availability group is a group of databases that are copied to other instances of SQL Server for read-only workload

What we cover in this post:

  1. Implementing Windows Server Failover Clustering (WSFC) in AWS Cloud and configuring an AlwaysOn Availability Group between two Windows Servers.

2. As with an on-premises server, we can install and configure the WSFC cluster and SQL Server 2017 “AlwaysOn Availability Group” in AWS Cloud to access the SQL database server from the outside world through an AG Listener at 99.99% uptime.

3. We implemented SQL Server AlwaysOn with minimal-configuration instances and SQL Server 2017 Developer Edition. We configured AlwaysOn without shared storage; if you want shared storage, use the AWS Storage Gateway service.

Architecture:

Implement Prerequisites from AWS :

  1. AWS VPC ( ag-sql-vpc )

2. AWS Subnets ( two private and two public subnets )

Launch and Configure the server Infrastructure :

The setup requires three EC2 instances for the AlwaysOn setup, placed in different Availability Zones. The minimum requirement for the SQL Server instances is t2.small.

Our setup is configured without shared storage, so add an additional 50 GB disk to each EC2 instance. In addition, secondary IPs are needed for the Windows cluster resource and the AG Listener.

Disk and Secondary IP for the EC2 Instances :

Security Groups :

Each EC2 instance’s security group allows all required ports between the Active Directory and SQL Server instances.

Implement and configure Active Directory Domain Service :

The Active Directory domain (agsql.com) is configured on the ag-sql-AD server; add the SQL Server instances (ag-sql-node1 and ag-sql-node2) to the agsql.com domain.

Implement and Configure WSFC :

Multiple reboots are needed once the SQL Server instances are joined to the agsql.com Active Directory domain. Let’s start configuring the failover clustering role on each server.

The failover clustering role needs to be added on both servers; then start creating the cluster.

Add the SQL Server nodes in Create Cluster and perform all necessary validation tests for Windows cluster creation.

Assign the secondary IPs to the Windows cluster and bring the cluster resources online. Once the cluster resource is ready, start installing SQL Server 2017 Developer Edition on both SQL Server instances in parallel.

Once the SQL Server installation is complete, enable AlwaysOn Availability Groups in the SQL Server service and restart the SQL service on both SQL Server instances.

We are now ready with Windows failover clustering and SQL Server set up on both instances. Start creating the AlwaysOn Availability Group and configure the AG Listener.

Step 1: Specify a name for the AlwaysOn group

Step 2: Connect the replica node for the AlwaysOn group

Step 3: Specify the secondary IP addresses for the AG Listener

The AG Listener (aglistener) is added to the Active Directory DNS name and can be reached from the outside world to access the SQL Servers on the respective IP addresses. We will be able to ping or telnet the listener from the agsql.com domain account.

Step 4: Use the AlwaysOn dashboard to check database sync status

DNS and Active Directory Computers configuration isn’t covered in this setup; those objects are created automatically in the Active Directory server.

Finally, AlwaysOn Availability Group Ready in AWS Cloud !!!

from Powerupcloud Tech Blog – Medium

Integrating AWS X-Ray with AWS App Mesh


This post is contributed by Lulu Zhao | Software Development Engineer II, AWS

 

AWS X-Ray helps developers and DevOps engineers quickly understand how an application and its underlying services are performing. When it’s integrated with AWS App Mesh, the combination makes for a powerful analytical tool.

X-Ray helps to identify and troubleshoot the root causes of errors and performance issues. It’s capable of analyzing and debugging distributed applications, including those based on a microservices architecture. It offers insights into the impact and reach of errors and performance problems.

In this post, I demonstrate how to integrate it with App Mesh.

Overview

App Mesh is a service mesh based on the Envoy proxy that makes it easy to monitor and control microservices. App Mesh standardizes how your microservices communicate, giving you end-to-end visibility and helping to ensure high application availability.

With App Mesh, it’s easy to maintain consistent visibility and network traffic control for services built across multiple types of compute infrastructure. App Mesh configures each service to export monitoring data and implements consistent communications control logic across your application.

A service mesh is like a communication layer for microservices. All communication between services happens through the mesh. Customers use App Mesh to configure a service mesh that contains virtual services, virtual nodes, virtual routers, and corresponding routes.

However, it’s challenging to visualize the way that request traffic flows through the service mesh while attempting to identify latency and other types of performance issues. This is particularly true as the number of microservices increases.

It’s in exactly this area where X-Ray excels. To show a detailed workflow inside a service mesh, I implemented a tracing extension called X-Ray tracer inside Envoy. With it, I ensure that I’m tracing all inbound and outbound calls that are routed through Envoy.

Traffic routing with color app

The following example shows how X-Ray works with App Mesh. I used the Color App, a simple demo application, to showcase traffic routing.

This app has two Go applications that are included in the AWS X-Ray Go SDK: color-gateway and color-teller. The color-gateway application is exposed to external clients and responds to http://service-name:port/color, which retrieves color from color-teller. I deployed color-app using Amazon ECS. This image illustrates how color-gateway routes traffic into a virtual router and then into separate nodes using color-teller.

 

The following image shows client interactions with App Mesh in an X-Ray service map after requests have been made to the color-gateway and to color-teller.

Integration

There are two types of service nodes:

  • AWS::AppMesh::Proxy is generated by the X-Ray tracing extension inside Envoy.
  • AWS::ECS::Container is generated by the AWS X-Ray Go SDK.

The service graph arrows show the request workflow, which you may find helpful as you try to understand the relationships between services.

To send Envoy-generated segments into X-Ray, install the X-Ray daemon. The following code example shows the ECS task definition used to install the daemon into the container.

{
    "name": "xray-daemon",
    "image": "amazon/aws-xray-daemon",
    "user": "1337",
    "essential": true,
    "cpu": 32,
    "memoryReservation": 256,
    "portMappings": [
        {
            "hostPort": 2000,
            "containerPort": 2000,
            "protocol": "udp"
        }
    ]
}

After the Color app successfully launched, I made a request to color-gateway to fetch a color.

  • First, the Envoy proxy appmesh/colorgateway-vn in front of default-gateway received the request and routed it to the server default-gateway.
  • Then, default-gateway made a request to server default-colorteller-white to retrieve the color.
  • Instead of directly calling the color-teller server, the request went to the default-gateway Envoy proxy and the proxy routed the call to color-teller.

That’s the advantage of using the Envoy proxy. Envoy is a self-contained process that is designed to run in parallel with all application servers. All of the Envoy proxies form a transparent communication mesh through which each application sends and receives messages to and from localhost while remaining unaware of the broader network topology.

For App Mesh integration, the X-Ray tracer records the mesh name and virtual node name values and injects them into the segment JSON document. Here is an example:

"aws": {
    "app_mesh": {
        "mesh_name": "appmesh",
        "virtual_node_name": "colorgateway-vn"
    }
},

To enable X-Ray tracing through App Mesh inside Envoy, you must set two environment variable configurations:

  • ENABLE_ENVOY_XRAY_TRACING
  • XRAY_DAEMON_PORT

The first one enables X-Ray tracing using 127.0.0.1:2000 as the default daemon endpoint to which generated segments are sent. If the daemon you installed listens on a different port, you can specify a port value to override the default X-Ray daemon port by using the second configuration.
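For reference, here is a minimal boto3 sketch of registering an Envoy sidecar with those variables set. The family name, image URI, and virtual node name are placeholders, and in practice these values usually live in your task definition template rather than in application code:

import boto3

ecs = boto3.client("ecs")

# Hypothetical task definition fragment: only the Envoy sidecar is shown.
ecs.register_task_definition(
    family="colorgateway",
    containerDefinitions=[
        {
            "name": "envoy",
            "image": "<app-mesh-envoy-image-uri>",  # placeholder
            "essential": True,
            "memoryReservation": 256,
            "environment": [
                {"name": "APPMESH_VIRTUAL_NODE_NAME",
                 "value": "mesh/appmesh/virtualNode/colorgateway-vn"},
                # Enable the X-Ray tracing extension inside Envoy.
                {"name": "ENABLE_ENVOY_XRAY_TRACING", "value": "1"},
                # Only needed if the X-Ray daemon listens on a non-default port.
                {"name": "XRAY_DAEMON_PORT", "value": "2000"},
            ],
        },
    ],
)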

Conclusion

Currently, AWS X-Ray supports SDKs written in multiple languages (including Java, Python, Go, .NET, .NET Core, Node.js, and Ruby) to help you implement your services. For more information, see Getting Started with AWS X-Ray.
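As a quick illustration of the language SDKs (separate from the Envoy-level tracing described above), a minimal Python sketch might look like this; the service and segment names are arbitrary, and it assumes a local X-Ray daemon on the default 127.0.0.1:2000 endpoint:

import boto3
from aws_xray_sdk.core import xray_recorder, patch_all

# Point the SDK at the local daemon and auto-instrument supported clients such as boto3.
xray_recorder.configure(service="colorgateway-demo", daemon_address="127.0.0.1:2000")
patch_all()

@xray_recorder.capture("list_buckets")
def list_buckets():
    # Any AWS call made here is recorded as a subsegment.
    return [b["Name"] for b in boto3.client("s3").list_buckets()["Buckets"]]

segment = xray_recorder.begin_segment("demo-script")
try:
    print(list_buckets())
finally:
    xray_recorder.end_segment()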

from AWS Compute Blog

Upskill Your Team to Address the Cloud, Kubernetes Skills Gap


This article originally appeared on Forbes.

According to CareerBuilder’s Mid Year Job Forecast, 63% of U.S. employers planned to hire full-time, permanent workers in the second half of 2018. This growing demand coupled with low unemployment is driving a real talent shortage. The technology field, in particular, is experiencing acute pain when it comes to finding skilled talent. Indeed, more than five million IT jobs are expected to be added globally by 2027, reports BusinessInsider.

Of these five million jobs, the two most-requested tech skills, according to research by DICE, are Kubernetes and Terraform; the company also found that DevOps Engineer has quickly moved up the ranks of the top-paid IT careers. As companies invest in IT modernization with approaches like Agile and DevOps and technologies like cloud computing and containers, skills to support these initiatives are in increasing demand.

The problem is not set to get better in the near or mid-term with many companies reporting that it’s taking longer to find candidates with the right technology and business skills for driving digital innovation. A survey by OpsRamp found that 94% of HR departments take at least 30 days to fill an empty position and 25% report taking 90 days or more. With internal pressures for innovation that won’t wait out a protracted hiring process, I encourage leaders to look internally, using two key levers to help grow innovation.

Upskill Your Team

One way to work around a skills gap within the organization is to upskill the team. Rather than hiring a new headcount that is already difficult to find, a solution is to train your existing team. (Or a few members of the team who can in turn train others.) While there are a variety of training options — from classroom training to virtual classes and more — at Flux7, our experience has shown that hands-on training works best for technical skills like Terraform or Kubernetes. 

Specifically, a successful model consists of the following:

  • Find a coach that can work hand-in-hand with your team
  • Identify a small but impactful project for the coach and team to work on together with the goal of having the coach train the team along the way
  • Start the project with the coach taking the initial lead sharing what they are doing, why and how with your team shadowing
  • Slowly transition over the course of the project to the coach assigning tasks to your team, with your employees ultimately leading tasks and checking in with the coach as needed.


In this way, teams are able to learn in a practical, hands-on manner, taking ownership of the environment as they learn and grow — all while having access to an expert who can guide, correct and reinforce learning.

In addition to gaining much-needed skills in-house, upskilling your existing team has retention benefits. In a survey of tech professionals by DICE, 71% said that training and education are important to them, yet only 40% currently have company-paid training and education. Underscoring the importance of training to technologists, 45% who are satisfied with their job receive training; conversely, only 28% of those who are dissatisfied with their job receive training.

Grow Productivity with Automation

In addition to upskilling your team, automation is important to continue to expand your capacity. Approaches like DevOps embrace the use of automation to create continuous integration and delivery, in the process reducing handoffs and speeding time to market. In addition, the use of automation can keep employees from working on tactical, repeatable tasks and instead keep them focused on strategic, business-impacting work.

Let me give you an example. I recently had the opportunity to work with a large semiconductor company that sought to bolster its team’s cloud, container, and Kubernetes talents in order to support a new AWS initiative. Working hands-on in the cloud to automate its pipelines and other processes, the company was able to reduce tasks that formerly took days to mere minutes.

In addition to working elbow-to-elbow with a cloud coach on the project, the company also initiated weekly knowledge transfer sessions to ensure everyone had received the same level of training and was ready for the next week’s work. At the end of the project, the team was ready to train others in the organization and felt confident that they were building better products faster, as their time was focused less on tactical work and more on making a strategic impact. Another benefit to the team, and the company as a whole, is that by taking a cross-functional DevOps approach, employees felt that communication improved, making their work more enjoyable.

In a recent poll of over 70,000 developers, HackerRank found that salary wasn’t the lead driver of what they look for in a job. Rather, the most important factors for developers, across all job levels and functions, were the opportunity for professional growth and the opportunity to work on interesting problems. The application of automation not only increases developer productivity and code throughput but also provides the space to work on interesting projects, which leads to greater job satisfaction and retention.

With competition growing for employees skilled in Kubernetes, Terraform, DevOps and more, growing your own is an increasingly attractive approach. UC Berkeley found that the average cost to hire a new professional employee may be as high as $7,000 (while replacement costs can be as great as 2.5x salary), not to mention the lost opportunity cost as organizations place projects on hold while they vie for skilled talent. Upskilling employees, combined with greater automation, can increase code throughput and get more projects to market faster, maximizing near-term opportunity. Just as importantly, presenting employees with new skills and the opportunity to work on interesting projects has proven to increase job satisfaction and retention.

Learn more about addressing the skills gap, building cloud-native infrastructure, and more on the Flux7 DevOps blog.


from Flux7 DevOps Blog