Tag: Hashicorp Blog

Using Terraform Cloud Remote State Management

We recently announced Terraform 0.12 and Terraform Cloud Remote State Management. Both these releases provide Terraform users a better experience writing and collaborating on Infrastructure as Code. This blog post will look at some motivations for using Terraform Cloud and describe how it works.

What is Terraform State?

To explain the value of Terraform Cloud, it’s important to understand the concept of state in Terraform. Terraform uses state to map your Terraform code to the real-world resources that it provisions. For example, you could use the following code to create an AWS EC2 instance:

```hcl
resource "aws_instance" "web" {
  ami           = "ami-e6d9d68c"
  instance_type = "t2.micro"
}
```

When you run terraform apply on this configuration file, Terraform will make an API call to AWS to create an EC2 instance, and AWS will return the unique ID of that instance (e.g. i-0ad17607e5ee026d0). Terraform needs to record that ID somewhere so that it can later make API calls to change or delete that instance.

To store this information, Terraform uses a state file. For the above code, the state file will look something like:

```json
{
  ...
  "resources": {
    "aws_instance.web": {
      "type": "aws_instance",
      "primary": {
        "id": "i-0ad17607e5ee026d0",
        ...
      }
    }
  }
}
```

Here you can see that the resource aws_instance.web from the Terraform code is mapped to the instance ID i-0ad17607e5ee026d0.

Remote State

By default, Terraform writes its state file to your local filesystem. This works well for personal projects, but once you start working with a team, things get more challenging. In a team, you need to make sure everyone has an up-to-date version of the state file and ensure that two people aren’t making concurrent changes.

Remote state solves those challenges. Remote state simply means storing the state file remotely rather than on your local filesystem. With a single state file stored remotely, teams can ensure they always have the most up-to-date state file. With remote state, Terraform can also lock the state file while changes are being made. This ensures all changes are captured, even if concurrent changes are being attempted.

Configuring remote state in Terraform has always been an involved process. For example, you can store state in an S3 bucket, but you need to create the bucket, properly configure it, set up permissions, create a DynamoDB table for locking, and then ensure everyone has proper credentials to write to it.
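
To make that concrete, here is a minimal sketch of what an S3 backend configuration looks like once those pieces already exist; the bucket, key, table, and region names are placeholders:

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"        # pre-created S3 bucket
    key            = "my-app/terraform.tfstate"  # path of the state file within the bucket
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"           # pre-created DynamoDB table used for state locking
  }
}
```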

As a result, setting up remote state can be a stumbling block as teams adopt Terraform.

Easy Remote State Set Up with Terraform Cloud

Unlike other remote state solutions that require complicated setup, Terraform Cloud offers an easy way to get started with remote state:

  • Step 0 — Sign up for a Terraform Cloud account here

  • Step 1 — An email will be sent to you; follow the link to activate your free Terraform Cloud account.

  • Step 2 — When you log in, you’ll land on a page where you can create your organization or join an existing one if invited by a colleague.

  • Step 3 — Next, go into User Settings and generate a token.

  • Step 4 — Take this token and create a local ~/.terraformrc file:


```hcl
credentials "app.terraform.io" {
  token = "mhVn15hHLylFvQ.atlasv1.jAH..."
}
```

  • Step 5 — Configure Terraform Cloud as your backend

In your Terraform project, add a terraform block to configure your backend:


```hcl
terraform {
  backend "remote" {
    organization = "my-org" # org name from step 2.

    workspaces {
      name = "my-app" # name for your app's state.
    }
  }
}
```

  • Step 6 — Run terraform init and you’re done.

Your state is now being stored in Terraform Cloud. You can see the state in the UI:

Fully Featured State Viewer

Terraform Cloud offers a fully featured state viewer to gain insight into the state of your infrastructure:

The state viewer maintains versions of your Terraform state, allowing you to download an old version if needed. It also provides audit logs so you know who changed what and when.

You can view the full state file at each point in time:

You can also see the diff of what changed:

Manual Locking

Terraform Cloud also includes the ability to manually lock your state. This is useful if you’re making large changes to your infrastructure and you want to prevent coworkers from modifying that infrastructure while you’re in the middle of your work.

You can lock and unlock states directly in the UI:

While the state is locked, Terraform operations will receive an error:

Conclusion

We’re pleased to offer Remote State Management with Terraform Cloud free to our users. Sign up for an account here: https://app.terraform.io/signup

from Hashicorp Blog

HashiCorp Nomad 0.9.2

We are pleased to announce the availability of HashiCorp Nomad 0.9.2.

Nomad is a flexible workload orchestrator that can be used to easily deploy both containerized and legacy applications across multiple regions or cloud providers. Nomad is easy to operate and scale, and integrates seamlessly with HashiCorp Consul for service discovery and HashiCorp Vault for secrets management.

Nomad CVE-2019-12618

Nomad 0.9.2 addresses a privilege escalation vulnerability that enables the exec task driver to run with full Linux capabilities such that processes can escalate to run as the root user. This vulnerability exists in Nomad versions 0.9 and 0.9.1. Other task drivers including the Docker task driver are unaffected. See the official announcement for more details.

Nomad 0.9.2

Nomad 0.9.2 builds upon the work done in Nomad 0.9, with features that enhance the debuggability of running tasks, as well as allocation lifecycle management commands and deployment enhancements. Nomad 0.9.2 also includes an Enterprise feature – preemption capabilities for service and batch jobs.

The new features in Nomad 0.9.2 include:

  • Alloc exec: Run commands inside a running allocation. Use cases include inspecting container state and debugging a failed application without needing SSH access to the node that's running the allocation.
  • Alloc restart: Enables performing an in-place restart of an entire allocation or individual task. Allocations and tasks can be restarted from the Nomad UI as well.
  • Alloc stop: Enables stopping an entire allocation. Stopped allocations are rescheduled elsewhere in the cluster. Allocations can be stopped from the Nomad UI as well.
  • Alloc signal: Enables sending a signal to all tasks in an allocation, or signalling an individual task within an allocation.
  • Canary Auto-promotion: The update stanza now includes a new auto_promote flag that causes deployments to automatically promote themselves when all canaries become healthy (see the sketch after this list).
  • Preemption for Service/Batch jobs: Nomad Enterprise adds preemption capabilities to service and batch jobs.
  • Preemption Web UI Visibility: Preemption status is shown in various pages in the UI such as the allocation list page and the job status page.
  • UI Search: The Nomad UI now supports filtering jobs by type/prefix/status/datacenter, as well as searching clients by class, status, datacenter, or eligibility flags.
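
As a rough sketch of the auto_promote flag mentioned in the Canary Auto-promotion item above, a group-level update stanza might look like the following; the job name, image, and counts are illustrative only:

```hcl
job "my-app" {
  datacenters = ["dc1"]

  group "web" {
    count = 3

    update {
      max_parallel = 1
      canary       = 1
      auto_promote = true # promote the deployment automatically once all canaries are healthy
    }

    task "web" {
      driver = "docker"

      config {
        image = "nginx:1.15"
      }
    }
  }
}
```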

This release includes a number of bug fixes as well as improvements to the Web UI, the system scheduler, the CLI, and other Nomad components. The CHANGELOG provides a full list of Nomad 0.9 features, enhancements, and bug fixes.

Conclusion

We are excited to share this release with our users. Visit the Nomad website to learn more.

from Hashicorp Blog

Using the Kubernetes and Helm Providers with Terraform 0.12

With Terraform 0.12 generally available, new configuration language improvements allow additional templating of Kubernetes resources. In this post, we will demonstrate how to use Terraform 0.12, the Kubernetes Provider, and the Helm provider for configuration and deployment of Kubernetes resources.

The following examples demonstrate the use of Terraform providers to deploy additional services and functions for supporting applications:

  • ExternalDNS deployment, to set up DNS aliases for services or ingresses.
  • Fluentd daemonset, for sending application logs.
  • Consul Helm chart, for service mesh and application configuration.

Deployment of these services happens after creating the infrastructure and Kubernetes cluster with a Terraform cloud provider.

ExternalDNS Deployment

A Kubernetes deployment maintains the desired number of application pods. In this example, we create a Kubernetes deployment with Terraform that will interpolate identifiers and attributes from resources created by the cloud provider. This alleviates the need for separate or additional automation to retrieve attributes such as hosted zone identifiers, domain names, and CIDR blocks.

We can use ExternalDNS to create a DNS record for a service upon creation or update. ExternalDNS runs in Kubernetes as a deployment. First, we translate the Kubernetes deployment configuration file for ExternalDNS to Terraform’s configuration language (called HCL). This allows Terraform to display the differences in each section as changes are applied. The code below shows the Terraform kubernetes_deployment resource to create ExternalDNS.

```hcl
locals {
  name = "external-dns"
}

resource "aws_route53_zone" "dev" {
  name = "dev.${var.domain}"
}

resource "kubernetes_deployment" "external_dns" {
  metadata {
    name      = local.name
    namespace = var.namespace
  }

  spec {
    selector {
      match_labels = {
        app = local.name
      }
    }

    template {
      metadata {
        labels = {
          app = local.name
        }
      }

      spec {
        container {
          name  = local.name
          image = var.image
          args = concat([
            "--source=service",
            "--source=ingress",
            "--domain-filter=${aws_route53_zone.dev.name}",
            "--provider=${var.cloud_provider}",
            "--policy=upsert-only",
            "--registry=txt",
            "--txt-owner-id=${aws_route53_zone.dev.zone_id}"
          ], var.other_provider_options)
        }

        service_account_name = local.name
      }
    }

    strategy {
      type = "Recreate"
    }
  }
}
```

Note that we use Terraform 0.12 first-class expressions, such as var.namespace or local.name, without the need for variable interpolation syntax. Furthermore, we reference the hosted zone resource we created with aws_route53_zone. The dynamic reference to the AWS resource removes the need to separately extract and inject the attributes into a Kubernetes manifest.

Kubernetes DaemonSets

To collect application logs, we can deploy Fluentd as a Kubernetes daemonset. Fluentd collects, structures, and forwards logs to a logging server for aggregation. Each Kubernetes node must have an instance of Fluentd. A Kubernetes daemonset ensures a pod is running on each node. In the following example, we configure the Fluentd daemonset to use Elasticsearch as the logging server.

Configuring Fluentd to target a logging server requires a number of environment variables, including ports, hostnames, and usernames. In versions of Terraform prior to 0.12, we duplicated blocks such as volume or env and added different parameters to each one. The excerpt below demonstrates the Terraform version <0.12 configuration for the Fluentd daemonset.

```hcl
resource "kubernetes_daemonset" "fluentd" {
  metadata {
    name = "fluentd"
  }

  spec {
    template {
      spec {
        container {
          name  = "fluentd"
          image = "fluent/fluentd-kubernetes-daemonset:elasticsearch"

          env {
            name  = "FLUENT_ELASTICSEARCH_HOST"
            value = "elasticsearch-logging"
          }

          env {
            name  = "FLUENT_ELASTICSEARCH_PORT"
            value = "9200"
          }

          env {
            name  = "FLUENT_ELASTICSEARCH_SCHEME"
            value = "http"
          }

          env {
            name  = "FLUENT_ELASTICSEARCH_USER"
            value = "elastic"
          }

          env {
            name  = "FLUENT_ELASTICSEARCH_PASSWORD"
            value = "changeme"
          }
        }
      }
    }
  }
}
```

Using Terraform 0.12 dynamic blocks, we can specify a list of environment variables and use a for_each loop to create each env child block in the daemonset.

```hcl
locals {
  name = "fluentd"

  labels = {
    k8s-app = "fluentd-logging"
    version = "v1"
  }

  env_variables = {
    "HOST" : "elasticsearch-logging",
    "PORT" : var.port,
    "SCHEME" : "http",
    "USER" : var.user,
    "PASSWORD" : var.password
  }
}

resource "kubernetes_daemonset" "fluentd" {
  metadata {
    name      = local.name
    namespace = var.namespace

    labels = local.labels
  }

  spec {
    selector {
      match_labels = {
        k8s-app = local.labels.k8s-app
      }
    }

    template {
      metadata {
        labels = local.labels
      }

      spec {
        volume {
          name = "varlog"

          host_path {
            path = "/var/log"
          }
        }

        volume {
          name = "varlibdockercontainers"

          host_path {
            path = "/var/lib/docker/containers"
          }
        }

        container {
          name  = local.name
          image = var.image

          dynamic "env" {
            for_each = local.env_variables
            content {
              name  = "FLUENT_ELASTICSEARCH_${env.key}"
              value = env.value
            }
          }

          resources {
            limits {
              memory = "200Mi"
            }

            requests {
              cpu    = "100m"
              memory = "200Mi"
            }
          }

          volume_mount {
            name       = "varlog"
            mount_path = "/var/log"
          }

          volume_mount {
            name       = "varlibdockercontainers"
            read_only  = true
            mount_path = "/var/lib/docker/containers"
          }
        }

        termination_grace_period_seconds = 30
        service_account_name             = local.name
      }
    }
  }
}
```

In this example, we specify a map with a key and value for each environment variable. The dynamic "env" block iterates over each entry in the map, retrieves the key and value, and creates an env child block. This minimizes duplication in the configuration and allows any number of environment variables to be added or removed.

Managing Helm Charts via Terraform

For services packaged with Helm, we can also use Terraform to deploy charts and run tests. Helm provides application definitions in the form of charts. Services or applications often have official charts for streamlining deployment. For example, we might want to use Consul, a service mesh that provides a key-value store, to connect applications and manage configuration in our Kubernetes cluster.

We can use the official Consul Helm chart, which packages the necessary Consul application definitions for deployment. When using Helm directly, we would first deploy a component called Tiller for version 2 of Helm. Then, we would store the Consul chart locally, deploy the chart with helm install, and test the deployment with helm test.

When using the Terraform Helm provider, the provider handles deployment of Tiller, installation of a Consul cluster via the chart, and triggering of acceptance tests. First, we include the install_tiller option with the Helm provider.

```hcl
provider "helm" {
  version        = "~> 0.9"
  install_tiller = true
}
```

Next, we use the Terraform helm_release resource to deploy the chart. We pass the variables to the Helm chart with set blocks. We also include a provisioner to run a set of acceptance tests after deployment, using helm test. The acceptance tests confirm if Consul is ready for use.

```hcl
resource "helm_release" "consul" {
  name      = var.name
  chart     = "${path.module}/consul-helm"
  namespace = var.namespace

  set {
    name  = "server.replicas"
    value = var.replicas
  }

  set {
    name  = "server.bootstrapExpect"
    value = var.replicas
  }

  set {
    name  = "server.connect"
    value = true
  }

  provisioner "local-exec" {
    command = "helm test ${var.name}"
  }
}
```

When we run terraform apply, Terraform deploys the Helm release and runs the tests. By using Terraform to deploy the Helm release, we can pass attributes from infrastructure resources to the curated application definition in Helm and run available acceptance tests in a single, common workflow.
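
As an illustrative sketch of that idea, an attribute from an infrastructure resource can be fed straight into a set block; the zone and the chart value name below are hypothetical and not part of the official Consul chart interface:

```hcl
# Hypothetical example: pass an attribute of an infrastructure resource
# (a Route 53 zone) into a Helm release value in the same workflow.
resource "aws_route53_zone" "apps" {
  name = "apps.example.com"
}

resource "helm_release" "consul" {
  name      = "consul"
  chart     = "${path.module}/consul-helm"
  namespace = "default"

  set {
    name  = "ui.ingress.hostname" # hypothetical chart value
    value = "consul.${aws_route53_zone.apps.name}"
  }
}
```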

Conclusion

We can use Terraform to not only manage and create Kubernetes clusters but also create resources on clusters with the Kubernetes API or Helm. We examined how to interpolate resource identifiers and attributes from infrastructure resources into Kubernetes services, such as ExternalDNS. Furthermore, we used improvements in Terraform 0.12 to minimize configuration and deploy a Fluentd daemonset. Finally, we deployed and tested Consul using the Terraform Helm provider.

Leveraging this combination of providers allows users to seamlessly pass attributes from infrastructure to Kubernetes clusters and minimize additional automation to retrieve them. For more information about Terraform 0.12 and its improvements, see our blog announcing Terraform 0.12. To learn more about providers, see the Kubernetes provider reference and the Helm provider reference.

from Hashicorp Blog

Introducing the HashiCorp Community Portal

Today we’re announcing the HashiCorp Community Portal, a single source for all members of our community to learn, share, and discuss HashiCorp products.

The community portal includes a new discussion forum, video content, product guides, local in-person events (HashiCorp User Groups), and more. We recognize that individuals want to engage with the community in different ways at different moments and the community portal is meant to reflect this.

Effective today, we will no longer officially be endorsing the Gitter channels for our open source projects. We will also be changing the nature of our Google Groups to focus on outbound communication and redirecting the inbound communication to the forum. Please read more to learn why we’ve made this decision and recommended steps for the community.

A Unified Community

The HashiCorp Community Portal and new HashiCorp discussion forum are a single shared resource for a unified community rather than a per-product website.

HashiCorp builds multiple separate open source projects that each have millions of downloads. While the projects are separate, the philosophy and vision are unified, and we’ve found that most of our community shares common goals or uses multiple products. Having a single destination for the community reflects this.

We still have separate categories for each product so you can easily find Terraform-focused content, Vault-focused content, etc. But having it in a single location will make it easier to find cross-product content such as “Using Terraform with Vault” or just learning more about an adjacent topic.

Evolving Our Approach

Today we introduced a new discussion forum. Alongside this launch, we will be shutting down our Gitter channels and will no longer provide any official real-time discussion. There are many community-run options such as Slack channels, and we encourage our community to join those.

We love talking with our community in real time. We want to meet you, learn about your use cases, help you, hear your great ideas, and more. But we found that real-time channels introduced unhealthy expectations and demands on many people.

Real-time chat often sets the expectation that you’ll receive 1:1 help at a moment’s notice. This puts a difficult burden on us and members of our community to be constantly present and checking channels in order to respond, which introduces a lot of context switching and takes away from our ability to focus on fixing bugs or otherwise improving the products.

Additionally, chat is often poorly search-indexed, if indexed at all. This means that any help or discussion in a chat system provides only short-term benefit before being lost forever. We found ourselves repeating the answers to the same questions many times in chat systems without being able to link back to a concise historical solution.

Official chat is therefore moving to our new discussion forums. The forums are staffed with HashiCorp employees and give us a better long-form way to answer questions and link back to them.

Next

We’re thankful for the active and friendly community we have and are looking forward to providing more programs and resources to foster that community. Launching the HashiCorp Community Portal is the first step and gives us a central source to make future changes.

Please visit the new HashiCorp Community Portal and discussion forums today!

from Hashicorp Blog

HashiCorp at DevOps Barcelona

DevOps Barcelona is coming up and we are pleased to sponsor this community-organized conference.

There is an excellent speaker lineup and among them is HashiCorp Developer Advocate, Nic Jackson. Nic will give a demo-driven talk entitled Managing Failure in a Distributed World.

In his talk, he will walk through the areas of complexity in a system and showcase the patterns that can help adopt a distributed architecture for applications with a focus on migrating from a monolithic architecture to a distributed one.

The reality is that a single machine capable of delivering the performance and availability users demand from your systems does not exist, so you have to be creative in partitioning workload over multiple machines, geographic regions, and failure domains. This approach introduces complexity to systems, so expect failure and design accordingly.

Join us

In addition to his conference talk, Nic will be speaking at a HashiCorp User Group Meetup before DevOps Barcelona kicks off on Tuesday 4 June.

Nic’s talk is titled Securing Cloud Native Communication, From End User to Service, and key takeaways include:
* An understanding of the "three pillars" of service mesh functionality: observability, reliability, and security. A service mesh is in an ideal place to enforce security features like mTLS.
* Learn how to ensure that there are no exploitable "gaps" within the end-to-end/user-to-service communication path.
* Explore the differences in ingress/mesh control planes, with brief demonstrations using Ambassador and Consul Connect.

Several HashiCorp engineers and community team members will be in attendance at the Meetup and the conference. We hope to connect with you there!

from Hashicorp Blog

Layer 7 Observability with Consul Service Mesh

This is the second post of the blog series highlighting new features in Consul service mesh.

Introduction

You’ve probably heard the term “observability” before, but what does it actually mean? Is it just monitoring re-branded, or is there more to observability than that?

We are publishing a series of blog posts to discuss the core use cases of service mesh. In this blog we will take a closer look at observability and how to enable the new L7 observability features of Consul Connect that are included in the recent Consul 1.5 release.

To get started, let’s revisit a familiar concept: monitoring.

Monitoring

Monitoring means instrumenting your applications and systems with internal or external tools to determine their state.

For example, you may have an external health check that probes an application’s state or determines its current resource consumption. You may also have internal statistics which report the performance of a particular block of code, or how long it takes to perform a certain database transaction.

Observability

Observability comes from the world of engineering and control theory. Control theory states that observability is a measure of “how well internal states of a system can be inferred from knowledge of external outputs”. In contrast to monitoring, which is something you do, observability is a property of a system. A system is observable if the external outputs, logging, metrics, tracing, health checks, etc., allow you to understand its internal state.

Observability is especially important for modern, distributed applications with frequent releases. Compared to a monolithic architecture where components communicate through in-process calls, microservice architectures have more failures during service interactions because these calls happen over potentially unreliable networks. And with it becoming increasingly difficult to create realistic production-like environments for testing, it becomes more important to detect issues in production before customers do. A view into those service calls helps teams detect failures early, track them and engineer for resiliency.

With modular and independently deployable (micro)services, visibility into these services can be hard to achieve. A single user request can flow through a number of services that are each independently developed and deployed by different teams. Since it’s impossible to predict every potential failure or problem that can occur in a system, you need to build systems that are easy to debug once deployed. Insight into the network is essential to understand the flow and performance of these highly distributed systems.

Service Mesh

A service mesh is a networking infrastructure that leverages “sidecar” proxies for microservice deployments. Since the sidecar proxy is present at every network hop, it captures both upstream and downstream communication. Consequently, a service mesh provides complete visibility into the external performance of all the services.

One of the key benefits of adopting a service mesh is that the fleet of sidecar proxies have complete visibility of all service traffic and can expose metrics in a consistent way, irrespective of different programming languages and frameworks. Applications still need to be instrumented in order to gain insight into internal application performance.

Control Plane

A service mesh is traditionally built from two main components: the control plane and the data plane. The control plane provides policy and configuration for all of the running data planes in the mesh. The data plane is typically a local proxy which runs as a sidecar to your application. The data plane terminates all TLS connections and manages authorisation for requests against the policy and service graph in the control plane. Consul forms the control plane of the service mesh, which simplifies the configuration of sidecar proxies for secure traffic communication and metrics collection.

Consul is built to support a variety of proxies as sidecars, and currently has documented, first-class support for Envoy, chosen for its lightweight footprint and observability support.

Consul UI showing the Envoy sidecar proxy and its upstream services

Consul 1.5 introduced the ability to configure metrics collection for all of the Envoy proxies in Consul Connect at once, using the consul connect envoy command. During a new discovery phase, this command fetches a centrally stored proxy configuration from the local Consul agent, and uses its values to bootstrap the Envoy proxies.

When configuring the Envoy bootstrap through Consul Connect, there is support for a few different levels of customization. The higher level configuration is the simplest to configure and covers everything necessary to get metrics out of Envoy.

The centralized configuration can be created by writing a configuration file, e.g.:

```hcl
kind = "proxy-defaults"
name = "global"

config {
  # (dog)statsd listener on either UDP or Unix socket.
  # envoy_statsd_url = "udp://127.0.0.1:9125"
  envoy_dogstatsd_url = "udp://127.0.0.1:9125"

  # IP:port to expose the /metrics endpoint on for scraping.
  # prometheus_bind_addr = "0.0.0.0:9102"

  # The flush interval in seconds.
  envoy_stats_flush_interval = 10
}
```

This configuration can be written to Consul using the consul config write <filename> command.

The config section in the above file enables metrics collection by telling Envoy where to send the metrics. Currently, Consul Connect supports the following metric output formats through centralized configuration:

  • StatsD: a network protocol that allow clients to report metrics, like counters and timers
  • DogStatsD: an extension of the StatsD protocol which supports histograms and the tagging of metrics
  • Prometheus: exposes an endpoint that Prometheus can scrape for metrics

The DogStatsD sink is preferred over StatsD as it allows tagging of metrics, which is essential to be able to filter them correctly in Grafana. The Prometheus endpoint will be a good option for most users once Envoy 1.10 is supported and histograms are emitted.

Consul will use the configuration to generate the bootstrap configuration that Envoy needs to set up the proxy and configure the appropriate stats sinks. Once the Envoy proxy is bootstrapped it will start emitting metrics. You can capture these metrics in a time-series store such as Prometheus and query them in a tool like Grafana, or send them to a managed monitoring solution. Below is an example of a Prometheus query you can write against the resulting metrics, which takes all the request times to the upstream "emojify-api" cluster and then groups them by quantile.

```
# The response times of the emojify-api upstream,
# categorized by quantile
sum(envoy_cluster_upstream_rq_time{envoy_cluster_name="emojify-api"} > 0) by (quantile)
```

Resulting graph showing the request time quantiles

Envoy emits a large number of statistics depending on how it is configured. In general there are three categories of statistics:

  • Downstream statistics related to incoming connections/requests.
  • Upstream statistics related to outgoing connections/requests.
  • Server statistics describing how the Envoy server instance is performing.

The statistics are formatted like envoy.<category>(.<subcategory>).metric and some of the categories that we are interested in are:

  • Cluster: a group of logically similar upstream hosts that Envoy connects to.
  • Listener: a named network location, like a port or unix socket, that can be connected to by downstream clients.
  • TCP: metrics such as connections, throughput, etc.
  • HTTP: metrics about HTTP and HTTP/2 connections and requests.

Grafana dashboard containing Envoy metrics

L7 Observability

By default, Envoy proxies connections at L4, or the TCP layer. While that may be useful, it doesn't include important protocol-specific information like request rates and the response codes needed to indicate errors.

For example, with L4 you will see the number of connections and bytes sent and received, but a failure is only going to be reported if a connection is terminated unexpectedly. When your APIs or websites are reporting failures, they will generally respond with protocol-specific error messages while keeping the TCP connection alive or closing it gracefully. For example, an HTTP service's response carries with it a status code which indicates the nature of the response. You will return a status 200 when a request is successful, a 404 if something is not found, and a 5xx when the service has an unexpected error. Envoy can be configured to record which class each response's status falls into, allowing you to monitor error rates.

Another emerging protocol being used for communication between services is gRPC, which uses HTTP/2 for transport, and Protocol Buffers as an interface definition and serialisation format, to execute remote procedure calls. When configuring Envoy for GRPC, the metrics emitted will provide you with the functions called and the resulting statuses of those calls.

Monitoring these codes is essential to understanding your application; however, you need to enable some additional configuration in Envoy so that it understands that your app is talking L7.

You can specify the protocol of the service by setting the service defaults in a config file (see an example below).

```hcl
kind     = "service-defaults"
name     = "emojify-api"
protocol = "http"
```

And then write it to the centralized configuration with the consul config write <filename> command.

If the protocol is “http”, "http2" or “grpc”, it will cause the listener to emit L7 metrics. When bootstrapping the Envoy proxy, Consul will try to resolve the protocol for an upstream from the service it is referencing. If it is defined, there is no need to specify the protocol on the upstream.

Once the protocol fields of the proxy and upstreams are specified or discovered through Consul, Envoy will configure the clusters to emit additional L7 metrics, the HTTP category and HTTP/GRPC subcategories of metrics.

The emojify-cache and emojify-facebox clusters are emitting response codes with their metrics

Once you get L7 metrics in Grafana, you can start to correlate events more precisely and see how failures in the system bubble up.

For example, if the emojify-api upstream starts to return 5xx response codes, you can look at the calls to the emojify-cache service and see if the Get calls are failing as well.

```
# Number of requests to the emojify-api upstream,
# categorized by resulting response code
sum(increase(envoy_cluster_upstream_rq_xx{envoy_cluster_name="emojify-api"}[30s])) by (envoy_response_code_class)

# Number of retry attempts to the emojify-api upstream
sum(increase(envoy_cluster_upstream_rq_retry{envoy_cluster_name="emojify-api"}[30s]))
```

Resulting graph showing the number of requests and retries

```
# Number of GRPC calls to the emojify-cache upstream,
# categorized by function called
sum(increase(envoy_cluster_grpc_0{envoy_cluster_name="emojify-cache"}[30s])) by (envoy_grpc_bridge_method)
```

Resulting graph showing the GRPC functions and their call count

You can get much better observability over your systems by using distributed tracing. This requires some cooperation from applications to instigate the tracing and propagate trace context through service calls. The service mesh can be configured to integrate and add spans to traces to give insight into the time spent in the proxy. This can be provided through the envoy_tracing_json field, which accepts an Envoy tracing config in JSON format.
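
As a rough sketch, the tracing configuration could be added to the same centralized proxy-defaults entry; the Zipkin driver name, collector cluster, and endpoint below are assumptions, and the collector cluster itself would still need to be defined separately:

```hcl
kind = "proxy-defaults"
name = "global"

config {
  # envoy_tracing_json accepts an Envoy tracing configuration as a JSON string.
  # The driver and collector details below are illustrative assumptions.
  envoy_tracing_json = <<EOF
{
  "http": {
    "name": "envoy.zipkin",
    "config": {
      "collector_cluster": "zipkin",
      "collector_endpoint": "/api/v1/spans"
    }
  }
}
EOF
}
```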

Summary

By using the centralized configuration you can configure metrics collection for all your services at the same time and in a central location. L7 metrics give you deeper insight into the behavior and performance of your services.

The L7 observability features described here were released in Consul 1.5. If you want to try out the new features yourself, this demo provides a guided, no-install playground to experiment. If you want to learn more about L7 observability for Consul Connect on Kubernetes, take a look at the HashiCorp Learn guide on the subject.

from Hashicorp Blog

Announcing Terraform 0.12

We are very proud to announce the release of Terraform 0.12.

Terraform 0.12 is a major update that includes dozens of improvements and features spanning the breadth and depth of Terraform's functionality.

Some highlights of this release include:
* First-class expression syntax: express references and expressions directly rather than using string interpolation syntax.
* Generalized type system: use lists and maps more freely, and use resources as object values.
* Iteration constructs: transform and filter one collection into another collection, and generate nested configuration blocks from collections.
* Structural rendering of plans: plan output now looks more like configuration making it easier to understand.
* Context-rich error messages: error messages now include a highlighted snippet of configuration and often suggest exactly what needs to be changed to resolve them.

The full release changelog can be found here.

Here is an example of a Terraform configuration showing some new language features:

```hcl
data "consul_key_prefix" "environment" {
  path = "apps/example/env"
}

resource "aws_elastic_beanstalk_environment" "example" {
  name        = "test_environment"
  application = "testing"

  setting {
    namespace = "aws:autoscaling:asg"
    name      = "MinSize"
    value     = "1"
  }

  dynamic "setting" {
    for_each = data.consul_key_prefix.environment.var
    content {
      namespace = "aws:elasticbeanstalk:application:environment"
      name      = setting.key
      value     = setting.value
    }
  }
}

output "environment" {
  value = {
    id = aws_elastic_beanstalk_environment.example.id
    vpc_settings = {
      for s in aws_elastic_beanstalk_environment.example.all_settings :
      s.name => s.value
      if s.namespace == "aws:ec2:vpc"
    }
  }
}
```

Getting Started

We have many resources available to help new and existing users learn more about the new functionality in 0.12.

To get started using 0.12:

  • Download the Terraform 0.12 release.
  • If you are upgrading from a previous release, read the upgrade guide to learn about the required upgrade steps.

First-class Expression Syntax

Terraform uses expressions to propagate results from one resource into the configuration of another resource, and references within expressions create the dependency graph that Terraform uses to determine the order of operations during the apply step.

Prior versions of Terraform required all non-literal expressions to be included as interpolation sequences inside strings, such as "${azurerm_shared_image.image_definition_ubuntu.location}". Terraform 0.12 allows expressions to be used directly in any situation where a value is expected.

The following example shows syntax from prior Terraform versions:

```hcl
variable "base_network_cidr" {
  default = "10.0.0.0/8"
}

resource "google_compute_network" "example" {
  name                    = "test-network"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "example" {
  count = 4

  name          = "test-subnetwork"
  ip_cidr_range = "${cidrsubnet(var.base_network_cidr, 4, count.index)}"
  region        = "us-central1"
  network       = "${google_compute_network.example.self_link}"
}
```

In Terraform 0.12, the expressions can be given directly:

```hcl
variable "base_network_cidr" {
  default = "10.0.0.0/8"
}

resource "google_compute_network" "example" {
  name                    = "test-network"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "example" {
  count = 4

  name          = "test-subnetwork"
  ip_cidr_range = cidrsubnet(var.base_network_cidr, 4, count.index)
  region        = "us-central1"
  network       = google_compute_network.example.self_link
}
```

The difference is subtle in this simple example, but as expressions and configurations get more complex, this cleaner syntax will improve readability by focusing on what is important.
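
As a small illustrative example (the variable names here are made up for the sketch), more involved expressions such as conditionals, function calls, and for expressions can now be written directly as values:

```hcl
variable "high_availability" {
  type    = bool
  default = true
}

variable "base_network_cidr" {
  default = "10.0.0.0/8"
}

locals {
  subnet_count = var.high_availability ? 3 : 1

  # Compute one CIDR block per subnet directly, with no interpolation wrappers.
  subnet_cidrs = [
    for i in range(local.subnet_count) : cidrsubnet(var.base_network_cidr, 4, i)
  ]
}
```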

For more information on the Terraform 0.12 expression syntax, see Expressions.

Generalized Type System

Terraform was originally focused on working just with strings. Although better support for data structures such as lists and maps was introduced in subsequent versions, many of the initial language features did not work well with them, making data structures frustrating to use.

One case where this was particularly pronounced was when using module composition patterns, where objects created by one module would need to be passed to another module. If one module creates an AWS VPC and some subnets, and another module depends on those resources, we would previously need to pass all of the necessary attributes as separate output values and input variables:

```hcl
module "network" {
  source = "./modules/network"

  base_network_cidr = "10.0.0.0/8"
}

module "consul_cluster" {
  source = "./modules/aws-consul-cluster"

  vpc_id         = module.network.vpc_id
  vpc_cidr_block = module.network.vpc_cidr_block
  subnet_ids     = module.network.subnet_ids
}
```

Terraform 0.12's generalized type system makes composition more convenient by giving more options for passing objects and other values between modules. For example, the "network" module could instead be written to return the whole VPC object and a list of subnet objects, allowing them to be passed as a whole:

```hcl
module "network" {
  source = "./modules/network"

  base_network_cidr = "10.0.0.0/8"
}

module "consul_cluster" {
  source = "./modules/aws-consul-cluster"

  vpc     = module.network.vpc
  subnets = module.network.subnets
}
```

Alternatively, if two modules are more tightly coupled to one another, you might choose to just pass the whole source module itself:

```hcl
module "network" {
  source = "./modules/network"

  base_network_cidr = "10.0.0.0/8"
}

module "consul_cluster" {
  source = "./modules/aws-consul-cluster"

  network = module.network
}
```

This capability relies on the ability to specify complex types for input variables in modules. For example, the "network" variable in the aws-consul-cluster module might be declared like this:

```hcl
variable "network" {
  type = object({
    vpc = object({
      id         = string
      cidr_block = string
    })
    subnets = set(object({
      id         = string
      cidr_block = string
    }))
  })
}
```
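
For completeness, a hypothetical sketch of the output side inside ./modules/network; the resource names aws_vpc.main and aws_subnet.public (created with count) are assumptions, and Terraform 0.12 allows whole resource objects to be returned as output values:

```hcl
# Inside ./modules/network (hypothetical resource names)
output "vpc" {
  value = aws_vpc.main
}

output "subnets" {
  # With count, aws_subnet.public is a list of subnet objects in Terraform 0.12.
  value = aws_subnet.public
}
```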

For more information on the different types that can be used when passing values between modules and between resources, see Type Constraints.

Iteration Constructs

Another way in which data structures were inconvenient in prior versions was the lack of any general iteration constructs that could perform transformations on lists and maps.

Terraform 0.12 introduces a new for operator that allows building one collection from another by mapping and filtering input elements to output elements:

```hcl
locals {
  public_instances_by_az = {
    for i in aws_instance.example : i.availability_zone => i...
    if i.associate_public_ip_address
  }
}
```

This feature allows us to adapt collection data returned in one format into another format that is more convenient to use elsewhere, such as turning a list into a map as in the example above. The output elements can be the result of any arbitrary Terraform expression, including another nested for expression!
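
For example, building on public_instances_by_az above, a nested for expression can reduce each group of instances down to just their private IP addresses (a minimal sketch):

```hcl
locals {
  # Map each availability zone to the private IPs of its public-facing instances.
  public_ips_by_az = {
    for az, instances in local.public_instances_by_az :
    az => [for i in instances : i.private_ip]
  }
}
```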

Terraform 0.12 also introduces a mechanism for dynamically generating nested configuration blocks for resources. The dynamic "setting" block in the first example above illustrates that feature. Here is another example using an input variable to distribute Azure shared images over a specific set of regions:

```hcl
variable "source_image_region" {
  type = string
}

variable "target_image_regions" {
  type = list(string)
}

resource "azurerm_shared_image_version" "ubuntu" {
  name                = "1.0.1"
  gallery_name        = azurerm_shared_image_gallery.image_gallery.name
  image_name          = azurerm_shared_image.image_definition.name
  resource_group_name = azurerm_resource_group.image_gallery.name
  location            = var.source_image_region
  managed_image_id    = data.azurerm_image.ubuntu.id[count.index]

  dynamic "target_region" {
    for_each = var.target_image_regions
    content {
      name                   = target_region.value
      regional_replica_count = 1
    }
  }
}
```

For more information on these features, see for expressions and dynamic blocks.

Structural Rendering of Plans

Prior versions of Terraform reduced plan output to a flat list of key, value pairs, even when using resource types with deeply-nested configuration blocks. This would tend to become difficult to read, particularly when making changes to nested blocks where it was hard to understand exactly what changed.

Terraform 0.12 has an entirely new plan renderer which integrates with Terraform's new type system to show changes in a form that resembles the configuration language, and which indicates nested structures by indentation:

```
Terraform will perform the following actions:

  # kubernetes_pod.example will be updated in-place
  ~ resource "kubernetes_pod" "example" {
        id = "default/terraform-example"

    metadata {
        generation       = 0
        labels           = {
            "app" = "MyApp"
        }
        name             = "terraform-example"
        namespace        = "default"
        resource_version = "650"
        self_link        = "/api/v1/namespaces/default/pods/terraform-example"
        uid              = "5130ef35-7c09-11e9-be7c-080027f59de6"
    }

  ~ spec {
        active_deadline_seconds          = 0
        dns_policy                       = "ClusterFirst"
        host_ipc                         = false
        host_network                     = false
        host_pid                         = false
        node_name                        = "minikube"
        restart_policy                   = "Always"
        service_account_name             = "default"
        termination_grace_period_seconds = 30

      ~ container {
          ~ image                    = "nginx:1.7.9" -> "nginx:1.7.10"
            image_pull_policy        = "IfNotPresent"
            name                     = "example"
            stdin                    = false
            stdin_once               = false
            termination_message_path = "/dev/termination-log"
            tty                      = false

            resources {
            }
        }
    }
}

```

Along with reflecting the natural configuration hierarchy in the plan output, Terraform will also show line-oriented diffs for multiline strings and will parse and show structural diffs for JSON strings, both of which have been big pain points for plan readability in prior versions.

Context-Rich Error Messages

Terraform 0.12 includes much improved error messages for configuration errors and for many other potential problems.

The error messages in prior versions were of varying quality, sometimes giving basic context about a problem but often lacking much context at all, and being generally inconsistent in terminology.

The new Terraform 0.12 error messages follow a predictable structure:

```
Error: Unsupported Attribute

  on example.tf line 12, in resource "aws_security_group" "example":
  12:   description = local.example.foo
    |----------------
    | local.example is "foo"

This value does not have any attributes.
```

Not every error message will include all of these components, but the general makeup of a new-style error message is:

  • A short description of the problem type, to allow quick recognition of familiar problems.
  • A reference to a specific configuration construct that the problem relates to, along with a snippet of the relevant configuration.
  • The values of any references that appear in the expression being evaluated.
  • A more detailed description of the problem and, where possible, potential solutions to the problem.

Conclusion

The changes described above are just a few of the highlights of Terraform 0.12. For more details, please see the full changelog. This release also includes a number of code contributions from the community, and wouldn't have been possible without all of the great community feedback we've received over the years via GitHub issues and elsewhere. Thank you!

We're very excited to share Terraform 0.12 with the community and we will continue building out features and functionality. In addition, HashiCorp recently released Terraform Cloud Remote State Storage and have plans for adding more functionality to make using Terraform a great experience for teams. You can download Terraform 0.12 here and sign up for a Terraform Cloud account here.

from Hashicorp Blog

HashiCorp Consul supports Microsoft’s new Service Mesh Interface

Today at KubeCon EU in Barcelona, Microsoft introduced a new specification, the Service Mesh Interface (SMI), for implementing service mesh providers into Kubernetes environments.

The Service Mesh Interface (SMI) is a specification for service meshes that run on Kubernetes. It defines a common standard that can be implemented by a variety of providers. This allows for both standardization for end-users and innovation by service mesh providers. SMI enables flexibility and interoperability.

We partnered with Microsoft to support the creation of this controller and this blog will explain how it can be used to set HashiCorp Consul Connect intentions within Kubernetes clusters.

What is SMI

Microsoft’s Service Mesh Interface is a series of Kubernetes controllers for implementing various service mesh capabilities. At launch, SMI will support four primary functions:
* Traffic Specs – define traffic routing on a per-protocol basis. These resources work in unison with access control and other types of policy to manage traffic at a protocol level.
* Traffic Access Control – configure access to specific pods and routes based on the identity of a client, to only allow specific users and services.
* Traffic Split – direct weighted traffic between services or versions of a service, enabling Canary Testing or Dark Launches.
* Traffic Metrics – expose common traffic metrics for use by tools such as dashboards and autoscalers.

At launch, HashiCorp Consul will support the Traffic Access Control specification, with possible integrations for the others in the future.

How Does Consul Support SMI

One of the custom resources defined by SMI is the TrafficTarget resource, developed by us in collaboration with the Microsoft team to assist with the challenge of securing service-to-service traffic. This resource enables the user to define Consul Connect intentions in a Kubernetes custom resource (CRD) and manage them through kubectl, Helm, or Terraform, rather than having to configure them directly through Consul. This enables developers to ensure that newly deployed applications have a secure connection to resources through a single workflow. Here is an example of how to configure this controller:

Assume you have two services running in Kubernetes: a dashboard that shows the current value, and a counting service that increases the count with each request. Both are configured to communicate via the Envoy sidecar proxy.

By default, Consul Connect denies all traffic through the service mesh. In order for traffic from the dashboard to reach the counting backend, you need to define an intention that allows it.

You can create this intention using the TrafficTarget CRD below, store it as intention.yaml and apply it using kubectl apply -f intention.yaml.

```yaml
apiVersion: specs.smi-spec.io/v1alpha1
kind: TCPRoute
metadata:
  name: counting-tcp-route
---
kind: TrafficTarget
apiVersion: access.smi-spec.io/v1alpha1
metadata:
  name: counting-traffic-target
  namespace: default
destination:
  kind: ServiceAccount
  name: counting
  namespace: default
sources:
- kind: ServiceAccount
  name: dashboard
  namespace: default
specs:
- kind: TCPRoute
  name: counting-tcp-route
```

This will create an intention in Consul that allows traffic from the dashboard service to the counting service.

With this intention created, the dashboard will be able to show the current value retrieved from the counting backend.
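
For comparison, configuring the same intention directly against Consul, rather than through the SMI controller, could be done with the Terraform Consul provider; a minimal sketch:

```hcl
resource "consul_intention" "dashboard_to_counting" {
  source_name      = "dashboard"
  destination_name = "counting"
  action           = "allow"
}
```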

Conclusion

Information about Microsoft’s SMI can be found on their home page and Github repo. By leveraging this new resource, users will find it easier to utilize Consul Connect’s capabilities inside of their Kubernetes environments. Outside of Kubernetes environments, Consul’s service mesh capabilities can also be extended to hybrid or multi-cloud environments, for more complex application deployments.

If you would like to experiment with SMI and Consul, we have built an experimental Kubernetes controller which can be found in the following repository: https://github.com/hashicorp/consul-smi-controller

For more information about this and other features of HashiCorp Consul, please visit: https://www.consul.io.

from Hashicorp Blog

Introducing Terraform Cloud Remote State Management

After announcing our plans to bring HashiCorp Terraform collaboration features to everyone last fall, we’re excited to introduce Terraform Cloud, a collaboration platform designed for all Terraform users. Terraform Cloud is a SaaS application that brings free collaboration features to individual users and teams with additional paid feature sets that provide team management, self-service infrastructure, and governance features.

After beta testing over the past few months, we’re excited to make the Remote State Management feature of Terraform Cloud generally available today. Other features will be made available later this year. Sign up for a Terraform Cloud account here!

Remote State Management

State files in Terraform capture the existing state of provisioned infrastructure for a given workspace. These files are used by Terraform to ensure that it properly creates or destroys infrastructure with respect to infrastructure that already exists.

For existing users of Terraform Open Source, state files are stored on machines locally by default. That means collaborating on infrastructure with a team member involves carefully sharing that file and ensuring that team members update each other when they’ve made changes. That process gets more complex as both the number of state files/workspaces and the number of collaborators grow.

Today we’re introducing remote state management with Terraform Cloud. This feature allows users to use Terraform Cloud as a remote backend to store and manage state files. With Terraform Cloud remote state management, individual users no longer need to maintain local state files and teams no longer need to carefully share or manage those files. Terraform Cloud will automatically handle everything.

Summary

We’re very excited to bring more features and functionality to our users. Remote State management is just the beginning for Terraform Cloud. If you’re interested in getting started or want to be informed of future updates, sign up for Terraform Cloud here.

from Hashicorp Blog

HashiCorp Consul 1.5

We are excited to announce the release of Consul 1.5.0. Consul is a multi-cloud service networking platform to connect, secure and configure services across any runtime platform and public or private cloud.

Consul 1.5.0 introduced several major new features and a number of improvements and bug fixes.

Consul Connect Roadmap

We've published an outline for a blog post series that will provide information about the Connect (Consul service mesh) roadmap. This release is the first in a series of major releases that will provide the functionality outlined in that post.

Specifically, Connect now supports L7 observability and load balancing, which can be used today via the Envoy integration. Read more in the Connect Envoy documentation. If you'd like to try this today with Kubernetes, follow our L7 Observability on Kubernetes guide. Stay tuned for our L7 observability blog post to learn more about the use case.

Additionally, we've added a number of new features and improvements:
  • Expressive Filtering support across HTTP APIs. When using the HTTP API, a query parameter can be used to pass a filter expression to Consul on a range of endpoints. Learn more about options for filtering in the API documentation.

  • Centralized Configuration. Enables central configuration of some service and proxy defaults. For more information see the Configuration Entries docs. This is a set of new endpoints and CLI commands for centrally managing the configuration of many upcoming L7 features of Consul Connect.

  • ACL enhancements. Expiration times, roles, service identity mappings, and auth methods are all improvements to the ACL system; auth methods enable services to obtain a valid Consul ACL token using a 3rd-party identity. The first available auth method included in this release is Kubernetes Service Account Tokens.

  • consul-k8s and consul-helm. The Kubernetes integrations have been updated to support L7 observability and usage of the new auth method ACL functionality, as well as general support for ACLs. For a full list of changes, visit the changelog for consul-k8s or consul-helm. Note that releases here will be available shortly after the Consul 1.5.0 release.

  • UI Improvements. The UI now supports live updates (opt-in for now), a better search interface for the services page, visibility of Connect proxies, additional ACL features, and a number of bug fixes.

  • Azure Snapshot Agent support (Enterprise). The enterprise snapshot agent can now store snapshots in Azure storage.

Envoy CVE-2019-9900 and CVE-2019-9901

Envoy versions lower than 1.9.1 are vulnerable to CVE-2019-9900 and CVE-2019-9901. Both are related to HTTP request parsing and so only affect Consul Connect users if they have configured HTTP routing rules via the "escape hatch". Note that while we officially deprecate support for the older version of Envoy in 1.5.0, we recommend using Envoy 1.9.1 with all previous versions of Consul Connect too (back to 1.3.0 where Envoy support was introduced).

Removal of Legacy UI

We've removed the legacy UI completely from Consul, so it is no longer available when using the CONSULUILEGACY environment variable.

Consul Guides moved to HashiCorp Learn

We have completed the guide transition from consul.io to learn.hashicorp.com/consul. During the transition process, all the guides were updated to be easier to use. HashiCorp Learn is a learning platform for all types of users at every experience level. The Learn guides are step-by-step walkthroughs aimed at helping you complete specific tasks. Recently, we’ve also added new content to the Learn platform, including a track to help you get started with Kubernetes, production guidance for configuring ACLs, and a streamlined troubleshooting guide.

Conclusion

Please review the v1.5.0 changelog for a detailed list of changes. The release binaries can be downloaded here.

Thank you to our active community members who have been invaluable in adding new features, reporting bugs, and improving the documentation for Consul in this release!

from Hashicorp Blog