Cisco Tech Blog

OpenTelemetry is an observability framework hosted by the CNCF. Formed from the merger of the OpenCensus and OpenTracing projects, it decouples application instrumentation from data export, making it easier to avoid vendor lock-in. With its SDKs, agents, libraries, and standards, OpenTelemetry gives you full application-level observability.

For more information about OpenTelemetry, please refer to our posts “Introduction to OpenTelemetry” and “OpenTelemetry Best Practices.” In this post, we’ll guide you through setting up OpenTelemetry, implementing traces, and using Grafana, Prometheus, and Jaeger; we’ll then introduce you to AWS Distro for OpenTelemetry.

How to Set Up OpenTelemetry 🔗︎

To get started with OpenTelemetry, you need to meet the following prerequisites:

  • Have a Kubernetes cluster up and running, or use the following link to set up your Kubernetes cluster using kubeadm. A sample setup should have one master node (control plane) and two worker nodes. To list all the nodes in your Kubernetes cluster, run the command below:
kubectl get nodes

NAME                          STATUS   ROLES    AGE   VERSION
istio-cluster-control-plane   Ready    master   16m   v1.19.1
istio-cluster-worker          Ready    <none>   16m   v1.19.1
istio-cluster-worker2         Ready    <none>   16m   v1.19.1
  • Istio should also be up and running; if not, refer to the Istio installation instructions. Once Istio is set up, use the following command to check the status of all its resources:
kubectl get all -n istio-system

NAME                                        READY   STATUS    RESTARTS   AGE
pod/istio-egressgateway-c9c55457b-zzf55     1/1     Running   0          15m
pod/istio-ingressgateway-865d46c7f5-ddpnk   1/1     Running   0          15m
pod/istiod-7f785478df-2c6rx                 1/1     Running   0          16m

NAME                           TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                                                                      AGE
service/istio-egressgateway    ClusterIP   <none>         80/TCP,443/TCP,15443/TCP                                                     15m
service/istio-ingressgateway   LoadBalancer   15021:32028/TCP,80:31341/TCP,443:31306/TCP,31400:30297/TCP,15443:32577/TCP   15m
service/istiod                 ClusterIP    <none>         15010/TCP,15012/TCP,443/TCP,15014/TCP                                        16m

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/istio-egressgateway    1/1     1            1           15m
deployment.apps/istio-ingressgateway   1/1     1            1           15m
deployment.apps/istiod                 1/1     1            1           16m

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/istio-egressgateway-c9c55457b     1         1         1       15m
replicaset.apps/istio-ingressgateway-865d46c7f5   1         1         1       15m
replicaset.apps/istiod-7f785478df                 1         1         1       16m
  • The sample Bookinfo application should already be deployed. To verify the status of the application, run the command below; it will list all the resources deployed for the Bookinfo application:
kubectl get all

NAME                                  READY   STATUS    RESTARTS   AGE
pod/details-v1-79f774bdb9-7zzwm       2/2     Running   0          14m
pod/productpage-v1-6b746f74dc-z7z8m   2/2     Running   0          14m
pod/ratings-v1-b6994bb9-bt9tb         2/2     Running   0          14m
pod/reviews-v1-545db77b95-kbjbg       2/2     Running   0          14m
pod/reviews-v2-7bf8c9648f-ddw5d       2/2     Running   0          14m
pod/reviews-v3-84779c7bbc-27vz6       2/2     Running   0          14m

NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/details       ClusterIP      <none>        9080/TCP   14m
service/kubernetes    ClusterIP       <none>        443/TCP    23m
service/productpage   ClusterIP     <none>        9080/TCP   14m
service/ratings       ClusterIP   <none>        9080/TCP   14m
service/reviews       ClusterIP      <none>        9080/TCP   14m

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/details-v1       1/1     1            1           14m
deployment.apps/productpage-v1   1/1     1            1           14m
deployment.apps/ratings-v1       1/1     1            1           14m
deployment.apps/reviews-v1       1/1     1            1           14m
deployment.apps/reviews-v2       1/1     1            1           14m
deployment.apps/reviews-v3       1/1     1            1           14m

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/details-v1-79f774bdb9       1         1         1       14m
replicaset.apps/productpage-v1-6b746f74dc   1         1         1       14m
replicaset.apps/ratings-v1-b6994bb9         1         1         1       14m
replicaset.apps/reviews-v1-545db77b95       1         1         1       14m
replicaset.apps/reviews-v2-7bf8c9648f       1         1         1       14m
replicaset.apps/reviews-v3-84779c7bbc       1         1         1       14m

Now, verify the application from the browser by going to http://<ingress gateway external ip>/productpage.


Figure 1: Sample BookInfo application
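If you'd rather verify from the command line, a quick curl check works too; the gateway IP placeholder below matches the URL above:

```shell
# Replace the placeholder with your ingress gateway's external IP
# (from: kubectl get svc istio-ingressgateway -n istio-system).
GATEWAY_IP="<ingress gateway external ip>"
# A 200 status code confirms the Bookinfo product page is being served.
curl -s -o /dev/null -w "%{http_code}\n" "http://${GATEWAY_IP}/productpage"
```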

Your setup is now ready. In the next section, we’ll explore how to send these traces to Grafana and Jaeger.

Sending Traces with OpenTelemetry 🔗︎

In the last section, you got Istio up and running. Istio can also integrate with a number of other telemetry applications to provide additional functionality.

Prometheus, Grafana, and Jaeger are three such applications. Let’s explore each of these, one by one:

Prometheus 🔗︎

Prometheus is an open-source monitoring system that provides a time-series database for metrics. Using Prometheus, you can record metrics, track the health of your application within a service mesh, then use Grafana to visualize those metrics.
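For a concrete sense of what Prometheus stores, here is a sketch of querying Istio's standard request counter, istio_requests_total, over Prometheus's HTTP API; it assumes Prometheus is reachable on localhost:9090, for example after running the istioctl dashboard command shown later in this section:

```shell
# Assumes Prometheus is reachable on localhost:9090.
# istio_requests_total is Istio's standard request counter; this query
# computes the per-service request rate over the last 5 minutes.
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(istio_requests_total[5m])) by (destination_service)'
```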

Istio provides a sample add-on to deploy Prometheus, so proceed to the directory where you have Istio downloaded: cd istio-1.9.0

Deploy Prometheus by using the following command; the output follows:

kubectl apply -f samples/addons/prometheus.yaml

serviceaccount/prometheus created
configmap/prometheus created
service/prometheus created
deployment.apps/prometheus created

Now, verify your Prometheus setup:

kubectl get all -n istio-system -l app=prometheus

NAME                              READY   STATUS    RESTARTS   AGE
pod/prometheus-7bfddb8dbf-xqvdk   2/2     Running   0          2m28s

NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/prometheus   ClusterIP   <none>        9090/TCP   2m28s

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus   1/1     1            1           2m28s

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-7bfddb8dbf   1         1         1       2m28s

Next, access the Prometheus dashboard:

istioctl dashboard prometheus&        

[1] 1034524

Figure 2: Prometheus dashboard

Now that you have Prometheus installed, you can visualize the metrics it collects by installing Grafana.

Grafana 🔗︎

Grafana is an open-source monitoring solution that, when integrated with a time-series database like Prometheus, lets you create custom dashboards and gain meaningful insights from your metrics. Using Grafana, you can monitor the health of your application within a service mesh.
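Grafana finds Prometheus through a datasource definition. The Istio add-on provisions this for you, but as a rough illustration (values assumed, not taken from the add-on verbatim), a minimal datasource provisioning file looks like:

```yaml
apiVersion: 1
datasources:
  - name: Prometheus            # illustrative name
    type: prometheus
    access: proxy
    url: http://prometheus.istio-system.svc:9090   # assumed in-cluster address
    isDefault: true
```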

Similar to Prometheus, Istio provides a sample add-on you can use to deploy Grafana. Simply go to the directory where you have Istio downloaded: cd istio-1.9.0.

Deploy Grafana via the following command; again, this is followed by the output code:

kubectl apply -f samples/addons/grafana.yaml

serviceaccount/grafana created
configmap/grafana created
service/grafana created
deployment.apps/grafana created
configmap/istio-grafana-dashboards created
configmap/istio-services-grafana-dashboards created

Go ahead and verify your Grafana setup:

kubectl get all -n istio-system -l app=grafana

NAME                           READY   STATUS    RESTARTS   AGE
pod/grafana-784c89f4cf-mxssg   1/1     Running   0          2m36s

NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/grafana   ClusterIP   <none>        3000/TCP   2m36s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/grafana   1/1     1            1           2m36s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/grafana-784c89f4cf   1         1         1       2m36s

And access the Grafana dashboard:

istioctl dashboard grafana&   

Figure 3: Grafana dashboard

Once you have Grafana up and running, you will see that it comes bundled with some preconfigured Istio dashboards.

Figure 4: Preconfigured Grafana dashboard

If you click on “Istio Service Dashboard,” you will see a number of metrics. As there is no activity on this server, all the metrics show either a 0 or N/A.

Figure 5: Grafana service dashboard

Let’s generate some load on your cluster by running this script, which accesses the sample application’s product page every second (an infinite while loop):

while :; do curl -s -o /dev/null http://<ingress gateway external ip>/productpage; sleep 1; done

If you go back to your Grafana dashboard, you’ll start seeing the loads you’ve generated and different metrics.

Figure 6: Grafana service dashboard

With Grafana up and running, let’s move on to tracing using Jaeger.

Jaeger 🔗︎

Jaeger is an open-source distributed tracing system based on the OpenTracing specification. It allows users to troubleshoot and monitor transactions in complex distributed systems.

Istio also provides a sample add-on to deploy Jaeger, just like with Prometheus and Grafana.

So, go to the directory where you have Istio downloaded: cd istio-1.9.0

And deploy Jaeger by using the following command; output follows:

kubectl apply -f samples/addons/jaeger.yaml                          

deployment.apps/jaeger created

service/tracing created

service/zipkin created

service/jaeger-collector created

Run the following command to verify your Jaeger setup:

kubectl get all -n istio-system -l app=jaeger

NAME                          READY   STATUS    RESTARTS   AGE
pod/jaeger-7f78b6fb65-4n6dd   1/1     Running   0          2m10s

NAME                       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)               AGE
service/jaeger-collector   ClusterIP   <none>        14268/TCP,14250/TCP   2m9s
service/tracing            ClusterIP   <none>        80/TCP                2m10s

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/jaeger   1/1     1            1           2m10s

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/jaeger-7f78b6fb65   1         1         1       2m10s

Now, access the Jaeger dashboard:

istioctl dashboard jaeger& 


Figure 7: Jaeger dashboard

Jaeger is now running in the background and collecting data, so select “productpage.default” and click on “Find Traces” at the bottom of the drop-down menu.

Figure 8: Jaeger dashboard with productpage

The visualization at the top shows the average end-to-end response time over different periods.

Figure 9: Jaeger Dashboard visualization for average response time

Now that you understand how to send metrics to Grafana and Jaeger, let’s shift gears and look at AWS Distro for OpenTelemetry (ADOT).

AWS and OpenTelemetry 🔗︎

AWS now offers AWS Distro for OpenTelemetry (still in preview). AWS is one of the upstream contributors to the OpenTelemetry project and tests, secures, optimizes, and supports various components of the project, such as SDKs, agents, collectors, and auto-instrumentation. The initial release supports Python, Go, Java, and JavaScript; other languages will follow in upcoming releases. On top of that, you don't pay to use AWS Distro for OpenTelemetry itself; you pay only for the traces, logs, and metrics sent to AWS.

Using AWS Distro for OpenTelemetry, you instrument your application only once and can then send correlated metrics and traces to multiple monitoring solutions, such as CloudWatch, X-Ray, Elasticsearch, and partner solutions. With the help of auto-instrumentation agents, you can collect traces without changing your code. The distribution also gathers metadata from your AWS resources, which helps correlate application performance data with the underlying infrastructure data so you can resolve problems faster.

Currently, AWS Distro for OpenTelemetry supports instrumenting applications running on-premises as well as on the following AWS services: Amazon Elastic Kubernetes Service (EKS) on EC2 and AWS Fargate, and Elastic Compute Cloud (EC2).

Various Components 🔗︎

AWS Distro for OpenTelemetry consists of the following components:

  • The OpenTelemetry SDK allows for the collection of metadata for AWS-specific resources, such as Task and Pod ID, Container ID, and Lambda function version. It can also correlate trace and metrics data from both CloudWatch and AWS X-Ray.
  • The OpenTelemetry Collector is responsible for sending data to AWS services like AWS CloudWatch, Amazon Managed Service for Prometheus, and AWS X-Ray.

AWS also supports an OpenTelemetry Java auto-instrumentation agent for tracing data from AWS SDKs and X-Ray. For all these components, AWS also contributes back to the upstream project.
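To make the Collector's role concrete, here is a minimal, illustrative pipeline configuration: receive OTLP data, send traces to X-Ray, and send metrics to CloudWatch. The awsxray and awsemf exporter names are ADOT's X-Ray and CloudWatch Embedded Metric Format exporters; the region value is a placeholder.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  awsxray:            # traces -> AWS X-Ray
    region: us-west-2
  awsemf:             # metrics -> CloudWatch (Embedded Metric Format)
    region: us-west-2

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [awsxray]
    metrics:
      receivers: [otlp]
      exporters: [awsemf]
```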

Serverless and OpenTelemetry 🔗︎

AWS Distro for OpenTelemetry currently supports only Python for serverless, built on Lambda Extensions. First, you build a Lambda layer containing the OpenTelemetry SDK and Collector, which you then add to your Lambda function. Once this is done, AWS takes care of auto-instrumentation and initializes the instrumentation of dependencies, HTTP clients, and AWS SDKs. It also captures resource-specific information, such as the Lambda function name, Amazon Resource Name (ARN), version, and request ID.

Requirements 🔗︎

There are a couple of installations required before building the Lambda layer:

  • AWS SAM CLI: Refer to the AWS SAM documentation to install it for your platform.
  • AWS CLI: Refer to the AWS CLI documentation to install it for your platform; this is needed to configure AWS credentials and requires administrator access.

Note: Currently, the Lambda layer only supports the Python 3.8 Lambda runtime.

Building the Lambda Layer 🔗︎

Once you meet all the prerequisites, the next step is to build the Lambda layer. The layer contains the AWS Distro for OpenTelemetry Collector (ADOT Collector), which runs as a Lambda extension; your Python function will also use this layer.

For this example, you’ll use the aws-otel-lambda repository.

First, clone the repo:

git clone https://github.com/aws-observability/aws-otel-lambda.git

Then go to the Python sample app's directory:

cd aws-otel-lambda/sample-apps/python-lambda

To publish the layer, run the command below:

Invoked with: 
sam building...
  SAM CLI now collects telemetry to better understand customer needs.
  You can OPT OUT and disable telemetry collection by setting the
  environment variable SAM_CLI_TELEMETRY=0 in your shell.
  Thanks for your help!
--------------------------Output Cut -------------------------------
Successfully created/updated stack - adot-py38-sample in us-west-2
ADOT Python3.8 Lambda layer ARN:

If you want to publish the layer in a different region, e.g., to us-east-2, run the command with the -r parameter:

./ -r us-east-2

Auto-Instrumentation for Your Lambda Function 🔗︎

Once you push the Lambda layer, you need to follow a series of steps to enable auto-instrumentation.

First, go to the Lambda console and select the function you want to instrument. Scroll down and click on “Add a layer.”

Figure 10: Lambda console for adding a layer

Select “Custom layers,” and from the drop-down, choose the layer you created earlier and Version 1. Click on “Add.”

Figure 11: Lambda console for adding a custom layer

Now, go back to your Lambda function and click on “Configuration,” then “Environment variables.” Select “Edit” and “Add environment variable.”

Figure 12: Lambda console for adding an environment variable

Add AWS_LAMBDA_EXEC_WRAPPER with value /opt/python/adot-instrument. This will enable auto-instrumentation. Click on “Save.”

Figure 13: Lambda console for adding environment variable AWS_LAMBDA_EXEC_WRAPPER

Also, make sure that “Active tracing” is enabled under “Monitoring and operations tools.”

Figure 14: Lambda console for enabling active tracing
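If you prefer scripting over clicking through the console, the steps above can be sketched with the AWS CLI; the function name and layer ARN below are hypothetical placeholders (use the ARN printed when you published the layer):

```shell
# Attach the ADOT layer, set the wrapper, and enable active tracing in one call.
# my-python-function and the layer ARN are illustrative placeholders.
aws lambda update-function-configuration \
  --function-name my-python-function \
  --layers "arn:aws:lambda:us-west-2:123456789012:layer:adot-python38:1" \
  --environment "Variables={AWS_LAMBDA_EXEC_WRAPPER=/opt/python/adot-instrument}" \
  --tracing-config Mode=Active
```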

By default, AWS Distro for OpenTelemetry exports telemetry data to AWS X-Ray and CloudWatch. For the latter, go to the CloudWatch console and click on “Traces.”

Figure 15: CloudWatch dashboard with traces

To retrieve information about a specific trace, click any of the Lambda functions and then a trace ID.

Figure 16: CloudWatch dashboard with specific trace

And to drill down even further, go to the X-Ray console and click on “Analytics.”

Figure 17: AWS X-Ray with specific trace

Wrapping Up 🔗︎

OpenTelemetry is still an evolving project, and with the launch of products like AWS Distro for OpenTelemetry, fully backed by AWS, it's heading toward stability. Currently, AWS Distro for OpenTelemetry only supports Python for Lambda, but support for other languages (Node.js, Java, Go, .NET) is coming. Also, for now you need to create your Lambda layer manually, but in the future, AWS plans to automate and manage this process.

Epsagon is tightly integrated with AWS and provides full visibility into how your serverless application is performing. Onboarding your new or existing application is straightforward and doesn’t require any complex configuration. It also provides a visualization dashboard that helps detect bottlenecks and overall system health, predicts the overall cost, and offers other helpful insights based on collected data and metrics. Another advantage is that Epsagon correlates all the aggregated data, which is vital in distributed architectures using AWS Lambda and other serverless services. Plus, Epsagon includes auto-instrumentation for languages like Python, Go, Java, Ruby, Node.js, PHP, and .NET, reducing the time it takes to instrument tracing.

Check out our demo environment or try it for free for up to 10 million traces per month!