Reducing Istio proxy resource consumption with outbound traffic restrictions

Published on 12/07/2019
Last updated on 03/21/2024
When someone first hears about Istio's sidecar concept, the first two questions are usually about its effect on resource consumption and on request latency or throughput. In this post we'll discuss the first of those questions: the resource consumption of Istio proxies and how it can be reduced with the new Sidecar custom resource. We previously published a Kafka on Istio benchmark post about throughput, and we'll take a deeper look at the most important HTTP request metrics in a forthcoming article (stay tuned). The first part of this blog focuses on what a Sidecar is and its (benchmarked) benefits; the second is about how to set up Sidecar configs properly in a cluster. We'll also highlight how Backyards (now Cisco Service Mesh Manager) - our automated and operationalized service mesh built on Istio - makes it easy to configure Sidecars, either through its UI or CLI, using its automatic proxy configuration feature.
Want to know more? Get in touch with us, or delve into the details of the latest release. Or just take a look at some of the Istio features that Backyards automates and simplifies for you, and which we've already blogged about.

The Sidecar custom resource

Istio is famous for having a huge number of custom resources for configuration. Among other things, this is what makes the project incredibly complex. The maintainers are aware of this, and the last few releases have taken steps towards simplifying Istio. During this process, the number of custom resources dropped from around 50 to around 25. Still, beside the most commonly used ones - like VirtualService or DestinationRule - there are a few that remain quite unknown, like the Sidecar custom resource. The Istio documentation says the following about it:
Sidecar describes the configuration of the sidecar proxy that mediates inbound and outbound communication to the workload instance it is attached to. By default, Istio will program all sidecar proxies in the mesh with the necessary configuration required to reach every workload instance in the mesh, as well as accept traffic on all the ports associated with the workload. The Sidecar configuration provides a way to fine tune the set of ports, protocols that the proxy will accept when forwarding traffic to and from the workload. In addition, it is possible to restrict the set of services that the proxy can reach when forwarding outbound traffic from workload instances.
Without this custom resource, Pilot automatically sends the same configuration to every proxy, containing information about every workload instance in the mesh. The Sidecar resource can be used to fine-tune the Envoy config for a set of workloads. This seems like a very advanced use case, but - although it isn't highlighted - the last sentence of the definition makes a huge difference. It is about restricting the sidecar from reaching services other than the ones configured here. This feature is quite easy to configure and has two large benefits:
  1. Security: an Istio administrator can restrict the set of services that can be reached from a specific namespace or workload, making sure that these workloads can't access anything outside their scope.
  2. Performance: by default, all Envoy proxies in the cluster receive the same configuration from Pilot through xDS, which contains all pods in the cluster. This configuration can be huge, especially in a large cluster, while in most cases a simple workload only communicates with a handful of others. Trimming this configuration to contain only the necessary set of services can have a large impact on the memory footprint of an Istio proxy (see the sketch below for how to inspect this on a live proxy).
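One way to see this effect on a live proxy is to dump the configuration Pilot has pushed to it with istioctl. A minimal sketch, assuming istioctl is installed and the pod name and namespace placeholders are filled in:

# Count the outbound clusters Pilot has pushed to a given sidecar
istioctl proxy-config cluster <pod-name> -n <namespace> | wc -l

# The same for listeners and routes; with an egress-restricting Sidecar
# resource in place, these lists should shrink to roughly the services
# the workload actually needs
istioctl proxy-config listener <pod-name> -n <namespace> | wc -l
istioctl proxy-config route <pod-name> -n <namespace> | wc -l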
As this is a benchmarking post, we'll pay attention to the performance benefits now.

Benchmarking goal & setup

We wanted to see if configuring Sidecars could bring a real resource utilization improvement to an average cluster. For the benchmark, we've created the following Amazon PKE cluster with Pipeline:
Cloud provider   Distribution          Master node     Worker nodes
Amazon           PKE - single master   1 x c5.xlarge   10 x c5.xlarge (spot)
On this 10 node cluster, we've used the Backyards CLI to install the same demoapp in 10 different namespaces.
The demoapp is a sample microservice application that's part of Backyards. It consists of 7 different microservices, and can be used to easily test and benchmark Istio features.
Using the CLI, installing this app is as easy as running this command for every different namespace:
backyards demoapp --demo-namespace backyards-demo-N install
Using horizontal pod autoscalers, each demoapp was configured to have at most 15 pods, so in total there were 70 different services and 150 pods in the benchmark cluster. Because each demoapp communicates only internally within its namespace (and with istio-telemetry), using namespace isolation with Sidecar custom resources means that the configuration sent to each Istio proxy contains only ~10% of the original configuration entries. It could be made even smaller by isolating workloads down to their specific service targets, but we felt that namespace isolation would be enough for the benchmark. First, we measured the idle resource consumption of the sidecars with and without isolation, then ran a load test to see how things change. The load was generated using Bombardier, a simple HTTP benchmarking tool written in Go, started in a pod in the backyards-system namespace. During a 30-minute load test, the cluster was handling around 2.5K requests per second.
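For reference, the load generation looked something like the following. This is a sketch: the connection count and the target URL are illustrative placeholders, not the exact benchmark parameters:

# Run Bombardier from a pod inside the cluster for 30 minutes
# against one of the demoapp frontends (URL and -c value are placeholders)
bombardier -c 100 -d 30m http://frontpage.backyards-demo-1.svc.cluster.local:8080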

Prometheus and Grafana dashboards were configured through Pipeline's integrated monitoring feature, which made the setup a "few-clicks" process. If you are interested in reading more about Pipeline's integrated cluster services, follow this link.

Benchmark Results

Here comes the interesting part. Let's look at the most important Grafana dashboards first, about sidecar memory usage.

[Dashboard: memory consumption per Istio sidecar]

[Dashboard: total memory consumption of all Istio sidecars in the cluster]

The above two dashboards show the same memory consumption metric, but the first one is per sidecar, while the second one is the total memory usage of all Istio proxies in the cluster. The results almost speak for themselves, but let's summarize what we see above.
Scenario                          Avg. mem. usage of sidecars (total)   Avg. mem. usage / sidecar
150 pods, no load, no isolation   8.51 GB                               54.6 MB
150 pods, no load, w/ isolation   5.22 GB                               34.7 MB
150 pods, w/ load, no isolation   10.38 GB                              61.0 MB
150 pods, w/ load, w/ isolation   7.09 GB                               42.1 MB
While the cluster is in an idle state, Envoy proxies consume roughly 35-40% less memory when namespace isolation is configured. The difference shrinks a bit when the cluster is put under load, but it's still more than 30%. And this is only a 10-node cluster with ~150 pods, with outbound traffic restricted only to the namespace! In production clusters the pod count can easily go into the hundreds or even thousands, and these restrictions can be tightened further to the workload level, so with a proper setup you can easily save about 40-50% of the sidecar proxy memory consumption. Put another way, the total savings are ~3GB of memory. That is almost half a node from a memory perspective (a c5.xlarge VM has 8GB), so in a larger cluster it could mean serious cost savings.

Notes:
  • Grafana labelling on the Y-axis is a bit misleading on the second dashboard (notice that 7GB appears twice). This is probably due to a Grafana display misconfiguration, but we only noticed it after the fact.
  • There are some spikes in the second graph when transitioning from one stage to another. After a stage finished, pods were deleted to release the reserved memory, and new pods were created instead. During this transition, some of the old pods were still terminating and holding memory while new ones were coming up, so the number of sidecars was above 150 for a short period. The per-sidecar graph shows that there were no spikes in individual proxy resource utilization.
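If you want to reproduce these graphs, the data comes from the standard cAdvisor container metrics scraped by Prometheus. A sketch of the underlying queries, assuming a reachable Prometheus endpoint (the host below is a placeholder, and on older Kubernetes versions the label is container_name rather than container):

# Total working-set memory of all istio-proxy containers in the cluster
curl -s 'http://<prometheus-host>:9090/api/v1/query' \
  --data-urlencode 'query=sum(container_memory_working_set_bytes{container="istio-proxy"})'

# Average memory per sidecar
curl -s 'http://<prometheus-host>:9090/api/v1/query' \
  --data-urlencode 'query=avg(container_memory_working_set_bytes{container="istio-proxy"})'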

Additional benchmarks

Restricting outbound traffic on the sidecar level doesn't affect proxy CPU consumption. The graph below shows that during idle periods CPU usage was close to 0, and under load it stayed steady at around 3 vCPU cores regardless of the Sidecar configuration.

[Dashboard: CPU consumption of the Istio sidecars]

We've heard a lot of complaints about Mixer telemetry, and it's no coincidence that Istio is moving towards Mixerless telemetry. We've already added support for that setup in the Istio operator, and Backyards will also be changed in the near future to configure it as the default. Mixer telemetry was enabled during load testing, and we can only confirm that it takes up a huge amount of resources in the cluster: the Mixer telemetry component holds about 8GB of virtual memory and consumes 4 vCPUs, and that's at only ~2.5K requests per second.

[Dashboard: Mixer telemetry resource consumption]

Pilot's resource consumption is minimal compared to the sidecars and Mixer.

Restricting outbound traffic

From the above benchmark results, it's quite clear that having proper outbound traffic configuration at the proxy level is really worth it. Configuration can be done manually through Istio custom resources, or with Backyards (now Cisco Service Mesh Manager), which has a handy feature that can automatically restrict outbound traffic for specific workloads or namespaces. Let's go through these options one by one.

With Istio YAML

The Sidecar custom resource can be used to fine-tune each Envoy proxy's configuration in the cluster. Applying these ingress and egress rules could fill a whole new blog post, so let's stick to the outbound traffic restrictions. The simplest Sidecar config with outbound restrictions looks like this:
apiVersion: networking.istio.io/v1alpha3
kind: Sidecar
metadata:
  name: default
  namespace: backyards-demo
spec:
  egress:
  - hosts:
    - "./*"
    - "istio-system/*"
There's no workload label selector in this config, so it will be applied to all proxies in the namespace. The hosts section restricts outbound traffic to all services in the current and istio-system namespaces. To configure sidecars only for specific workloads, we must add a workload selector:
apiVersion: networking.istio.io/v1alpha3
kind: Sidecar
metadata:
  name: default
  namespace: backyards-demo
spec:
  workloadSelector:
    labels:
      app: payments
      version: v1
  egress:
  - hosts:
    - "./notifications.backyards-demo.svc.cluster.local"
    - "istio-system/*"
If both a namespace-level and a workload-level Sidecar resource are present, preference is given to the resource with a workload selector. We've also changed the hosts section: instead of allowing outbound traffic to every service in the namespace, it allows traffic to the notifications service only. To learn more about the Sidecar resource, read the reference in the Istio docs.
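These manifests are applied like any other Istio resource. A quick sketch, with hypothetical file names for the two configs shown above:

# Apply the namespace-level and the workload-level Sidecar resources
kubectl apply -f sidecar-namespace-default.yaml
kubectl apply -f sidecar-payments-v1.yaml

# List the Sidecar resources currently in effect in the namespace
kubectl get sidecars.networking.istio.io -n backyards-demo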

With the backyards-cli

To see how easy it is to get started with Backyards, check out the docs.
These examples work out of the box with the demo application packaged with Backyards (now Cisco Service Mesh Manager); change the service name and namespace to match your own service. To see the currently applied sidecar config for a particular service, use the sidecar-proxy egress get command. First, let's check whether there are any egress rules set for the backyards-demo namespace. The command has a --workload switch that accepts a workload in namespace/name format; if the name is *, namespace-level rules are listed.
> backyards sidecar-proxy egress get --workload backyards-demo/*

Sidecar egress rules for backyards-demo/*

Sidecar                 Hosts                   Bind  Port  Capture Mode
backyards-demo/default  [./* istio-system/*]          -
There is one Sidecar resource, called default, with an egress isolation rule already present in the namespace. This is because the default installation of the demoapp creates it to advertise best practices. Let's check the results for a specific workload (analytics-v1) as well:
> backyards sidecar-proxy egress get --workload backyards-demo/analytics-v1

Sidecar egress rules for backyards-demo/*

Sidecar                 Hosts                   Bind  Port  Capture Mode
backyards-demo/default  [./* istio-system/*]          -
The result is the same as before. If both a namespace-level and a workload-level Sidecar resource are present, preference is given to the resource with a workload selector. For now there is no workload-level resource, so Backyards shows the currently applied namespace-level one. Let's create a new outbound traffic rule with egress set. It will restrict traffic for the analytics-v1 workload to the istio-system namespace only, because this workload doesn't have any other outbound connections.
> backyards sp egress set --workload backyards-demo/analytics-v1 --hosts "istio-system/*"

INFO[0002] sidecar egress for backyards-demo/analytics-v1 set successfully

Sidecar egress rules for backyards-demo/analytics-v1

Sidecar                              Hosts             Bind  Port  Capture Mode
backyards-demo/backyards-demo-yvsj2  [istio-system/*]        -
Now if you check the analytics-v1 workload again, it will show the newly created, workload-level config.
> backyards sidecar-proxy egress get --workload backyards-demo/analytics-v1

Sidecar egress rules for backyards-demo/analytics-v1

Sidecar                              Hosts             Bind  Port  Capture Mode
backyards-demo/backyards-demo-yvsj2  [istio-system/*]        -
To clean up and delete the egress rules, use these commands:
> backyards sidecar-proxy egress delete backyards-demo/analytics-v1

> backyards sidecar-proxy egress delete backyards-demo/*
Tip: all CLI commands and switches have short names; check the CLI docs to get to know them.

With the Backyards UI

Sidecars can also be configured from the Backyards dashboard. To open the dashboard, set your KUBECONFIG and run backyards dashboard from the CLI.

[Screenshot: configuring sidecars on the Backyards UI]

You can also edit or delete rules, and view the full YAML description of the sidecars.
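A minimal sketch of opening the dashboard, assuming your kubeconfig is not at the default path:

# Point the CLI at the right cluster, then open the dashboard
export KUBECONFIG=~/.kube/my-cluster.yaml   # path is a placeholder
backyards dashboard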

With automatic proxy configuration

Backyards already has information about how the workload instances communicate in the cluster, and uses it to build topology information. Because this information is already available, it felt like a natural move to implement a feature that can recommend outbound proxy configuration automatically. Backyards is able to collect the target services of specific workloads over a previous timeframe, and lists them as recommendations for a given workload or namespace with a simple CLI command.
> backyards sidecar-proxy egress recommend --workload backyards-demo/payments-v1

Recommended egress rules for backyards-demo/payments-v1

Hosts                                                                                                            Bind  Port  Capture Mode
[./notifications.backyards-demo.svc.cluster.local istio-system/istio-telemetry.istio-system.svc.cluster.local]         -
Adding the --apply switch will automatically apply the recommendations to the selected workload or namespace.
> backyards sidecar-proxy egress recommend --workload backyards-demo/payments-v1 --apply

INFO[0002] sidecar egress for backyards-demo/payments-v1 set successfully

Sidecar egress rules for backyards-demo/payments-v1

Sidecar                              Hosts                   Bind  Port  Capture Mode
backyards-demo/backyards-demo-yvsj2  [./notifications.backyards-demo.svc.cluster.local istio-system/istio-telemetry.istio-system.svc.cluster.local]      -
Using */* for the workload switch together with --apply will automatically restrict every Envoy proxy's outbound listeners. The proxy configuration feature will be available in the next Backyards release; we're working hard to push it out by the end of the year. If you can't wait, join our backyards channel on the Banzai Cloud community Slack and ask for a dev release to try it!

About Backyards

Banzai Cloud's Backyards (now Cisco Service Mesh Manager) is a multi and hybrid-cloud enabled service mesh platform for constructing modern applications. Built on Kubernetes and our Istio operator, it gives you flexibility, portability, and consistency across on-premise datacenters and cloud environments. Use our simple, yet extremely powerful UI and CLI, and experience automated canary releases, traffic shifting, routing, secure service communication, in-depth observability and more, for yourself.

About Banzai Cloud

Banzai Cloud is changing how private clouds are built: simplifying the development, deployment, and scaling of complex applications, and putting the power of Kubernetes and Cloud Native technologies in the hands of developers and enterprises, everywhere. #multicloud #hybridcloud #BanzaiCloud