Navigating the Service Meshes Map
A “service mesh” is an infrastructure layer that regulates the interactions between applications or microservices. Rather than introducing fundamentally new features, it repackages functionalities such as request-level load balancing, circuit-breaking, retries, and instrumentation. When developing cloud-native or hybrid applications, DevOps teams increasingly rely on service meshes to abstract network functions out of the application code. Born as a facilitator for orchestration in the wake of Kubernetes and other container technologies, service meshes are rapidly becoming an indispensable tool for containerized deployments. They let DevOps teams focus on building value-adding services in distributed architectures that are ready to scale, with built-in predictability and consistency across platforms. From a security perspective, service meshes are instrumental in enforcing compliance and best practices, which lightens the SOC team’s workload and improves resilience while simplifying vulnerability identification and remediation.
The ever-increasing adoption of public cloud services has created a new set of complexities stemming from the cloud architectural paradigm. Cloud-native applications consist of a collection of interconnected microservices in constant communication. The far greater number of endpoints and interactions to monitor, secure, and scale created a debugging bottleneck and a new set of security vulnerabilities. Service meshes emerged to address these issues.
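As a rough illustration of what the circuit-breaking and retries mentioned above involve, the following Python sketch shows the kind of logic a service mesh takes over from application code. It is not taken from any particular mesh: the class and function names, the failure threshold, and the reset timeout are all hypothetical choices made for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and a call is rejected without being attempted."""


class CircuitBreaker:
    """Minimal circuit breaker (hypothetical sketch, not a real mesh API)."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable so the sketch is testable
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()       # open (or re-open) the circuit
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def call_with_retries(breaker, func, attempts=3):
    """Retry a failing call, but stop immediately once the breaker opens."""
    last_error = None
    for _ in range(attempts):
        try:
            return breaker.call(func)
        except CircuitOpenError:
            raise                                   # retrying an open circuit is pointless
        except Exception as exc:
            last_error = exc
    raise last_error
```

Without a mesh, every service would need to carry logic of this kind in its own code and language. With a mesh, the equivalent behavior is typically configured once at the infrastructure layer and applied uniformly to all services.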
What Do Service Meshes Do?
When migrating from a monolithic architecture to a hybrid or cloud-native one, DevOps teams need a methodology that manages communication between microservices and that safeguards and monitors the drastically larger number of endpoints, without compromising scalability or inflating debugging time and resource requirements. Service meshes are designed to address these issues. They streamline traffic management, for example by eliminating the need for gateway updates when microservices are added, and they reduce complexity by moving common infrastructure-related functionality into a separate layer. These features make them nearly indispensable for cloud-native and hybrid application development. Currently, the most popular service mesh capabilities are:
- Traffic management – Connecting and controlling the traffic flow and API calls between services
- Security – Enforcing authentication to secure bi-directional traffic between client and server
- Access control – Applying and enforcing policies and resource distribution
- Observability – Inferring the system’s internal states from external outputs
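To make the observability item more concrete, here is a minimal Python sketch of inferring a service's state from its external outputs, in this case from per-request records of the sort a mesh proxy might emit. The `RequestRecord` fields, the `summarize` function, and the choice of treating 5xx statuses as errors are assumptions made for illustration, not the telemetry format of any specific mesh.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One observed request (hypothetical record shape)."""
    service: str
    status: int         # HTTP status code
    latency_ms: float   # time taken to serve the request


def summarize(records, window_seconds):
    """Derive traffic, error rate, and p95 latency for one observation window."""
    count = len(records)
    if count == 0:
        return {"requests_per_second": 0.0, "error_rate": 0.0, "p95_latency_ms": None}

    # Assumption: server-side failures are the 5xx status codes.
    errors = sum(1 for record in records if record.status >= 500)
    latencies = sorted(record.latency_ms for record in records)

    # Nearest-rank 95th percentile, using integer arithmetic for ceil(0.95 * count).
    rank = (95 * count + 99) // 100

    return {
        "requests_per_second": count / window_seconds,
        "error_rate": errors / count,
        "p95_latency_ms": latencies[rank - 1],
    }
```

The application itself is never modified or queried here. Everything is derived from traffic the mesh already sees, which is what allows a mesh to offer this kind of telemetry without changes to service code.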
Depending on the application’s development specifications, the DevOps team needs to select the service mesh that best matches its business and technical requirements. Istio was among the first service meshes on the market and remains one of the best known. There are other key players, though, and it is worth comparing their architectures and weighing the pros and cons of, for example, Consul vs Istio, Linkerd vs Istio, and Linkerd vs Consul.
As service meshes are less than a decade old, only a small number of options exist, and their fundamental concepts overlap significantly. Each one emphasizes a different angle, however, and they vary in interoperability and in pricing, which ranges from entirely free to premium.
The leading solutions today are:
- Istio: A fully open-source solution founded by Google, IBM, and Lyft
- App Mesh: Exclusive to AWS
- Linkerd: Created by Buoyant, a company founded by former Twitter engineers; it is open-source and was donated to the CNCF in 2017
- Consul Connect: Open-source with a premium paid service
- SMI (Microsoft Service Mesh Interface): Announced at KubeCon in 2019, it is backed by heavy players such as Linkerd, HashiCorp, Solo.io, and VMware
- Kong: An open-source service mesh named Kuma, announced in September 2019
When shopping for a service mesh solution, defining a clear set of priorities before the initial exploratory survey helps streamline the process. Priorities to consider include:
- Managed or self-managed: Deploying Kubernetes clusters with a managed service is easy but comes at the cost of losing some control over the cluster’s control plane. Choosing between the two requires weighing the IT management cost against the benefits of added flexibility
- Full, partial open-source or proprietary: Open-source platforms are typically more flexible but might be harder to operate, whereas proprietary ones have more limits and are not free. There is no one-size-fits-all answer, so the optimal option for a specific project depends on factors such as cost, the need for flexibility, the availability of IT resources, and more
- Multi-cluster expansion: Larger projects might require multi-cluster expansion, and smaller ones might need it to scale. When selecting a service mesh, it is always good practice to analyze its multi-cluster expansion capabilities
- Level of automation: Automation saves time and can also tighten security. Different projects require different types of automation, so checking which automation options a service mesh solution includes should be part of the selection process
- Level of built-in security functionalities: Kubernetes’ built-in security is limited, and tightening it requires additional measures. Service mesh solutions typically provide some security functionalities that address different priorities
- Type and extent of authentication: Authentication is a critical element of security. A project’s type, complexity, and scope dictate the authentication features required
- Observability: Critical for keeping a comprehensive view of service health and performance, observability relies on telemetry data to monitor latency, traffic, errors, and saturation. The choice between built-in observability, compatibility with external observability solutions, and in-house observability configuration is dictated by the project’s priorities and should be taken into account when selecting a service mesh solution
- Interoperability: As the popularity of service meshes grows and new services emerge, interoperability becomes increasingly critical to enable the interconnection of multiple workloads. Service mesh solutions offer varying degrees of interoperability, which should be factored in when selecting a provider
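One possible way to turn the priorities above into a comparison is a simple weighted scoring matrix, sketched below in Python. The function name, the criteria keys, the weights, and the candidate names `mesh_a` and `mesh_b` are all placeholders invented for this example; the scores are not assessments of any real product and would need to come from a team's own evaluation.

```python
def rank_candidates(weights, scores):
    """Rank candidates by weighted average score, highest first.

    weights: criterion -> importance (any positive number)
    scores:  candidate -> {criterion -> score}, e.g. on a 1 to 5 scale
    """
    total_weight = sum(weights.values())
    ranked = []
    for name, candidate_scores in scores.items():
        missing = set(weights) - set(candidate_scores)
        if missing:
            raise ValueError(f"{name} has no score for: {sorted(missing)}")
        weighted = sum(weights[c] * candidate_scores[c] for c in weights)
        ranked.append((name, round(weighted / total_weight, 2)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Placeholder priorities and scores, for illustration only.
weights = {"security": 3, "observability": 2, "multi_cluster": 1}
scores = {
    "mesh_a": {"security": 4, "observability": 2, "multi_cluster": 5},
    "mesh_b": {"security": 3, "observability": 5, "multi_cluster": 1},
}
ranking = rank_candidates(weights, scores)
```

A matrix like this does not replace a hands-on evaluation, but writing the weights down first forces the team to agree on its priorities before comparing products.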
To accelerate the selection process, this ebook contains an in-depth overview of each of these service mesh solutions, detailing their specific features, pros and cons, and providing a snapshot of their distinctive architecture.