Best of 2021 – When Is Service Mesh Worth It?

As we close out 2021, we at Container Journal wanted to highlight the most popular articles of the year. Following is the eleventh in our series of the Best of 2021.
Service mesh is getting a lot of interest these days, especially as new meshes enter the market to join Istio, Linkerd and Kuma as established open source options. The technology brings a common networking, policy and observability layer to microservices architectures. Due to its significant overhead, though, the perception is that it’s only relevant for very large-scale enterprise deployments with many teams. But is that truly the case? When is adopting service mesh actually worth it?
I recently met with Zack Butcher, founding engineer at Tetrate and one of the original Istio builders at Google, to learn what organizational size is best suited for service mesh. Though the technology’s advantages are more apparent at scale, Butcher believes that framing adoption as a question of organizational size is the wrong way to look at things.
To Butcher, the operational gains, such as company-wide security enforcement or replacing traditional API gateways, can recoup the upfront investment. Paired with ongoing usability improvements, service mesh is primed to benefit many more organizations, regardless of their size.
“Service mesh can be a force multiplier for large efforts,” says Butcher. Economically, it’s easier to justify for larger teams; yet, even for a smaller group, Butcher believes it can streamline operations such as metering, authorization, authentication and encryption in transit.
“I don’t believe that adopting service mesh is a function of scale,” Butcher says. “Rather, the gate for adopting a service mesh is operational expense.” Organizations should be looking at the cost of an alternative approach, not the scale of their company. This boils down to what the organization is attempting to achieve, and many of these points are independent of size, says Butcher.
For example, service mesh can enforce the encryption and security policies that are baseline requirements for most organizations. Also, in Envoy-based meshes, Envoy plugins can expose services externally straight from the mesh, avoiding investment in a separate API gateway.
Butcher shared how FICO uses service mesh to encrypt all data in transit and enforce that policy at the proxy level. FICO is, of course, a large company, but scale is not the driving factor; the value lies in the operational impact of reusable functionality with unified configuration. “You can have a central team take on and pay that cost, as opposed to spreading the cost across the organization,” he says.
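In Istio, for example, the kind of mesh-wide encryption Butcher describes can be turned on with a single policy that a central team owns. Below is a minimal sketch, assuming an Istio-based mesh with istio-system as the root namespace; it is illustrative rather than FICO’s actual configuration:

```yaml
# Minimal sketch: require mTLS for all sidecar-to-sidecar traffic, mesh-wide.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # the root namespace, so the policy applies mesh-wide
spec:
  mtls:
    mode: STRICT             # sidecars reject any plaintext traffic
```

One resource like this, applied once, stands in for per-service TLS plumbing, which is exactly the kind of cost centralization Butcher points to.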
Identity and access management (IAM) is another area where service mesh shines. Butcher describes how, at Google, implementing a centralized identity and access management solution for all cloud services was a significant ordeal involving many teams’ input. But after adopting sidecar proxies for each service, he and one other engineer were able to deploy IAM across every single object in one quarter.
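In Istio terms, that kind of service-to-service access control is expressed as an AuthorizationPolicy enforced by each sidecar. A minimal sketch follows; the namespace, workload and service account names are hypothetical:

```yaml
# Minimal sketch: only requests carrying the "orders" service identity
# (established via mTLS) may reach the "payments" workload.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: payments-allow-orders
  namespace: prod
spec:
  selector:
    matchLabels:
      app: payments                                       # enforced at this workload's sidecar
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/prod/sa/orders"]   # mesh identity of the caller
```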
Perhaps the most exciting effect is the impact on API strategy. When folks discuss service mesh and API gateways, they typically keep them in separate camps. However, Butcher sees a convergence. “We see API gateways as an inherent part of service mesh functionality.”
The old mindset of using an API gateway for north-south traffic and a service mesh for east-west traffic no longer holds, Butcher believes. “The problem is north-south and east-west doesn’t exist,” he adds.
As the number of services grows, the distinction between “external” and “internal” starts to disappear. Whether a service is public, private or partner-facing, it may encounter high traffic, it needs SLAs and it requires the same zero-trust security posture.
Instead of using a separate API gateway, API providers could place sidecars around the microservices they intend to expose and extend them with management features like rate limiting, identity control and request transformation. Operators then could apply different authentication policies on a per-service basis from the mesh control plane. Repurposing service mesh for external communications could thus enable smaller shops to expose API-as-a-Product.
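In an Istio-based mesh, for instance, that pattern looks roughly like the sketch below: the mesh’s own ingress gateway publishes one microservice externally, and per-service policies such as JWT validation or rate limiting can then be layered on from the same control plane. The hostname, namespace and service names are placeholders:

```yaml
# Minimal sketch: expose the "orders" service through the mesh's ingress
# instead of a separate API gateway.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: public-api
  namespace: prod
spec:
  selector:
    istio: ingressgateway                     # the mesh's bundled Envoy ingress
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: api-example-com-cert   # TLS secret, assumed to exist
    hosts:
    - "api.example.com"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: orders-api
  namespace: prod
spec:
  hosts:
  - "api.example.com"
  gateways:
  - public-api
  http:
  - match:
    - uri:
        prefix: /orders                        # only this path is published externally
    route:
    - destination:
        host: orders.prod.svc.cluster.local
        port:
          number: 8080
```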
Of course, the cost of running the mesh itself contributes to overhead, but an even higher impediment to adoption is usability. Getting over those usability hurdles can be a substantial initial cost for small shops, and Istio doesn’t have the best reputation in that department.
Butcher acknowledges that many of the components that became Istio originally resulted from decomposing a monolithic API gateway while he was working at Google. “We were building out Istio in a cave,” he says. “No one could use it, and when handled, they cut themselves on it.”
Since then, Istio maintainers have significantly improved how developers interact with the mesh, with out-of-the-box configurations, recipe books and improved documentation on Istio.io. Yet there is still more to do. Improving the developer experience is the last piece needed to meaningfully “chip away that cost of adoption,” says Butcher.
Another promising area to improve usability is WebAssembly. As I’ve covered in the past, much effort is centered around WebAssembly as an extensibility mechanism for Envoy. Butcher notes that a “pretty vibrant ecosystem [is] emerging based on WebAssembly.” Ready-made Envoy plugins could let organizations meet custom needs without writing their own extensions, further lowering the barrier to entry.
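In Istio, for example, recent releases let operators attach a prebuilt Wasm filter to selected workloads declaratively with a WasmPlugin resource. The sketch below assumes such a module is already published to a registry; the plugin image, its configuration and the workload label are hypothetical:

```yaml
# Minimal sketch: attach an off-the-shelf Wasm filter to the "orders" workload's sidecars.
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: header-tagger
  namespace: prod
spec:
  selector:
    matchLabels:
      app: orders
  url: oci://ghcr.io/example/header-tagger:0.1   # hypothetical plugin image
  phase: AUTHN                                   # run before Istio's authentication filters
  pluginConfig:                                  # passed to the module as its configuration
    header: x-request-tier
    value: external
```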
Service mesh brings a unified configuration to an inherently disparate architecture. And as microservices rise in use, even small organizations may feel the need to adopt a central mechanism for universal control over their ingress and egress policies.
If operational returns and time savings outweigh the upfront effort and ongoing maintenance cost, service mesh could be highly valuable, and not only relegated to enterprise use cases.
Naturally, further impact analysis will be needed to determine a break-even point. As with other cloud-native technologies, cloud economists will need to weigh service mesh in the ongoing effort to rein in rising cloud costs.
According to Butcher, “API gateways are an inherent part of service mesh.” Thus, he predicts an incoming surge of more API-centric mesh configurations. “It’s the last domino to fall for service mesh to really break through,” he says.
Bill Doerrfeld is a tech journalist and analyst. His beat is cloud technologies, specifically the web API economy. He began researching APIs as an Associate Editor at ProgrammableWeb, and since 2015 has been the Editor at Nordic APIs, a high-impact blog on API strategy for providers. He loves discovering new trends, interviewing key contributors, and researching new technology. He also gets out into the world to speak occasionally.
