The first article in this series covered how developers have to deal with more than just code in a cloud native world. It shared a look at cloud native observability (o11y) and touched on what the three pillars are versus the three phases of observability.
This second article takes you out onto the playing field where you need to understand who the players are and what teams they form. It’s no longer a world full of developers and operations teams as the cloud native environments have pushed right on through those traditional walls.
The basic introduction started with developers in a world without clouds and followed their transition to a cloud native development world. What does this mean for them, and what are some of the challenges they have to embrace?
The playing field
Over time, the traditional developer and operations teams transitioned to different ways of working in the cloud native world. Developers moved into DevOps teams, where operations activities merge with development and attempts are made at process agility. Operations teams, having tried DevOps, then moved to a more mature structure called CloudOps with a clear focus on cloud infrastructure. Finally, today we’re seeing a role emerge known as the Site Reliability Engineer (SRE), who is part of a team focused on a broader spectrum of modern resource reliability, not just the organization’s cloud infrastructure.
Let’s look at each one, shall we?
DevOps is a first step on the road to cloud native operations and bridges both development and operations teams. In this definition you see that they have a specific mandate.
“DevOps is primarily the automation and optimization of the application development lifecycle, including post-launch fixes and updates. It uses continuous development, integration, testing, and deployment of cloud, computer, and downloadable applications. It also focuses on IT operations as they relate to application performance and availability.”
By bringing operations and development closer together to focus on processes and automation, they are making the push for agility, reliability, and speed in support of business goals within their organization. DevOps remains focused on application development and delivery, often because the organization runs more than just cloud native infrastructure.
This definition puts CloudOps at the center of a business operational focus.
“…CloudOps provides organizations with proper (cloud) resource management. In an organization, CloudOps uses DevOps principles and IT operations applied to a cloud-based architecture to speed up the business processes.”
This is a shift towards operations focusing on the cloud native infrastructure more specifically than on the other infrastructures available in an organization. Once the dependency on past infrastructure choices has been reduced, these teams are scaled up to improve the development architecture (infrastructure in the cloud). They focus on simplifying cloud provisioning and application deployment to the cloud, and they are big users of observability platforms for both applications and infrastructure in the cloud.
Site reliability teams
Oscar Wilde once said, “With age comes wisdom, but sometimes age comes alone.” As organizations become more active in a cloud native world and scale up to full CloudOps teams alongside their DevOps teams, another role is emerging to fill a gap left behind. That role is the SRE, and it does not focus only on the cloud native infrastructure.
“Instead, an SRE is an all-purpose role that aims to manage reliability for any type of environment.”
SREs have to use both IT operations and development strategies to keep the focus on one thing, and one thing only: reliability. It’s a full-time job avoiding downtime and optimizing the performance of all applications and supporting infrastructure, whether or not they are in the cloud native world. Together with CloudOps teams, they are very active players in cloud native observability and the platforms used to assist them. They have a vested interest in cloud or multi-cloud security, costs, deployment automation, and all things that help observability at scale.
The observability game
So far, this series has taken you from the basic introduction through a tour of the o11y playing field, where you’ve met the players on the teams involved in cloud native o11y.
Next up, I want to dive deeper into the pillars of monitoring and why at scale you might want to start thinking about the phases of cloud native o11y instead.
Published on Java Code Geeks with permission by Eric Schabell, partner at our JCG program. See the original article here: O11y Guide – Who are the Cloud Native Observability Players?