On Azure Platform Automation and DevOps

By now, most of the major cloud providers have published a significant amount of documentation centered on “cloud adoption frameworks” or similar. Companies moving to the cloud, for real or perceived benefits, are still at various stages of their journey. Most, primarily due to the size of their environments, take a fairly heavy-handed project management approach to this journey, which ultimately leads to many problems. The first, and likely most obvious, is that they are forced to make design decisions early, before they fully understand the impact. These design decisions are then crystallized into policy and act as forced constraints on the system. The second is that they assume cloud adoption is a journey with a destination, rather than realizing early on that the journey IS the destination. The goal of this blog post is to talk a bit about the latter.

The sheer pace at which the cloud changes is hard to imagine at the outset of your cloud journey, and typically hits you midway through your first implementation. It is important to consider that cloud providers, such as Azure, have hundreds of services, each with their own dedicated development teams (yes, that is plural). These teams are hyper-focused on delivering “customer value” through features and enhancements to their respective services. It is almost the norm that, during large project implementation cycles, you will have to go back and revisit your architecture more than once. And this isn’t limited to infrastructure. For example, Azure Databricks just recently released their new runtime version 7.0, equipped with shiny new Spark 3.0. That likely changes your development goals and introduces breaking changes, in exchange for resolved bugs and better performance.

Traditional approaches simply do not cut it. IT teams, particularly ones with little cloud operations experience, struggle to keep pace with the cadence of change. They are used to a much slower, more controlled change experience. So what do they typically do? They fall back on tried-and-true methods from a traditional IT perspective. Leaning on the forced constraints of corporate policy, they attempt to control what can and can’t be done using a “protect” type approach. Effectively, this blocks users from using the features and services they want to use.

So, what is the solution here? I think that this is the core of the problem that DevOps was (is?) ultimately trying to solve. Keeping in mind that DevOps is more of a culture shift than a set of technology tools or practices, it is an important one that companies need to embrace as they adopt the cloud. Buried deep in the Azure Cloud Adoption Framework documentation, there is a critical design section focused on Azure Platform Automation and DevOps.

When I think about taking a DevOps approach to the cloud, I think about focusing on the changeability of the overall solution. That is to say, how well does the architecture/process support and respond to changes? The OACA CMM Section 17 discusses DevOps, and focuses on the key question:

“How is your cloud architecture defined to support DevOps?”

So, what are some core recommendations on how companies should move forward?

Establish a cloud DevOps team

I think it is essential to create and maintain a cloud DevOps team. True to its form, it should be a cross-functional team consisting of application, infrastructure, operations, and security personnel. The important point here is that a large percentage of the team’s focus should be forward-looking. That is to say, analyzing product roadmaps, testing new features in preview for fit, and building/maintaining automation to allow for broader organizational use. The Google SRE book talks a lot about this, and how organizations need to respond when DevOps teams are overloaded with support work.

Adopt a guardrail type approach to cloud solution development

At the speed that the cloud changes, it is almost impossible to analyze every service in exhaustive detail to understand all the security and compliance impacts. Companies need to adopt a guardrail type approach to solution design. This looks something like:

  • Defining reasonable guidelines for a base security policy
  • Making use of cloud-native automation to enforce this policy at scale
  • Adopting a detect/respond type approach to edge cases
  • Providing “security checklists” and/or security training to relevant parties
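
To make the guardrail idea a bit more concrete, here is a minimal sketch in Python. It is not tied to any real cloud API: the `Resource` and `Guardrail` types, the rule names, and the settings keys are all hypothetical and exist only for illustration. The point is the split between rules that are enforced outright and rules that are only detected and flagged for follow-up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Resource:
    """Hypothetical description of a cloud resource someone wants to deploy."""
    name: str
    kind: str
    settings: Dict = field(default_factory=dict)


@dataclass
class Guardrail:
    """Hypothetical guardrail: a named check plus how to react when it fails."""
    name: str
    applies_to: str
    check: Callable[[Resource], bool]  # returns True when the resource is compliant
    enforce: bool                      # True: block the deployment; False: flag it only


def evaluate(resource: Resource, guardrails: List[Guardrail]) -> Tuple[List[str], List[str]]:
    """Return (denied, flagged) guardrail names for a single resource."""
    denied: List[str] = []
    flagged: List[str] = []
    for rail in guardrails:
        # Skip rules for other resource kinds and rules the resource already passes.
        if rail.applies_to != resource.kind or rail.check(resource):
            continue
        (denied if rail.enforce else flagged).append(rail.name)
    return denied, flagged


# Assumed example rules: one hard baseline, one detect/respond edge case.
GUARDRAILS = [
    Guardrail("storage-https-only", "storage",
              lambda r: r.settings.get("https_only", False), enforce=True),
    Guardrail("storage-has-owner-tag", "storage",
              lambda r: "owner" in r.settings.get("tags", {}), enforce=False),
]

if __name__ == "__main__":
    candidate = Resource("teamdata", "storage", {"https_only": True, "tags": {}})
    denied, flagged = evaluate(candidate, GUARDRAILS)
    print("denied:", denied)
    print("flagged:", flagged)
```

In Azure, this role would typically be played by Azure Policy, whose deny and audit effects map roughly onto the enforce and flag behavior sketched here.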

I hear the security purists taking issue with this approach. What if the guardrails are weak? What if something is missed? The answer I usually respond with is, what is the alternative? Shadow IT (powered by the cloud) is likely a far worse spot to be in. Give a little to gain a lot.

Define central and federated responsibilities

In order to keep up with demand, education and responsibility need to be pushed out (where appropriate) to other teams in the organization. Defining appropriate boundaries helps to ensure security baselines are met while allowing for flexibility in solution design. Consider DevOps and/or Security champions embedded inside project teams. They can focus on the application-specific components, bringing the learning back to the centralized teams.

Automation first

Make use of continuous integration and continuous delivery mechanisms at the outset of your projects. It should really be sprint 0. Focus on automating not only the infrastructure, but also the application, security, and the policy associated with it. Get comfortable with releasing changes on a regular cadence, working with preview features, and working through promotions over the entire lifecycle (dev/test/prod).

The more comfortable you are with the change process, the better you will be at doing it. Centralize on a set of tools that make sense for your organization (cloud-native, third-party such as Terraform, CNAB providers, etc.) and dive in. Consider taking a zero-downtime approach to deployments, which eases the constraints on when deployments can be done.
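
As a rough illustration of what promotion through the lifecycle looks like when it is automated, here is a small Python sketch. The `promote` function and the `fake_deploy` stand-in are hypothetical; in practice the deploy step would be whatever your pipeline tooling does for a given stage. The idea is simply that a release moves through dev, test, and prod in order and stops at the first stage that fails.

```python
from typing import Callable, List, Sequence

# Stage names taken from the dev/test/prod lifecycle mentioned above.
STAGES = ("dev", "test", "prod")


def promote(version: str,
            deploy: Callable[[str, str], bool],
            stages: Sequence[str] = STAGES) -> List[str]:
    """Deploy a version to each stage in order, stopping at the first failure.

    Returns the list of stages that were deployed successfully.
    """
    reached: List[str] = []
    for stage in stages:
        if not deploy(version, stage):
            break  # do not promote past a failed stage
        reached.append(stage)
    return reached


def fake_deploy(version: str, stage: str) -> bool:
    """Hypothetical deploy step: pretend version 1.1.0 fails in the test stage."""
    return not (version == "1.1.0" and stage == "test")


if __name__ == "__main__":
    print(promote("1.0.0", fake_deploy))  # reaches every stage
    print(promote("1.1.0", fake_deploy))  # stops before test is recorded
```

Because the same function runs for every release, a failed promotion is a routine event rather than an emergency, which is most of what getting comfortable with change means in practice.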

Provide fast feedback

Design mechanisms to disseminate information and provide fast feedback to teams when they make mistakes. Treat those mistakes as learning opportunities.

Conclusion

As Mr. Robot says, control is the real illusion. This is definitely a true statement in the cloud. Building and designing your cloud solutions/processes to embrace change, from the ground up, is the only way to keep up. It should be step one in any cloud journey.

 
