Infrastructure as Config

Infrastructure as Code

This is one of those buzz terms that is all the rage with the advent of public clouds, but the idea has been in practice for a while in the VMware world (PowerCLI). It is the ability to programmatically provision resources using templates, commands, loops, and conditionals. The term covers both template-based deployments and pure code, but it is most often used for the simplified template deployment frameworks. This article outlines where those template-based frameworks generally fail when trying to act as “code”.

Infrastructure as Config

I have started referring to these template-based deployment frameworks as Infrastructure as Config. They fit the mold of being good at static configuration only. They are best at defining the current-state configuration of your resources.

They work great if you are fine with managing separate templates for each of your deployed resources. In order for this to scale in a larger shop, you will need to find ways to create re-usable templates that have conditional configuration blocks and resources.

Conditionals

Conditionals are the building blocks for making something flexible and re-usable. If something is true, create this resource, if not, ignore and move on.
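As a minimal sketch of what that looks like in ordinary code (Python here; the build_resources helper and the resource names are made up for illustration and do not belong to any framework):

```python
def build_resources(create_public_ip):
    """Return the resources to deploy; the public IP is optional."""
    resources = [{"type": "vm", "name": "web-01"}]
    # If the condition is true, create the resource; if not, move on.
    if create_public_ip:
        resources.append({"type": "public_ip", "name": "web-01-ip"})
    return resources


print([r["type"] for r in build_resources(True)])
print([r["type"] for r in build_resources(False)])
```

The point of comparison for the rest of this article is how much ceremony each template framework needs to express that one if statement.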

With the main cloud template frameworks, if you try to implement a lot of re-use using conditionals you quickly run into their limitations.

Developers live and breathe conditionals. They are the glue that provides flexibility to a program. Software could not exist without them. Computers could not exist without them. Microprocessors are made up of transistors or logic gates, which are extremely simplified binary conditionals.

Does Template Based Infrastructure as Code Exist?

Terraform is close, and much closer now that version 0.12 has been released. Even so, the lack of dependency support between modules breaks a major requirement for re-usability.

Google Cloud Deployment Manager looks to be the perfect example of Infrastructure as Code. I haven’t had an opportunity to really put it through its paces to find all of its limitations. Too bad it only works for Google’s cloud and not the rest.

Azure Resource Manager (ARM) Templates

ARM templates are based on JSON and follow a schema that defines what can be included in the template. This strict adherence to the JSON structure and the schema is nice for maintaining structure and order, but it limits how far Microsoft can expand the functionality that would allow the templates to work as “code”. Only recently have they strayed from that strict adherence and added support for JSON with comments. JSON is not very readable, and dealing with the brackets and commas can be infuriating. I would say it is a step back from XML as a human-readable data format.

Pros

  • Parameters can reference KeyVault values which allows you to protect your secrets
  • Supports new functionality quickly
  • Best resource versioning system
  • You can export templates from existing resources

Cons

  • Does not check existing resources and their settings (This can make deployments risky.)
  • Can’t loop on an empty list (If you want to allow for an optional list of data disks when creating a VM, that list will always need to have at least one value.)
  • All JSON is validated before conditions are checked (Resources that will not be created will still need to be valid, especially the name.)
  • Dependencies aren’t assumed based on used references or resource nesting (If you create an extension as a sub-resource nested within a VM, you still need to mark it as dependent on the VM.)
  • Dependency hell when using conditional resources (When the dependsOn property is specified, it can’t be empty. Specifying an optional resource’s dependency is not possible.)
  • Deployed resources are not managed (You can’t rollback a deployment and you will have to manually delete your resources.)
  • No way to expand the functionality using custom code (Other template frameworks allow for expanding functionality. User-defined functions are useless.)
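Several of these cons go away when the template is generated by code instead of being written by hand. The following is a minimal sketch in Python under a few assumptions: vm_resource is a hypothetical helper, the VM resource is heavily trimmed and would not deploy as-is, and the apiVersion is only an example value.

```python
import json


def vm_resource(name, data_disk_sizes_gb=(), depends_on=()):
    """Build a simplified ARM VM resource as a plain dict."""
    resource = {
        "type": "Microsoft.Compute/virtualMachines",
        "apiVersion": "2019-07-01",  # example value only
        "name": name,
        "location": "[resourceGroup().location]",
        "properties": {"storageProfile": {}},
    }
    # An empty list is harmless here: the loop simply produces no disks.
    disks = [
        {"lun": lun, "diskSizeGB": size, "createOption": "Empty"}
        for lun, size in enumerate(data_disk_sizes_gb)
    ]
    if disks:
        resource["properties"]["storageProfile"]["dataDisks"] = disks
    # Only emit dependsOn when there is something to depend on.
    if depends_on:
        resource["dependsOn"] = list(depends_on)
    return resource


print(json.dumps(vm_resource("vm-no-disks"), indent=2))
print(json.dumps(vm_resource("vm-two-disks", [32, 64], ["nic-01"]), indent=2))
```

The optional data disk list and the optional dependency are both handled by leaving the property out entirely, which is exactly what the template language itself makes awkward.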

Some recently introduced resource types, such as Blueprints, cannot be deployed using ARM templates. You need to interface directly with the REST API to create them. This has caused me to question Microsoft’s future plans for the ARM template deployment framework.

Nested Templates

Nested ARM templates have their use when deploying a larger collection of resources.

Nested deployments can only create resources in up to 5 resource groups. Most use cases will not run into this limitation, but for large nested deployments it can be a problem.

Linked Templates

Linked templates allow you to reference another ARM template from within your template. These referenced templates need to exist in a Storage Account.

AWS CloudFormation

This is the most limited template framework available, but its biggest issue is with how quickly it supports new functionality released for AWS. If you want to make use of the latest and greatest functionality in your CloudFormation templates, you will be waiting for 3 - 12 months after the feature’s release for support to be added to CloudFormation. Being proactive and submitting a feature request ticket seems to help speed things along.

Pros

  • JSON and YAML are supported
  • Can reference exported outputs from existing stacks
  • Proposed changes are partially audited against existing resources
  • Deployments are managed and can be updated or rolled back
  • Can expand the functionality using custom resources and macros

Cons

  • Very limited set of functions (Most things you would want to do can’t be done.)
  • Limited conditional syntax (Conditions are awful in CloudFormation. You have to create a condition before you can test for a condition. A resource can only rely on one condition. If you need more, you will need to create a condition that checks multiple conditions. The saving grace is that you can do inline condition checks.)
  • Slow support for new and old functionality (Support for resource tags and descriptions is still missing for many resource types.)
  • Referencing exported stack outputs causes stack dependency locking issues (Stacks whose exported outputs are referenced can’t be fully updated without removing all dependent stacks.)
  • Functions are processed before AWS Systems Manager (SSM) parameter references (If you store a list of subnets in an SSM parameter, you will not be able to split that list within a template. The Split function will be performed before the resolve SSM parameter reference is replaced with the parameter value.)
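To make the condition syntax concrete, here is a small sketch of a template built as a Python dict and printed as JSON. The parameter, condition, and resource names are invented for the example; the Conditions section, the Condition attribute, and the Fn::Equals, Fn::And, and Fn::If functions are standard CloudFormation.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        "Environment": {"Type": "String", "Default": "dev"},
        "EnableLogs": {"Type": "String", "Default": "false"},
    },
    # Conditions must be declared here before anything can test them.
    "Conditions": {
        "IsProd": {"Fn::Equals": [{"Ref": "Environment"}, "prod"]},
        "LogsOn": {"Fn::Equals": [{"Ref": "EnableLogs"}, "true"]},
        # A resource takes a single Condition, so two checks need a third.
        "ProdWithLogs": {
            "Fn::And": [{"Condition": "IsProd"}, {"Condition": "LogsOn"}]
        },
    },
    "Resources": {
        "LogBucket": {
            "Type": "AWS::S3::Bucket",
            "Condition": "ProdWithLogs",
        },
        "DataBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                # Inline check: versioning is enabled only in production.
                "VersioningConfiguration": {
                    "Status": {"Fn::If": ["IsProd", "Enabled", "Suspended"]}
                }
            },
        },
    },
}

print(json.dumps(template, indent=2))
```

Three named conditions are needed to express what would be a single "if prod and logs" line in a programming language.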

CloudFormation is good at deployment management. The state of your deployment or stack is stored within your account allowing controlled updates, rollbacks, and cross-referencing existing stacks. This is its strongest feature.

In order to make CloudFormation templates manageable, I have given in and started using custom resources and macros for adding advanced functionality. These are Lambda functions that are triggered by CloudFormation at deployment time. Custom resources allow a Lambda function to handle all steps of the resource’s creation/rollback. Macros are Lambda functions that transform a template before deployment. These are advanced and poorly documented features of CloudFormation and not at all friendly to non-developers.
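For a sense of what a macro involves, the sketch below shows a Lambda handler that expands a made-up Count key on a resource into numbered copies. It assumes the macro is referenced from the template's top-level Transform section, so the fragment it receives is the whole template. The Count key and the naming scheme are illustrative only; the event fields used (requestId, fragment) and the response shape follow the macro interface.

```python
import copy


def handler(event, context):
    """Lambda entry point for a CloudFormation macro (sketch)."""
    fragment = event["fragment"]
    expanded = {}
    for logical_id, resource in fragment.get("Resources", {}).items():
        # "Count" is a made-up key that this macro consumes and removes.
        count = int(resource.pop("Count", 1))
        if count == 1:
            expanded[logical_id] = resource
            continue
        # A count of 0 drops the resource; higher counts create copies.
        for index in range(1, count + 1):
            expanded[f"{logical_id}{index}"] = copy.deepcopy(resource)
    fragment["Resources"] = expanded
    # CloudFormation expects the request ID echoed back with the result.
    return {
        "requestId": event["requestId"],
        "status": "success",
        "fragment": fragment,
    }
```

Because the handler is a plain function, it can be exercised locally with a hand-built event before it is ever deployed to Lambda.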

There are other hurdles with these methods though. The AWS SDKs available in the Lambda back-end are generally outdated by 3 - 9 months.

Nested Templates

These operate similarly to ARM’s linked templates in that they allow you to reference another CloudFormation template from within your parent template. These referenced templates need to exist in an S3 bucket and can’t be local files.

Stack Sets

This feature is only useful if you are deploying identical resources across multiple regions or AWS accounts, which mostly benefits large, global organizations.

Terraform

Terraform uses a very programmatic syntax while maintaining readability. A lot of its programming features are not very elegantly implemented. They appear to be afterthoughts that were added in a way that does not require any structural changes. For example, to conditionally skip creating a resource, you add an inline condition check to its count property. Not very obvious.

Pros

  • Supports multiple public and private clouds
  • Can reference existing resources
  • Can preview proposed changes that are compared against existing resources
  • Can expand the functionality using external scripts
  • Deployments are managed and can be updated or rolled back

Cons

  • No dependencies between modules (You can’t have your VM module dependent on your network module.)
  • Conditions and loops are ugly (You can now make a block of properties within a resource conditional by using dynamic blocks. First, a condition in the locals section creates a list variable that is either empty or holds a single element. You then pass that list into the for_each property of the dynamic block. Not at all self-explanatory. Even the Terraform documentation recommends using dynamic blocks sparingly.)
  • Support for new functionality is slower (Lesser known resource types or properties can take a while to get added.)
  • No support for secret values (Even if you source a secret securely and keep it out of the template, it will still end up as clear text in the state file.)
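For readers who have not met this idiom, both the count trick and the dynamic block trick come down to turning a boolean into a collection with zero or one elements and then iterating over it. A rough Python analogy follows; it is not Terraform syntax, and every name in it is illustrative.

```python
def render_vm(enable_boot_diagnostics):
    # The "locals" step: a condition becomes an empty or one-element list.
    diagnostics_toggle = ["enabled"] if enable_boot_diagnostics else []

    vm = {"name": "web-01", "size": "small"}
    # The "for_each" step: the block is emitted once per list element,
    # so an empty list means the block is never emitted at all.
    for _ in diagnostics_toggle:
        vm["boot_diagnostics"] = {"storage_uri": "https://example.invalid/diag"}
    return vm


print(render_vm(True))
print(render_vm(False))
```

Written this way the indirection is easy to see: a loop is standing in for what is really an if statement.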

Some of the issues I had with Terraform were solved with the release of v0.12. My biggest gripe with Terraform is still the insecure state file. The remote state file option is a workaround, but is still likely to be poorly implemented. It really needs to support having encrypted values in the state file or references to protected values stored in KeyVaults (Azure) or SSM parameters (AWS).

The v0.12 release was a re-write, as far as they were concerned. I feel it doesn’t fully qualify as such as they were not willing to break backwards compatibility. I really feel they need to treat the current 0.x code-base as a test run and do a full re-think and re-write from the ground up for the 1.0 release.

Terraform Cloud/Enterprise

This is a publicly or privately hosted management solution for your Terraform templates and state files. It also comes with a graphical interface for creating a deployment template, containing only modules, called Configuration Designer.

Modules

Modules allow for the re-use of a collection of variables and templates, and they can be local or live on external cloud storage or in a code repository. They have a major limitation: you can’t have them depend on other modules. Each module you deploy will need to be self-contained or you will need to deploy them separately, defeating the purpose of using modules.

Google Cloud Deployment Manager

Google Cloud’s Deployment Manager is the only framework that I would label as Infrastructure as Code. It is based on Python 3.x and you can create your templates in pure Python or Jinja2. A Jinja2 template is YAML with support for inline scripting. This gives you the flexibility of code with the readability of YAML.
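A minimal sketch of a pure Python template is shown below. Deployment Manager calls a generate_config(context) function and reads properties from context.properties and the deployment name from context.env. The createBackupDisk property and the FakeContext class are assumptions added here so the example can run locally; they are not part of Deployment Manager.

```python
def generate_config(context):
    """Entry point that Deployment Manager calls for a Python template."""
    props = context.properties
    name = context.env["name"]
    resources = [{
        "name": name + "-disk",
        "type": "compute.v1.disk",
        "properties": {"zone": props["zone"], "sizeGb": props["sizeGb"]},
    }]
    # A plain Python conditional decides whether a second resource exists.
    if props.get("createBackupDisk", False):
        resources.append({
            "name": name + "-backup-disk",
            "type": "compute.v1.disk",
            "properties": {"zone": props["zone"], "sizeGb": props["sizeGb"]},
        })
    return {"resources": resources}


class FakeContext:
    """Local stand-in for the context object, for testing only."""

    def __init__(self, name, properties):
        self.env = {"name": name}
        self.properties = properties


demo = FakeContext("demo", {"zone": "us-central1-a", "sizeGb": 10,
                            "createBackupDisk": True})
print(generate_config(demo))
```

The conditional resource is an ordinary if statement, with no separate condition declarations or zero-length list tricks.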

Pros

  • Functionality can be easily expanded
  • Deployments are managed and can be updated or rolled back
  • Imported templates can exist locally
  • Can create additional resource types using a type provider

Cons

  • Limited supported base resource types (It is hard to say how many are missing as I can’t find an official list of GCP resource types.)
  • Documentation not very non-developer friendly (Someone from the traditional IT infrastructure world would be lost.)
  • Not all documentation includes Jinja2/YAML examples (The documentation seems to follow a Python first, YAML sometimes approach.)
  • Poorly designed resource versioning (The version of the resource type is included in the naming. A history of versions is not available and not all resource types have a stable release.)
  • Some resource types are stuck in a beta state for an extended period (Some resource types, such as sqladmin, have been in beta for the last 2 years without a stable release. GCP seems to follow Google’s constant beta approach. Hopefully GCP doesn’t end up in Google’s extensive cancelled product graveyard.)

Imported Templates

You can import other templates from within your template. These referenced templates are pulled from the local folder.

Alternatives

PowerShell Generated ARM Templates

In Azure, to properly support conditionals and save yourself from the dependency hell, you will need to create PowerShell scripts to deploy your ARM templates. Once you get to this point, you will start to wonder why you are even using ARM templates, as there is limited deployment management with ARM deployments. You might as well create resources with pure PowerShell and the Az module.

AWS Cloud Development Kit (CDK)

This is a set of tools for developing Python code to generate CloudFormation templates. It is very similar to Troposphere. Amazon must have come to the same conclusion: CloudFormation isn’t truly going to pass as Infrastructure as Code. The setup of the CDK requires a bit of work, and the CDK is still hamstrung by the limitations of CloudFormation. Since this is pure Python and the setup is very foreign to non-developers, you need to be a developer to manage and understand it. This will be a roadblock to adoption by the traditional infrastructure crowd.

Software Development Kit (SDK) Deployments

This applies to all clouds that provide an SDK in your favourite language. This is the most flexible and efficient form of Infrastructure as Code. The downside is that you lose the tracking of deployments and the templates defining how resources were created, both of which may be required by managers or auditors.

REST API Deployments

All public clouds have an Application Programming Interface (API) that usually follows the REST communication style. For those who want to be on the cutting edge and have a lot of time on their hands, this is for you. With this method you will be recreating the same calls to the HTTPS REST APIs that the SDK would perform. You will not be limited by the release schedule of the SDK when adding support for new functionality. The documentation on how to properly interface with the APIs is sometimes incomplete or cryptic. If you ever need support, you will generally be directed to use the SDK instead.
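As a rough illustration of what such a call looks like, the sketch below builds (but does not send) the request that creates or updates an Azure resource group using only the Python standard library. The subscription ID, group name, location, and token are placeholders, the api-version is an example value, and obtaining a real bearer token is left out entirely.

```python
import json
import urllib.request


def build_resource_group_request(subscription_id, group_name, location, token):
    """Build a PUT request that creates or updates a resource group."""
    url = (
        "https://management.azure.com/subscriptions/"
        f"{subscription_id}/resourcegroups/{group_name}"
        "?api-version=2021-04-01"  # example value; check the API reference
    )
    body = json.dumps({"location": location}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_resource_group_request(
    "00000000-0000-0000-0000-000000000000", "demo-rg", "canadacentral", "<token>"
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Everything the SDK normally hides (authentication, API versions, retries, and polling of long-running operations) becomes your responsibility with this approach.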



 

About James MacGowan

James started out as a web developer with an interest in hardware and open sourced software development. He made the switch to IT infrastructure and spent many years with server virtualization, networking, storage, and domain management.
After exhausting all challenges and learning opportunities provided by traditional IT infrastructure and a desire to fully utilize his developer background, he made the switch to cloud computing.
For the last 3 years he has been dedicated to automating and providing secure cloud solutions on AWS and Azure for our clients.
