GitOps repos: Five lessons on what to include and what to skip
This article provides a decision framework to help practitioners choose what to manage with GitOps and what to avoid. It is a companion to my previous article on structuring GitOps repositories.
With the advent of Infrastructure as Code (IaC) frameworks like Terraform and Pulumi and their ability to build entire environments from nothing but a well-funded cloud account, the scope of what you can manage with IaC-based approaches has virtually no limits.
Even companies running on-premises hardware can benefit from the extensive library of plugins listed in the Terraform and Pulumi registries. And if a plugin is not available, people can author new ones using toolkits like Terraform's plugin SDK and Pulumi's provider SDK.
Without technology barriers, it is easy to start bringing everything into GitOps repositories, only to wonder later whether some of that data would be better managed elsewhere.
I learned the lessons in the following sections the hard way while establishing our practices. I hope they can help make your decisions a little easier.
Lesson #1: Configuration formats matter
Part of the decision about which configuration to manage with GitOps is the choice of IaC framework. That choice should favor frameworks whose data model aligns with the abstractions of the system being managed.
As an example, I mentioned Terraform in the introduction. Its data structures and syntax are good at mapping IaaS resources such as VPCs, VMs, and network gateways. As you move up the complexity ladder, they may not be adequate to describe the internal configuration of other components, such as Kubernetes clusters.
Terraform’s Kubernetes provider can do the job, but the extra layer of abstraction used to represent cluster configuration makes it a challenging medium for anything beyond a couple of resources.
Once the number of configuration objects increases, I recommend incorporating domain-specific frameworks like Argo CD and Flux into the IaC mix. Those frameworks allow cluster administrators to work with GitOps repositories containing Kubernetes resource definitions.
Takeaway: As the system grows and your IaC starts demanding awkward mappings for new configuration formats, consider incorporating new IaC frameworks that match the way people work.
Lesson #2: Favor behavior before state
Modern infrastructure can be mind-bogglingly extensive, making a complete representation of its desired state virtually impossible. We often need to settle for things that act the same instead of obsessing over making them precisely the same.
For instance, assume a VM running Ubuntu. Only a couple of its configuration files may require modification post-installation for the VM to perform its role in the system.
In that scenario, it is more reasonable to represent only the differences between the VM’s desired configuration and the default values in the image template. To make the picture worse, and for a bit of real-world GitOps heresy, a few configuration files may be opaque boxes requiring imperative approaches like Terraform’s remote-exec or Ansible to complete the job.
Take that example further and assume your production system depends on a managed Kubernetes offering like Red Hat OpenShift Service on AWS (ROSA). Your GitOps repository may specify that it needs one such cluster, and a Terraform plugin will be happy to oblige. However, ROSA’s SRE team retains control of many low-level resources underpinning the cluster, such as VMs, disks, and networking.
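As a rough illustration of recording only the differences, here is a minimal Python sketch. The function name `config_overrides` and the sample settings are hypothetical, chosen for this example; they are not taken from any particular tool.

```python
from typing import Any


def config_overrides(defaults: dict[str, Any], desired: dict[str, Any]) -> dict[str, Any]:
    """Return only the entries of `desired` that differ from `defaults`."""
    overrides: dict[str, Any] = {}
    for key, want in desired.items():
        have = defaults.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            # Recurse so that unchanged nested settings are left out.
            nested = config_overrides(have, want)
            if nested:
                overrides[key] = nested
        elif key not in defaults or want != have:
            overrides[key] = want
    return overrides


# Hypothetical defaults shipped in the image template.
image_defaults = {
    "sshd": {"PermitRootLogin": "prohibit-password", "Port": 22},
    "sysctl": {"vm.swappiness": 60},
}

# Hypothetical configuration the VM needs to perform its role.
desired_config = {
    "sshd": {"PermitRootLogin": "no", "Port": 22},
    "sysctl": {"vm.swappiness": 10},
}

overrides = config_overrides(image_defaults, desired_config)
print(overrides)
# {'sshd': {'PermitRootLogin': 'no'}, 'sysctl': {'vm.swappiness': 10}}
```

Only the two changed settings would need to live in the repository; everything else stays implied by the image template.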
In short, we must grudgingly accept that we are often aiming for similar behavior rather than an exact desired state.
Lesson #3: Leave out state co-managed by specialized components
The first type of “state” matching this lesson is application data. It is managed inside specialized servers with their own APIs and tooling. Now, let’s broaden the concept from “application data” to “data that has its dedicated system of record, lifecycle, and workflow.”
The distinction is helpful because sometimes, you may need to share configuration management with other components outside the GitOps framework.
Some of the best examples of co-management are the various fields used to control the number of replicas for pods and worker nodes in an autoscaled environment. In those situations, you want most of the workload definition in the GitOps repository while deferring things like replica counts and resource limits to autoscaling components in the cluster. Note that you still want the configuration for the autoscaling components themselves managed with GitOps.
Kubernetes addresses that overlap with the “server-side apply” pattern, allowing different components to manage (or co-manage) certain portions of a resource. Argo CD supports that pattern through its diffing customizations (the ignoreDifferences setting), which tell it to ignore entire branches of a resource or fields owned by specific field managers during drift detection.
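To make the idea of ignoring co-managed fields concrete, here is a minimal Python sketch of drift detection that skips a list of paths. It is a simplification written for this article, not Argo CD's implementation: the helper names are hypothetical, paths are plain slash-separated dictionary keys (no list indices or escaping), and the comparison is a simple equality check.

```python
import copy
from typing import Any


def drop_path(doc: dict[str, Any], path: str) -> None:
    """Remove the field addressed by a slash-separated path, if it exists."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return
    node: Any = doc
    for part in parts[:-1]:
        if not isinstance(node, dict) or part not in node:
            return
        node = node[part]
    if isinstance(node, dict):
        node.pop(parts[-1], None)


def has_drift(desired: dict[str, Any], live: dict[str, Any], ignored: list[str]) -> bool:
    """Compare desired and live state after removing the ignored fields from both."""
    desired_view = copy.deepcopy(desired)
    live_view = copy.deepcopy(live)
    for path in ignored:
        drop_path(desired_view, path)
        drop_path(live_view, path)
    return desired_view != live_view


# Hypothetical workload: Git says 2 replicas, an autoscaler has scaled it to 7.
desired = {"kind": "Deployment", "spec": {"replicas": 2, "template": {"image": "web:1.4"}}}
live = {"kind": "Deployment", "spec": {"replicas": 7, "template": {"image": "web:1.4"}}}

print(has_drift(desired, live, ignored=[]))                  # True
print(has_drift(desired, live, ignored=["/spec/replicas"]))  # False
```

With the replica count ignored, the autoscaler's changes no longer register as drift, while a change to any other field, such as the image, still would.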
Lesson #4: Avoid secrets and certificates
We often see articles and tutorials extolling the virtues of storing encrypted secrets inside Git repositories, a technique commonly known as “sealed secrets.” Since secrets (and certificates) have their own system of record, lifecycle, and workflows, the previous lesson of avoiding that kind of data in a Git repository still applies.
If you need extra convincing, I wrote an extended version of my reasons for not recommending the practice of using sealed secrets with GitOps.
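One possible way to enforce this lesson is a pipeline check that rejects manifests carrying secret material. The sketch below is only an illustration under stated assumptions: the function name, the set of blocked kinds, and the sample manifests are hypothetical, and the manifests are assumed to be already parsed into dictionaries.

```python
from typing import Any, Iterable

# Kinds treated as carrying secret material; this set is an assumption of the sketch.
BLOCKED_KINDS = {"Secret", "SealedSecret"}


def find_secret_manifests(manifests: Iterable[dict[str, Any]]) -> list[str]:
    """Return "Kind/name" for every manifest whose kind is in BLOCKED_KINDS."""
    found: list[str] = []
    for manifest in manifests:
        kind = manifest.get("kind")
        if kind in BLOCKED_KINDS:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            found.append(f"{kind}/{name}")
    return found


# Hypothetical contents of a GitOps repository.
repo_manifests = [
    {"kind": "Deployment", "metadata": {"name": "web"}},
    {"kind": "Secret", "metadata": {"name": "db-password"}},
    {"kind": "SealedSecret", "metadata": {"name": "api-token"}},
]

violations = find_secret_manifests(repo_manifests)
print(violations)
# ['Secret/db-password', 'SealedSecret/api-token']
```

A pipeline could fail the build whenever the returned list is not empty, leaving the secrets themselves in their dedicated system of record.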
Lesson #5: Treat the GitOps CI/CD pipeline as code too
Earlier, I wrote about how it is possible to bring the desired state of the entire system into Git repositories. This section extends the definition of “system” to the GitOps repositories and pipelines underpinning those deployments.
I know this may sound overly abstract, but managing the repositories and pipelines the same way as everything else makes them more consistent, which leads to more productivity and better security.
I particularly like this article from Alexandre Couëdelo showing how one can use Terraform to configure the very Git repositories holding the infrastructure definitions. That kind of approach ensures remote repositories remain configured correctly without risking loopholes that could compromise the entire CI/CD pipeline.
Conclusion
Define the boundaries of your entire system early on and decide on a phased approach to bring configuration into one or more GitOps repositories. Some components will be whole systems within the system, requiring multiple GitOps frameworks to cover everything.
The internal state of some components may be too extensive and dynamic for complete representation in a Git repository, requiring adaptation of GitOps principles. Accept that you will often be aiming for system behavior rather than system state.
Do not store any data already managed in a specialized system of record, such as application data and secrets. Watch for resources co-managed using a server-side apply pattern, then configure your GitOps framework to ignore co-managed fields during drift detection.
And lastly, use GitOps to manage GitOps, adopting Pipeline as Code approaches that extend the benefits of GitOps to your GitOps CI/CD pipelines.