GitOps Repos: Five Lessons on What To Include and What To Skip

Denilson Nastacio
May 23, 2022

This article provides a decision framework to help practitioners choose what to manage with GitOps and what to avoid. It is a companion to my previous article on structuring GitOps repositories.

With the advent of Infrastructure as Code (IaC) frameworks like Terraform and Pulumi and their ability to build entire environments from nothing but a well-funded cloud account, the scope of what you can manage with IaC-based approaches has virtually no limits.

Even companies running on-premises hardware can benefit from an extensive library of plugins listed in the Terraform and Pulumi registries. And if a plugin is not available, people can author new plugins using languages and toolkits like HCL, Pulumi’s SDK, and Terraform’s CDK.

Without technology barriers, it is easy to start bringing everything into GitOps repositories and only later wonder whether some of that data may be better managed elsewhere.

I learned the lessons in the following sections the hard way while establishing our practices. I hope they can help make your decisions a little easier.

Lesson #1: Configuration Formats Matter

Part of the decision about which configuration to manage with GitOps is the choice of the IaC framework, which should prioritize alignment with the system abstractions.

As an example, I mentioned Terraform in the introduction. Its data structures and syntax are good at mapping IaaS resources such as VPCs, VMs, and network gateways. As you move up the complexity ladder, they may not be adequate to describe the internal configuration of other components, such as Kubernetes clusters.

Terraform’s Kubernetes provider can do the job, but the extra layer of abstraction used to represent cluster configuration makes it a challenging medium for anything beyond a couple of resources.

Once the number of configuration objects increases, I recommend incorporating domain-specific frameworks like Argo CD and Flux into the IaC mix. Those frameworks allow cluster administrators to work with GitOps repositories containing Kubernetes resource definitions.

Two-part picture. The left side has a repo administrator staring at a circle representing the entire system. The circle contains a directed graph with circles and squares containing single alphabet letters. The right side of the picture has a tree-like structure of folders, containing two parent folders (VMs and clusters), each containing a few alphabet letters mapping to the letters in the left side of the picture. Robots labeled “Terraform” and “Argo” look at the “VMs” and “clusters” folders, respectively.
It may be impractical to use a single IaC framework in a large environment due to the specialized needs of complex components. For instance, Terraform excels at allocating resources at the IaaS level, while Argo CD excels at managing Kubernetes configuration.

Takeaway: As the system grows and your IaC starts demanding awkward mappings for new configuration formats, consider incorporating new IaC frameworks that match the way people work.
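As a minimal sketch of that split, assuming a repository laid out like the picture above, a pipeline could route each changed file to the framework that owns its top-level folder. The folder-to-framework mapping and the function names `framework_for` and `group_changes` are hypothetical and only illustrate the idea; they are not part of Terraform or Argo CD.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from top-level repository folders to the framework
# that reconciles them; the folder names follow the picture in this lesson.
FRAMEWORK_BY_FOLDER = {
    "vms": "terraform",
    "clusters": "argocd",
}


def framework_for(path: str) -> str:
    """Return the framework that owns a file, based on its top-level folder."""
    parts = PurePosixPath(path).parts
    if not parts:
        raise ValueError("empty path")
    top = parts[0].lower()
    try:
        return FRAMEWORK_BY_FOLDER[top]
    except KeyError:
        raise ValueError(f"no framework owns top-level folder {top!r}") from None


def group_changes(changed_files: list[str]) -> dict[str, list[str]]:
    """Group changed files by owning framework so each tool sees only its own."""
    groups: dict[str, list[str]] = {}
    for path in changed_files:
        groups.setdefault(framework_for(path), []).append(path)
    return groups
```

A file outside the known folders raises an error instead of silently falling to a default tool, which surfaces the "awkward mapping" moment early.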

Lesson #2: Favor Behavior Before State

Modern infrastructure can be mind-bogglingly extensive, making the complete representation of its desired state virtually impossible. We often need to settle for things that act the same instead of obsessing over making them precisely the same.

For instance, assume a VM running Ubuntu. Only a couple of its configuration files may require modification post-installation for the VM to perform its role in the system.

In that scenario, it is more reasonable to store only the differences between the VM’s desired configuration and the default values in the image template. To complicate the picture, and for a bit of real-world GitOps heresy, a few configuration files may be opaque boxes requiring imperative approaches like Terraform’s remote-exec or Ansible to complete the job.

Take that example further and assume your production system depends on a managed Kubernetes offering like ROSA. Your GitOps repository may specify it needs one of such clusters, and a Terraform plugin will be happy to oblige. However, ROSA’s SRE team retains control of many low-level resources underpinning the cluster, such as VMs, disks, and networking.

Folder structure with “VMs” and “clusters” at the first level. The “VMs” folder has a highlighted “G” folder with a callout showing its definition being “Ubuntu 16.04” plus changes to files in “/etc/hosts” plus an invocation of Terraform’s “remote-exec” primitive.
GitOps principles call for declarative and immutable settings. Still, sometimes you need to bend those principles and store configuration relative to baselines that may be entirely or partially outside your control.

In short, we must grudgingly accept that we are often aiming for similar behavior rather than for the desired state.
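The "baseline plus differences" idea from the VM example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how Terraform or any other tool models a VM: the `VmDefinition` type, its fields, and `render_files` are hypothetical names, and the image defaults are passed in as a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VmDefinition:
    """What the repository stores for one VM: a baseline plus differences."""

    base_image: str  # name of the image template, outside the repo's control
    file_overrides: dict[str, str] = field(default_factory=dict)  # path -> content
    imperative_steps: tuple[str, ...] = ()  # commands for opaque configuration


def render_files(image_defaults: dict[str, str], vm: VmDefinition) -> dict[str, str]:
    """Overlay the VM's overrides on the image defaults.

    Files the repository does not mention keep whatever the image ships.
    """
    return {**image_defaults, **vm.file_overrides}
```

The repository only carries the overrides and the imperative steps; everything else is inherited from a baseline it does not describe, which is why the result is similar behavior rather than a fully declared state.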

Lesson #3: Leave Out State Co-managed by Specialized Components

The first type of “state” matching this lesson is application data. It is managed inside specialized servers with their own APIs and tooling. Now, let’s broaden the concept from “application data” to “data that has its dedicated system of record, lifecycle, and workflow.”

The distinction is helpful because sometimes, you may need to share configuration management with other components outside the GitOps framework.

Some of the best examples of co-management are the various fields used to control the number of replicas for pods and worker nodes in an autoscaled environment. In those situations, you want most of the workload definition in the GitOps repository while deferring things like replica counts and resource limits to autoscaling components in the cluster. Note you still want the configuration for the autoscaling components managed with GitOps.

A circle represents the whole environment, containing a directed graph. This is the same circle depicted atop the article, but now the smaller circle labeled “F” is at the edge of the larger circle, and a robot inspects the configuration for the smaller circle and then sets the number of “replicas” of the circle to “5”. An admin responsible for the whole environment watches the robot and allows it to proceed.
Automated processes and components may be better positioned to configure or tune details of some aspects of the infrastructure.

Kubernetes addresses that overlap with the “server-side apply” pattern, which tracks field ownership so that different components can manage (or co-manage) certain portions of a resource. Argo CD supports that pattern through its diffing customizations, such as the ignoreDifferences setting of an Application, telling it to ignore entire branches of a resource or fields owned by specific field managers during drift detection.
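To illustrate what ignoring a co-managed field during drift detection means, here is a minimal Python sketch. It is a simplification and not Argo CD's implementation: it compares two plain dictionaries, supports only dictionary keys in its JSON-pointer-like paths, and the names `has_drift` and `_remove_pointer` are hypothetical.

```python
import copy


def _remove_pointer(doc: dict, pointer: str) -> None:
    """Delete the value addressed by a simplified JSON pointer (dict keys only)."""
    keys = [k for k in pointer.split("/") if k]
    node = doc
    for key in keys[:-1]:
        node = node.get(key)
        if not isinstance(node, dict):
            return  # path is absent, so there is nothing to remove
    if keys:
        node.pop(keys[-1], None)


def has_drift(desired: dict, live: dict, ignored_pointers: list[str]) -> bool:
    """Compare desired and live state after dropping co-managed fields.

    Works on deep copies, so neither input is modified.
    """
    desired_view = copy.deepcopy(desired)
    live_view = copy.deepcopy(live)
    for pointer in ignored_pointers:
        _remove_pointer(desired_view, pointer)
        _remove_pointer(live_view, pointer)
    return desired_view != live_view
```

With `/spec/replicas` in the ignore list, an autoscaler changing the replica count from 2 to 5 is not reported as drift, while a change to the container image still is.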

Lesson #4: Avoid Secrets and Certificates

We often see articles and tutorials extolling the virtues of storing encrypted secrets inside Git repositories, a technique commonly known as “sealed secrets.” Since secrets (and certificates) have their own system of record, lifecycle, and workflows, the previous lesson of avoiding that kind of data in a Git repository still applies.

If you need extra convincing, I wrote an extended version of my reasons for not recommending the practice of using sealed secrets with GitOps.
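One way to enforce this lesson is a check that flags Kubernetes Secret manifests carrying a payload before they are committed. The sketch below is a hypothetical helper, not a feature of any GitOps framework; it assumes the manifests were already parsed into dictionaries and only inspects the standard `kind`, `data`, and `stringData` fields of a Secret.

```python
def find_inline_secrets(manifests: list[dict]) -> list[str]:
    """Return the names of Secret manifests that carry a payload."""
    offenders = []
    for manifest in manifests:
        if manifest.get("kind") != "Secret":
            continue
        # Encrypted or not, any payload here means the repository is acting
        # as a second system of record for the secret.
        if manifest.get("data") or manifest.get("stringData"):
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            offenders.append(name)
    return offenders
```

A pipeline step could fail the commit when the returned list is not empty, leaving the secret values in their dedicated system of record.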

Lesson #5: Treat the GitOps CI/CD Pipeline as Code Too

I earlier wrote about how it is possible to bring the desired state of the entire system into Git repositories. This section extends the definition of “system” to the GitOps repositories and pipelines underpinning those deployments.

I know this may sound overly abstract, but the increased consistency leads to more productivity and better security.

Pipeline as Code technologies such as Argo Workflows, Tekton, and GitHub Actions are good choices for applying GitOps principles to pipelines.

I particularly like this article from Alexandre Couëdelo showing how one can use Terraform to configure the very Git repositories holding the infrastructure definitions. That kind of approach ensures remote repositories remain configured correctly without risking loopholes that could compromise the entire CI/CD pipeline.

The entire overview picture at the beginning of the article is depicted inside a larger circle, representing the whole system. The right side of the picture has a robot tending to a tree structure containing top-level folders for pipeline-as-code technologies, such as Argo and Tekton. These folders contain the pipeline definitions for the GitOps folders.
GitOps can also manage the artifacts behind the primary GitOps process, supporting versioned and consistent settings for CI/CD pipelines.

Conclusion

Define the boundaries of your entire system early on and decide on a phased approach to bring configuration into one or more GitOps repositories. Some components will be whole systems within the system, requiring multiple GitOps frameworks to cover everything.

The internal state of some components may be too extensive and dynamic for complete representation in a Git repository, requiring adaptation of GitOps principles. Accept that you will often be aiming at system behavior rather than system state.

Do not store any data already managed in a specialized system of record, such as application data and secrets. Watch for resources co-managed using a server-side apply pattern, then configure your GitOps framework to ignore co-managed fields during drift detection.

And lastly, use GitOps to manage GitOps, adopting Pipeline as Code approaches that extend the benefits of GitOps to your GitOps CI/CD pipelines.

Denilson Nastacio

Operations architect, corporate observer, software engineer, inventor. @dnastacio