GitOps

Status: Complete
Category: Infrastructure
Default enforcement: Soft
Author: PushBackLog team

Summary

GitOps is an operational model in which the desired state of infrastructure and applications is declared in Git, and automated tooling continuously reconciles the running system to match that declared state. Git becomes the single source of truth: a merge to the configuration repository triggers a deployment, and any drift between the live system and the Git state is corrected automatically. GitOps provides a full audit trail and easy rollback, and removes the need for engineers to have direct write access to production systems.

Rationale

Git as the audit trail for infrastructure

When engineers apply changes to infrastructure directly (via kubectl apply, terraform apply, or clicking in a cloud console), the change persists but its context is lost. There is often no durable, reviewable record of who changed what, when, and why. The next engineer to look at the system may have no idea why a particular configuration exists.

GitOps makes every infrastructure change a git commit: author, timestamp, description, and code review are all captured by the version control system. The blame trail is as clear as for application code.

Automated reconciliation prevents drift

In traditional ops, infrastructure can drift from its intended state: someone applies a hot-fix directly in production and forgets to update the configuration repository. The next deployment overwrites the fix. With GitOps, a reconciliation loop continuously compares the live state to the declared state, detects drift, and corrects it automatically. Infrastructure becomes self-healing.

Rollback is a git revert

The rollback procedure for a bungled deployment becomes git revert <commit>: the revert commit restores the previous state declaration, and once it is merged the automation returns the running system to that state. No ad hoc manual steps, no tribal knowledge required.

Guidance

Core GitOps principles

Declarative: the system is described as desired state, not as a sequence of imperative steps
Versioned and immutable: the state is stored in git; history is never rewritten
Pulled automatically: the agent (Flux, ArgoCD) pulls state from git; CI does not push it to the cluster
Continuously reconciled: the agent detects and corrects drift from the declared state
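
The "continuously reconciled" principle can be illustrated with a small sketch. This is not how Flux or ArgoCD are implemented; it is a simplified model in which desired and live state are plain dictionaries keyed by resource name, and the names Action, plan_reconcile, and apply_plan are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One step of a reconcile plan (hypothetical type, for illustration only)."""
    verb: str  # "create", "update", or "delete"
    name: str


def plan_reconcile(desired: dict, live: dict, prune: bool = True) -> list[Action]:
    """Return the actions needed to make `live` match `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(Action("create", name))
        elif live[name] != spec:
            actions.append(Action("update", name))  # drift detected
    if prune:
        # Resources that exist live but are no longer declared get removed.
        for name in sorted(live.keys() - desired.keys()):
            actions.append(Action("delete", name))
    return actions


def apply_plan(desired: dict, live: dict, actions: list[Action]) -> dict:
    """Apply the plan to a copy of `live` and return the resulting state."""
    result = dict(live)
    for action in actions:
        if action.verb == "delete":
            del result[action.name]
        else:
            result[action.name] = desired[action.name]
    return result


# Example: "api" has drifted, "worker" is missing, "legacy" is undeclared.
desired = {"api": {"replicas": 3}, "worker": {"replicas": 1}}
live = {"api": {"replicas": 5}, "legacy": {"replicas": 1}}

plan = plan_reconcile(desired, live)
converged = apply_plan(desired, live, plan)
```

Running the planner again on the converged state yields an empty plan, which is the property a reconciliation loop relies on: applying it repeatedly is safe.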

Pull-based vs push-based deployments

Approach	Mechanism	GitOps?
Push: CI pipeline runs `kubectl apply`	CI has direct cluster access	Partial
Pull: GitOps operator polls git repository	Operator has cluster access; CI only writes git	True GitOps

True GitOps uses a pull model: the CI pipeline writes manifests to a git repository; the GitOps operator (Flux or ArgoCD) polls the repository and applies changes. This means CI never has direct cluster credentials.
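
A rough sketch of the pull model, under the assumption that the operator only needs two capabilities: reading the repository head and applying a revision to the cluster. The function poll_once and its parameters are hypothetical names, not part of Flux or ArgoCD.

```python
from typing import Callable, Optional


def poll_once(
    fetch_head: Callable[[], str],
    apply_revision: Callable[[str], None],
    last_applied: Optional[str],
) -> str:
    """Apply the repository head if it differs from the last applied revision.

    Runs inside the cluster: only this code holds cluster credentials,
    while CI merely moves the repository head by committing to git.
    """
    head = fetch_head()
    if head != last_applied:
        apply_revision(head)
    return head


# Simulate three polling intervals; the second one sees no new commit.
applied: list = []
last: Optional[str] = None
for head in ["abc123", "abc123", "def456"]:
    last = poll_once(lambda: head, applied.append, last)
```

In this simulation each revision is applied once, and the interval with no new commit is a no-op.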

ArgoCD example

# argocd/application.yaml — defines an ArgoCD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-config
    targetRevision: main
    path: apps/api-service/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true         # Remove resources no longer in git
      selfHeal: true      # Correct drift automatically
    syncOptions:
      - CreateNamespace=true

With selfHeal: true, if someone runs kubectl edit in production and changes a value, ArgoCD detects the drift and reverts the live resource to the state declared in git, typically within minutes.

Repository structure

Two common approaches:

Monorepo: application code and Kubernetes manifests in the same repository

app/                    # Application source
k8s/
  base/                 # Base Kubernetes manifests (Kustomize base)
  overlays/
    staging/            # Staging-specific patches
    production/         # Production-specific patches

Split repo: separate repository for infrastructure/config

myapp-config/           # Config-only repository
  apps/
    api/
      base/
      overlays/
  infrastructure/
    ingress/
    cert-manager/

Split repos are preferred for larger teams: application PRs and infrastructure PRs have separate approval workflows.

Flux example (Reconciling a HelmRelease)

# flux/helmrelease.yaml
apiVersion: helm.toolkit.fluxcd.io/v2   # GA API version; older Flux releases used v2beta1
kind: HelmRelease
metadata:
  name: api-service
  namespace: production
spec:
  interval: 5m       # Reconcile every 5 minutes
  chart:
    spec:
      chart: api-service
      version: '>=1.0.0'   # Semver range: Flux selects the newest matching chart version
      sourceRef:
        kind: HelmRepository
        name: myorg-charts
  values:
    image:
      tag: "v1.4.2"
    replicas: 3

Promotion workflow

feature → main → [CI builds image, updates image tag in config repo] → GitOps operator applies to staging
                                         ↓ (manual PR approval)
                                    production config repo ← PR to bump image tag
                                         ↓ (merged by engineer)
                                    GitOps operator applies to production

By separating image builds (CI) from deployment (GitOps pull), each environment’s state is always auditable in git and promotions require an explicit git change.
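
The promotion step can be thought of as a pure data change that is then committed and reviewed. The sketch below assumes a hypothetical in-memory config keyed by environment with an image_tag field; in practice the same edit would be made to a manifest or values file in the config repository, and the tag v1.4.1 is an invented example.

```python
import copy


def promote(config: dict, source_env: str, target_env: str) -> dict:
    """Return a new config in which target_env runs source_env's image tag.

    The input is left untouched, mirroring how a promotion is a new commit
    rather than an edit to existing history.
    """
    updated = copy.deepcopy(config)
    updated[target_env]["image_tag"] = config[source_env]["image_tag"]
    return updated


# Hypothetical state: staging already runs the new tag, production does not.
config = {
    "staging": {"image_tag": "v1.4.2"},
    "production": {"image_tag": "v1.4.1"},
}

promoted = promote(config, "staging", "production")
```

The difference between config and promoted is exactly what the promotion PR would contain, so the reviewer sees a one-line tag bump.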

Review checklist

Infrastructure desired state is declared in git
A GitOps operator (Flux, ArgoCD) handles apply; CI does not run kubectl apply against production directly
Drift detection and self-healing are enabled in production
The config repository requires PR approval before changes merge to the production branch
Rollback procedure is documented: git revert + merge
