GitOps with Argo CD and Kustomize: What I'd Build Today

Paul Wechuli
April 22, 2026
gitops
argocd
kubernetes
kustomize
devops

A pragmatic GitOps layout for multi-cluster, multi-environment Kubernetes — including the repo structure, the promotion model, and the parts where I think Argo CD's defaults are wrong.

GitOps is no longer controversial. Argo CD is no longer new. And yet most teams I help still have a setup that's some combination of "one giant manifest repo," "manual kubectl apply for the awkward stuff," and "we tried Helm but it got out of hand." Here's the layout I now recommend for a typical platform team running ~30 services across dev / staging / prod, with two or three clusters per environment.

The repo split: apps vs config

The single most important decision is to separate application source from deployment config by putting them in two distinct repositories. Application repos contain code and a Dockerfile. The config repo contains Kubernetes manifests, organized by service and environment. CI in the app repo builds the image, then opens a PR against the config repo bumping the image tag. Argo CD watches the config repo and syncs.

infra-config/
├── apps/
│   ├── payments-api/
│   │   ├── base/
│   │   │   ├── kustomization.yaml
│   │   │   ├── deployment.yaml
│   │   │   ├── service.yaml
│   │   │   └── hpa.yaml
│   │   └── overlays/
│   │       ├── dev/
│   │       ├── staging/
│   │       └── prod/
│   └── notifications/
│       └── ...
├── platform/
│   ├── ingress-nginx/
│   ├── cert-manager/
│   ├── external-secrets/
│   └── argocd/
└── clusters/
    ├── dev-us-east-1/
    │   └── apps.yaml      # ApplicationSet
    ├── staging-us-east-1/
    └── prod-us-east-1/

The reasons to insist on the split:

  • Different blast radius. A bad commit in an app repo should not be able to deploy itself to prod. The config repo can have stricter branch protection, mandatory reviews, and CODEOWNERS that the app repos don't need.
  • Faster app CI. App repos don't need to clone the entire deployment world to build a container.
  • Clean audit trail. Every prod change is a PR in one repo. Compliance loves this.

The argument against — "two PRs to ship a change" — is real but small. A bot opens the second PR automatically. The human cost is approximately zero.

Why Kustomize, not Helm, for first-party apps

I run Helm for third-party stuff (ingress controllers, cert-manager, anything from an upstream chart). For our own apps I use Kustomize, because:

  • Overlays are diffs, not templates. When something looks wrong in prod, I can read the prod overlay and see exactly what's different from base. With Helm I have to mentally render the chart.
  • No templating language to escape. I don't have a "what does this YAML mean after Go templating?" problem.
  • kustomize build is a pure function. I can run it locally, diff against the cluster, and reason about the change.

The Kustomize pattern I keep:

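# apps/payments-api/overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
# Prod-only changes to the Deployment live in a separate patch file.
patches:
  - path: deployment-patch.yaml
    target:
      kind: Deployment
      name: payments-api
# Fixed replica count; if the HPA in base targets this Deployment,
# it will change this number at runtime.
replicas:
  - name: payments-api
    count: 6
# The only field the CI bot edits.
images:
  - name: payments-api
    newTag: v2.4.1
# Merged into the generator of the same name declared in base.
configMapGenerator:
  - name: payments-api-config
    behavior: merge
    literals:
      - LOG_LEVEL=info
      - FEATURE_NEW_CHECKOUT=true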

The CI bot only ever touches newTag: in this file. That's the entire promotion mechanism.
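
One detail worth spelling out: behavior: merge only works if the base declares a ConfigMap generator with the same name. A minimal sketch of what the base kustomization might look like, assuming the file names from the tree above; the literal values are hypothetical defaults, not taken from a real service:

```yaml
# apps/payments-api/base/kustomization.yaml (sketch; values are assumptions)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - hpa.yaml
configMapGenerator:
  # Must share its name with the overlay's generator for behavior: merge to apply.
  - name: payments-api-config
    literals:
      - LOG_LEVEL=debug              # hypothetical default, overridden in prod
      - FEATURE_NEW_CHECKOUT=false   # hypothetical default, overridden in prod
```

The overlay's images entry matches on the image name exactly as it is written in deployment.yaml (without the tag), so if the Deployment references a registry-qualified image, the name in the overlay has to include the registry as well.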

ApplicationSets, not Applications

Defining an Argo CD Application per service per cluster does not scale. You get hundreds of YAML files that all look the same. Use ApplicationSet with the git generator instead:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: prod-apps
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/my-org/infra-config
        revision: main
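        directories:
          # One Application is generated per directory that matches this glob.
          - path: apps/*/overlays/prod
  template:
    metadata:
      # path[1] is the second path segment, i.e. the service directory name.
      name: "{{path[1]}}-prod"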
    spec:
      project: prod
      source:
        repoURL: https://github.com/my-org/infra-config
        targetRevision: main
        path: "{{path}}"
      destination:
        server: https://prod-cluster.example.internal
        namespace: "{{path[1]}}"
      syncPolicy:
        automated:
          prune: true
          selfHeal: true

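Add a new service? Create apps/new-service/overlays/prod/. It deploys automatically, with no Argo CD config change needed, provided the new-service namespace already exists or the template sets the CreateNamespace=true sync option.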
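
To make the templating concrete: for the directory apps/payments-api/overlays/prod, {{path}} is the full directory and {{path[1]}} is its second segment. The generator would therefore produce roughly the following Application (a sketch; metadata added by the ApplicationSet controller is omitted):

```yaml
# Approximate Application generated for apps/payments-api/overlays/prod
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-api-prod            # "{{path[1]}}-prod"
  namespace: argocd
spec:
  project: prod
  source:
    repoURL: https://github.com/my-org/infra-config
    targetRevision: main
    path: apps/payments-api/overlays/prod   # "{{path}}"
  destination:
    server: https://prod-cluster.example.internal
    namespace: payments-api          # "{{path[1]}}"
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```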

Where Argo CD's defaults are wrong

A few defaults I always change:

  1. selfHeal: true is dangerous on platform components. If someone is mid-incident and manually edits a Deployment to recover, you don't want Argo CD reverting their change a few seconds later. I leave self-heal on for app workloads, off for platform components.
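  2. prune: true without PrunePropagationPolicy=background is risky. Argo CD prunes with foreground propagation by default, so a deletion waits on every dependent object, and a stuck finalizer can stall it and leave you with half-deleted resources. Always set the propagation policy explicitly.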
  3. There are no sync windows by default, so Argo CD will sync at any time. For prod, define sync windows on the AppProject that exclude Friday afternoons and weekend nights. You'd be amazed how many "weekend incidents" started as a Sunday-evening Renovate PR auto-merging.
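
The three changes above could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in config: the platform project name, the choice of cert-manager as the example component, and the exact window times are hypothetical, and the windows use the cluster's default time zone unless a timeZone is set on each one.

```yaml
# Platform component: automated sync and prune, but no self-heal,
# so a manual fix made during an incident is not reverted.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cert-manager
  namespace: argocd
spec:
  project: platform                  # hypothetical project name
  source:
    repoURL: https://github.com/my-org/infra-config
    targetRevision: main
    path: platform/cert-manager
  destination:
    server: https://prod-cluster.example.internal
    namespace: cert-manager
  syncPolicy:
    automated:
      prune: true
      selfHeal: false
    syncOptions:
      - PrunePropagationPolicy=background
---
# Prod project: block automated syncs on Friday afternoons and weekend nights.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: prod
  namespace: argocd
spec:
  sourceRepos:
    - https://github.com/my-org/infra-config
  destinations:
    - server: https://prod-cluster.example.internal
      namespace: "*"
  syncWindows:
    - kind: deny
      schedule: "0 12 * * 5"         # Fridays from 12:00 (hypothetical cutoff)
      duration: 12h
      applications: ["*"]
      manualSync: true               # humans can still sync by hand
    - kind: deny
      schedule: "0 18 * * 6,0"       # Saturday and Sunday from 18:00
      duration: 14h                  # until 08:00 the next morning
      applications: ["*"]
      manualSync: true
```

Setting manualSync: true is a design choice here: the windows stop unattended changes, but an on-call engineer can still push a fix through deliberately.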

Promotion: just be a robot

The promotion model I recommend is the simplest thing that works: a bot opens PRs.

  • App CI builds an image, pushes it, then opens a PR in the config repo updating apps/<service>/overlays/dev/kustomization.yaml with the new tag. Auto-merged after checks pass.
  • A nightly job checks "what's in dev that isn't in staging" and opens a single batched PR promoting all of it. Requires one human review.
  • Staging → prod is the same, but requires two reviewers and runs on a manual trigger, not a schedule.

No special promotion tool. No GitOps-y CRD. Just PRs, in a repo, with branch protection. Everything in your normal review and audit pipeline applies.
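
As an illustration of the first step, here is a minimal sketch of the app-repo job that opens the dev PR. It assumes GitHub Actions, release tags prefixed with v, a runner that has kustomize and the gh CLI available, auto-merge enabled on the config repo, and a secret named CONFIG_REPO_TOKEN with write access to it. The workflow name, bot identity, and secret name are all hypothetical.

```yaml
# .github/workflows/promote-dev.yaml in the app repo (hypothetical file)
name: promote-to-dev
on:
  push:
    tags: ["v*"]                     # assumes releases are cut as v-prefixed tags
jobs:
  bump-dev-tag:
    # Building and pushing the image is assumed to happen in an earlier job.
    runs-on: ubuntu-latest
    env:
      SERVICE: payments-api
      TAG: ${{ github.ref_name }}
      GH_TOKEN: ${{ secrets.CONFIG_REPO_TOKEN }}   # hypothetical secret
    steps:
      - name: Check out the config repo
        uses: actions/checkout@v4
        with:
          repository: my-org/infra-config
          token: ${{ secrets.CONFIG_REPO_TOKEN }}
      - name: Bump the dev image tag
        run: |
          cd "apps/${SERVICE}/overlays/dev"
          kustomize edit set image "${SERVICE}:${TAG}"
      - name: Open a PR and let it auto-merge once checks pass
        run: |
          branch="bump/${SERVICE}-${TAG}"
          git config user.name "promotion-bot"
          git config user.email "promotion-bot@example.com"
          git switch -c "${branch}"
          git commit -am "dev: ${SERVICE} ${TAG}"
          git push origin "${branch}"
          gh pr create --base main --head "${branch}" \
            --title "dev: ${SERVICE} ${TAG}" \
            --body "Automated image bump from app CI."
          gh pr merge "${branch}" --auto --squash
```

The staging and prod promotions would follow the same shape with a different trigger (a schedule for staging, a manual dispatch for prod) and without the auto-merge step, since those PRs wait for human review.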

What I'd still avoid

  • Argo CD managing Argo CD. It can be made to work, but the bootstrap story when something breaks is miserable. Manage the Argo CD installation with a small Terraform module instead. Let Argo CD manage everything else.
  • Cross-cluster Application references. Argo CD allows an Application in cluster A to deploy to cluster B. This sounds clever and creates very confusing failure modes. Run one Argo CD per cluster (or per region), federated only by them all reading from the same config repo.
  • Storing secrets in the config repo, even encrypted with SOPS. Use External Secrets Operator with a real secret store. The "encrypted secrets in git" pattern looks fine until you need to rotate, and then it doesn't.
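
For the secrets point, a minimal sketch of what lives in the config repo instead of the secret itself. The store name, remote key, and secret names are hypothetical, and the apiVersion depends on the External Secrets Operator release in use (newer releases serve external-secrets.io/v1):

```yaml
# apps/payments-api/base/external-secret.yaml (hypothetical file)
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: payments-api-db
spec:
  refreshInterval: 1h                # how often the operator re-reads the store
  secretStoreRef:
    kind: ClusterSecretStore
    name: platform-secrets           # hypothetical store backed by a real secret manager
  target:
    name: payments-api-db            # Kubernetes Secret the operator creates
  data:
    - secretKey: DATABASE_PASSWORD
      remoteRef:
        key: prod/payments-api/db    # hypothetical path in the backing store
        property: password
```

Only the reference is committed. Rotating the value happens in the secret store, and the operator picks it up on a later refresh without a change to git.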

GitOps is a pretty boring practice in 2026, which is exactly what infrastructure should be. The interesting work is in your application code. Your deployment pipeline should be the part you don't think about.