Running Kubernetes well is deceptively hard. The control plane is just the beginning: you still have to care about node pool sizing, OS patching, add-on upgrades, autoscaler tuning, CNI compatibility, observability, and a hundred other things that have nothing to do with your actual workloads. Both AWS and Google Cloud have responded with “managed mode” cluster variants: EKS Auto Mode and GKE Autopilot. The pitch is the same: hand us the node layer and we’ll keep it running, just the way you like.
This post digs into what each managed mode actually takes over, what it locks down, and where you still need a Standard cluster. The focus is on the trade-offs and restrictions of each managed mode, and on the cost implications of those trade-offs.
GKE: Autopilot vs Standard
Google launched Autopilot in March 2021, so it has had several years to mature. The core idea is that you describe workloads and GKE manages every node lifecycle decision — instance type selection, bin-packing, scaling, upgrades — invisibly.
What Autopilot manages
- Node provisioning and deletion — you never create a node pool, ever. Autopilot selects Compute Engine instance types based on your Pod resource requests and schedules Pods onto appropriately sized nodes.
- OS and Kubernetes upgrades — nodes are automatically upgraded on Google’s release channel schedule. You choose the channel (Rapid/Regular/Stable) but not the specific version window.
- Security hardening: Autopilot enforces Pod Security Standard “Restricted” cluster-wide. This is enforced, not advisory, meaning workloads must meet strict security constraints (no privileged containers, no hostPath volumes, no host networking) to run in Autopilot.
- Core add-ons: the following are pre-installed and Google-managed:
  - Dataplane V2 (GKE’s eBPF-based CNI, built on Cilium)
  - GKE Ingress Controller
  - CoreDNS
  - kube-proxy
  - DNS Local Cache
  - Workload Identity (GKE’s implementation of Kubernetes Pod Identity)
  - Compute Engine Persistent Disk CSI Driver
  - Filestore CSI Driver
  - Cloud Logging/Monitoring Agent
Workload restrictions
Because Autopilot enforces Restricted PSS, a meaningful set of workload configurations is rejected outright at admission time:
| Capability | Autopilot | Standard |
|---|---|---|
| privileged: true containers | No | Yes |
| hostPath volume mounts | No | Yes |
| hostNetwork: true | No | Yes |
| hostPID: true / hostIPC: true | No | Yes |
| DaemonSets | Yes (constrained) | Yes (full) |
| Custom OS image | No | Yes |
| Max pods per node | 256 | Configurable |
| Spot / Preemptible nodes | Yes (specify Spot Pods) | Yes (manual) |
| Windows node pools | No | Yes |
| ARM node pools | Yes | Yes |
| NVIDIA GPU / TPU workloads | Yes (node-level billing) | Yes |
DaemonSets are allowed in Autopilot but inherit the same security constraints — they cannot run privileged, cannot use hostPath, and cannot access the underlying node. This rules out most traditional log forwarder or monitoring agent DaemonSets in their default form. Alternatives like the GKE-managed Cloud Logging and Cloud Monitoring agents are available and don’t require DaemonSets at all.
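To make the restrictions above concrete, here is a minimal sketch of a Deployment shaped to pass the Restricted profile. The workload name, image, and user ID are placeholders, not anything from a real cluster, and the Spot node selector is optional: it only illustrates the “specify Spot Pods” row in the table.

```yaml
# Sketch only: a Deployment written to satisfy the Restricted Pod Security Standard.
# The name, image, and UID below are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      # Optional: request Spot capacity for these Pods on GKE.
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      securityContext:
        runAsNonRoot: true            # Restricted: containers may not run as root
        runAsUser: 10001              # assumed non-root UID that the image can run as
        seccompProfile:
          type: RuntimeDefault        # Restricted: a seccomp profile must be set
      containers:
        - name: web
          image: registry.example.com/web:1.0.0
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: scratch
              mountPath: /tmp
      volumes:
        # emptyDir stands in for scratch space that might otherwise use hostPath.
        - name: scratch
          emptyDir: {}
```

Nothing here sets privileged, hostNetwork, hostPID, hostIPC, or a hostPath volume, which are the fields the table lists as rejected.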
Networking
Autopilot clusters always use Dataplane V2 (GKE’s eBPF-based networking layer, built on Cilium). This brings two things worth knowing:
- All pod-to-pod traffic is subject to VPC firewall rules, even traffic that never leaves a node, which is a meaningful security improvement over an iptables-based CNI.
- NetworkPolicy is enforced with eBPF rather than iptables, which is lower latency and more observable.
You cannot swap the CNI in Autopilot. In Standard, Dataplane V2 is opt-in, and you can also bring Cilium in managed or self-managed form.
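Because policy enforcement is built in, a standard Kubernetes NetworkPolicy should work without installing anything extra. The sketch below uses only the upstream `networking.k8s.io/v1` API; the namespace, labels, and port are hypothetical.

```yaml
# Sketch only: default-deny ingress for a namespace, plus one explicit allow rule.
# The namespace "shop", the app labels, and port 8080 are hypothetical.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: shop
spec:
  podSelector: {}          # empty selector: applies to every Pod in the namespace
  policyTypes:
    - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: shop
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

The same manifests apply unchanged on a Standard cluster with Dataplane V2 or another policy-capable CNI, so they are portable between the two modes.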
The billing model
This is where Autopilot genuinely differentiates itself. You are billed for requested pod CPU and memory, not for node capacity. If a Pod requests 100m CPU and 128Mi memory, you pay for that — even if it runs on a node with 8 cores.
For general-purpose workloads, minimum requests apply:
- 250m CPU / 512Mi memory per container
For GPU and TPU workloads, billing switches to node-level (the full node is allocated to your Pod).
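Under this model, the `resources.requests` stanza is effectively the billing input. A minimal sketch, using the minimums quoted above and a placeholder image:

```yaml
# Sketch only: under request-based billing, the requests below are what you pay for.
# Values mirror the per-container minimums stated in this post; the image is hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: billing-demo
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0.0
      resources:
        requests:
          cpu: 250m
          memory: 512Mi
```

Over-requesting has a direct cost here, so right-sizing requests matters more than it does on a pre-provisioned node pool.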
Why this matters for bursty workloads: if your cluster idles at 20% utilisation most of the day, Autopilot costs track that 20%. A Standard cluster with pre-provisioned node pools bills for 100% of node capacity whether or not anyone is using it.
Example: a Standard cluster with 3× n2-standard-8 (8 vCPU / 32GB) nodes costs roughly $650/month in Sydney regardless of load. The equivalent Autopilot cluster running workloads that consume 8 vCPU / 24GB on average would cost roughly $280/month.
With GKE Standard, you pay for the whole bucket regardless of how much water is in it, but for GKE Autopilot you pay only for the volume of water in the bucket.
Control plane costs are now the same for Autopilot and Standard: $72/month.
When to choose Autopilot vs Standard (GKE)
Autopilot is the right call when:
- Your workloads comply with Pod Security Standard Restricted (or can be adapted to)
- You want predictable cost that scales with usage, not with node fleet size
- You don’t need custom OS images or privileged containers
- Platform / SRE overhead is a genuine concern
Stick with Standard when:
- You run privileged workloads (legacy log shippers, eBPF-based APM agents, storage operators)
- You need host-level access (hostNetwork-dependent network appliances, hardware offload)
- You need Windows node pools
- You want full CNI control (e.g. self-managed Cilium with custom hubble configuration)
- Compliance requirements mandate node-level isolation or OS configuration
EKS: Auto Mode vs Standard
EKS Auto Mode launched in late 2024. The headline is similar to GKE Autopilot — AWS manages the node lifecycle — but the mechanism is very different. Where Autopilot is a separate cluster mode with its own scheduler, EKS Auto Mode is managed Karpenter. This distinction has significant engineering implications.
What Auto Mode manages
- Karpenter — installed and operated by AWS on your behalf. You cannot install, upgrade, or configure Karpenter yourself.
- Node group lifecycle: nodes are launched by Karpenter. AWS manages OS patching and rotates nodes with a 21-day maximum node lifetime.
- AMIs: AWS manages the AMI. Custom AMIs are not supported. SSH access to nodes is not available.
- Core add-ons: the following are pre-installed and AWS-managed:
  - AWS VPC CNI (aws-vpc-cni)
  - CoreDNS
  - kube-proxy
  - AWS Load Balancer Controller (Ingress)
  - AWS Pod Identity Agent (AWS’s implementation of Kubernetes Pod Identity)
  - EBS CSI Driver
  - VPC Resource Controller (AWS IPAM integration)
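Since every node is replaced within 21 days, workloads should expect regular evictions. One common mitigation is a PodDisruptionBudget, sketched below with hypothetical names. It assumes node rotation drains Pods through the eviction API, as Karpenter normally does; how Auto Mode behaves when a budget would block the 21-day limit is not covered here.

```yaml
# Sketch only: keep at least one replica running while nodes are rotated.
# The name and the app label are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: web
```

A budget like this only helps if the workload runs more than one replica; with a single replica it can only delay a drain, not avoid downtime.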
The Karpenter trade-off
This is the part that matters most for experienced EKS operators. Karpenter is one of the most impactful tools in the Kubernetes ecosystem — but Auto Mode’s managed version strips out the majority of its configuration surface.
What you lose in Auto Mode:
| Configuration | EKS Auto Mode | Self-Managed Karpenter |
|---|---|---|
| Pin Karpenter version | No (AWS upgrades it) | Yes |
| Custom NodePool resources | No (only the built-in general-purpose and system NodePools) | Yes |
| Custom EC2NodeClass | No | Yes |
| Consolidation policy tuning | No | Yes |
| Disruption budget configuration | No | Yes |
| Spot instance support | No (On-Demand only) | Yes |
| Karpenter webhooks / custom logic | No | Yes |
| nodeAffinity to specific instance families | M, C, R, and Graviton families only | Any (x86 or Arm, including Graviton) |
The spot instance restriction is the one that stings most in production. A typical cost-optimised EKS Standard cluster targets 70–80% spot capacity, which for a 100-vCPU cluster can reduce the node cost by 60–70%. Auto Mode mandates On-Demand only.
What you gain: you never have to reason about Karpenter upgrades, NodePool drift, or Webhook certificate rotation. For small teams without dedicated platform engineers, that simplicity is real.
Here is what a self-managed Karpenter NodePool might look like compared to what Auto Mode gives you:
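```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general
spec:
  template:
    spec:
      # A v1 NodePool must reference a node class; "default" is an assumed EC2NodeClass name.
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        # Allow both Spot and On-Demand capacity.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        # Restrict provisioning to an explicit set of instance types.
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m5.xlarge", "m5.2xlarge", "m6i.xlarge", "r5.xlarge"]
        # Nodes from this pool must not carry the managed node group label.
        - key: eks.amazonaws.com/nodegroup
          operator: DoesNotExist
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
    budgets:
      # Disrupt at most 20% of this pool's nodes at a time.
      - nodes: "20%"
```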
In EKS Auto Mode, there is no NodePool manifest you write or apply. AWS presents two fixed NodePools (general-purpose and system) with opaque internal configuration. You can influence scheduling through standard Kubernetes mechanisms (nodeSelector, tolerations, resource requests), but you cannot reach into Karpenter’s behaviour.
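As a rough illustration of those standard mechanisms, the sketch below steers a Deployment using only upstream Kubernetes fields. The `kubernetes.io/arch` label is a well-known node label; the toleration key, names, and image are hypothetical and would need to match whatever taints actually exist in your cluster.

```yaml
# Sketch only: influencing placement without touching Karpenter configuration.
# The toleration key "example.com/dedicated", the names, and the image are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      # Well-known label: ask for Arm (Graviton) nodes.
      nodeSelector:
        kubernetes.io/arch: arm64
      tolerations:
        - key: example.com/dedicated
          operator: Equal
          value: api
          effect: NoSchedule
      containers:
        - name: api
          image: registry.example.com/api:1.0.0
          resources:
            # Requests are what the provisioner sizes nodes against.
            requests:
              cpu: 500m
              memory: 1Gi
```

This expresses what the workload needs, but not how nodes are consolidated or disrupted, which is the part Auto Mode keeps to itself.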
Add-on restrictions
The pre-installed add-on stack covers the common baseline, but a number of things you might want in a production cluster are not managed by Auto Mode and must be self-installed — or are blocked entirely:
| Add-on | Auto Mode | Standard | Notes |
|---|---|---|---|
| VPC CNI | Managed | Self-managed | Auto Mode version is locked |
| AWS Load Balancer Controller | Managed | Optional | — |
| EBS CSI | Managed | Optional | — |
| EFS CSI | Not included | Optional | Must self-install if needed |
| Cluster Autoscaler | Not available | Optional | Karpenter replaces it |
| Cilium / Calico | Not supported | Yes | CNI is locked to VPC CNI |
| KEDA | Not included | Optional | Must self-install |
| Cluster Proportional Autoscaler | Not included | Optional | Must self-install |
The locked CNI is a meaningful restriction for teams who have invested in Cilium for network policy enforcement, Hubble observability, or BGP-based routing.
GPU and specialty hardware
EKS Auto Mode does not support GPU instances in the managed node layer. If you need GPU workloads (g4dn, p3, p4d, inf1, trn1 etc.), you must add self-managed node groups or standard Karpenter-managed node pools — at which point you are operating a hybrid cluster.
In practice this means clusters with heterogeneous workloads (CPU services + GPU inference) cannot run entirely in Auto Mode.
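In such a hybrid cluster, GPU Pods have to be pinned to the capacity you manage yourself. A minimal sketch, assuming a self-managed Karpenter NodePool named `gpu` (hypothetical), a `nvidia.com/gpu` taint on those nodes (a common convention, not a given), and the NVIDIA device plugin already installed:

```yaml
# Sketch only: pin a GPU Pod to a self-managed node pool in a hybrid cluster.
# The NodePool name "gpu", the taint, and the image are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: inference
spec:
  nodeSelector:
    karpenter.sh/nodepool: gpu      # label Karpenter sets on nodes it launches
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: inference
      image: registry.example.com/inference:1.0.0
      resources:
        limits:
          nvidia.com/gpu: 1         # extended resource exposed by the NVIDIA device plugin
```

Everything that supports this Pod (the NodePool, the AMI with drivers, the device plugin) sits outside Auto Mode's managed surface, which is the operational cost of going hybrid.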
Node constraints summary
| Constraint | EKS Auto Mode | EKS Standard |
|---|---|---|
| Max node lifetime | 21 days | Unlimited |
| Custom AMI | No | Yes |
| Custom kubelet configuration | No | Yes |
| Instance families | M, C, R, Graviton | Any |
| Spot instances | No | Yes |
| Bare metal | No | Yes |
| Arm node pools | Yes | Yes |
| GPU instances | No | Yes |
The cost model
Auto Mode charges an additional management fee on top of standard EC2 and the EKS control plane:
- EKS control plane: $72/month
- Auto Mode management fee: charged per vCPU-hour on top of the EC2 price (varies by region)
- For a 100-vCPU cluster running 730 hours/month: ~$180–$216 in management fees alone, which works out to roughly $0.0025–0.0030 per vCPU-hour
Compared to a self-managed Karpenter setup on EKS Standard running equivalent workloads with 70% spot:
| Cluster type | 100 vCPU cost estimate (Sydney region) |
|---|---|
| EKS Standard + Karpenter (70% spot) | ~$520/month |
| EKS Auto Mode (On-Demand only) | ~$1,150/month |
| EKS Standard (On-Demand only) | ~$900/month |
Auto Mode is meaningfully more expensive than Standard — both because of the management fee and because spot is unavailable. The trade is operational simplicity for a 2–2.5× cost multiplier at scale.
When to choose Auto Mode vs Standard (EKS)
Auto Mode is the right call when:
- Your team does not have Karpenter expertise or bandwidth to maintain it
- Your workloads are CPU/memory only (no GPU, no custom hardware)
- Instance family restrictions (M/C/R/Graviton) match your workload profile
- You don’t need custom CNI, custom ingress, or EFS
- Operational simplicity outweighs cost optimisation
Stick with Standard when:
- You need spot instances for cost (this alone is often decisive)
- You run GPU or ML inference workloads
- You need Karpenter NodePool tuning (consolidation policy, multi-architecture, custom disruption budgets)
- You use Cilium, Calico, or any non-VPC CNI
- You need NGINX Ingress, KEDA, or other cluster-level tooling that conflicts with Auto Mode’s managed stack
- You have compliance requirements around node access or OS hardening
Side-by-side comparison
| Dimension | GKE Autopilot | GKE Standard | EKS Auto Mode | EKS Standard |
|---|---|---|---|---|
| Node management | Fully Managed | Self-Managed | Managed Karpenter | Self-Managed |
| Autoscaler mechanism | Proprietary | CA or Karpenter | Managed Karpenter (locked) | CA or Karpenter |
| Node pool control | None | Full | None | Full |
| Spot / Preemptible nodes | Yes (automatic) | Yes | No | Yes |
| GPU / TPU workloads | Yes (node billing) | Yes | No (Self-Managed) | Yes |
| DaemonSets | Constrained | Full | Full | Full |
| Privileged containers | No | Yes | Yes | Yes |
| hostPath volumes | No | Yes | Yes | Yes |
| CNI flexibility | No (Dataplane V2) | Yes (incl. Cilium) | No (VPC CNI only) | Yes (incl. Cilium) |
| Hubble observability | Yes (with Dataplane V2) | Yes (Self-Managed) | No | Yes (Self-Managed) |
| Custom AMI | No (Only COS) | Yes | No (Only Bottlerocket) | Yes |
| Max node lifetime | NA* | Unlimited | 21 days | Unlimited |
| Ingress options | Any | Any | ALB only | Any |
| Cost model | Per-pod request | Per-node | Per-vCPU mgmt fee + On-Demand | Flexible |
| Best for | Compliant bursty workloads | Any workload | Simple CPU workloads | Any workload |
- *Nodes are fully managed and opaque in Autopilot, so there is no concept of node lifetime from the operator’s perspective. However, Google does rotate nodes regularly as part of its management process: weekly for security patches and monthly for Kubernetes version upgrades.
Conclusions
Both EKS Auto Mode and GKE Autopilot solve a real problem. The operational overhead of maintaining full Kubernetes clusters — Karpenter tuning, add-on compatibility matrices, node OS patching, autoscaler configuration — is significant, and both managed modes take a meaningful slice of that off the operator’s plate.
GKE Autopilot has the more mature story. The pod-level billing model is genuinely compelling for variable workloads, the integration with Google’s managed observability stack (Cloud Logging, Cloud Monitoring, Managed Prometheus) is seamless, and the security enforcement (Restricted PSS by default) is something most teams should be doing anyway but often aren’t. The main friction points are the workload restrictions — teams migrating existing clusters need to audit for privileged containers and hostPath usage before switching.
EKS Auto Mode is the newer entrant, and it shows. The Karpenter restriction is the most significant limitation — removing spot instance support and the ability to customise consolidation behaviour essentially caps the cluster’s cost efficiency ceiling. For teams who have invested in Karpenter tuning, moving to Auto Mode is a step backward in cost optimisation capability. For teams who haven’t touched Karpenter configuration and just want something that works, Auto Mode delivers on that promise.
Standard modes remain essential for:
- Specific hardware requirements, especially custom sizing or GPU workloads at scale
- Custom CNI requirements (Cilium, BGP routing, etc.)
- Compliance environments with specific node OS or access requirements
The recent improvements from Auto Kubernetes Offerings:
If you’re starting fresh with mostly stateless HTTP workloads and don’t have deep Kubernetes platform expertise, both managed modes are solid defaults. If you’re migrating an existing cluster or have ML/GPU workloads, audit the restriction lists carefully: the gap between what a managed mode allows and what your workloads actually need is where most of the friction comes from.
