GKE Autopilot Sucks

I am a big fan of GCP, and I was quite excited to move to GKE Autopilot. I am pretty familiar with creating my own clusters with Kubespray and with using off-the-shelf GCP with managed node pools, but I wanted Autopilot because my workload fluctuated in both deployments and demand, and I wanted the cost to track that usage rather than the peak node pool configuration.

Yes, I know node pools can be configured with surge values, but GKE still tends to over-allocate far more than needed, so I wanted the billing to be as granular as I could get it.

Cloud Logging is optional for Standard, but not for GKE Autopilot. Notice how System is enabled here by default as well.

However, one big caveat is that using GKE Autopilot means we must also use GCP Logging & Monitoring. You may turn off your own workload logs, but the system logs are forcibly enabled. This seemed innocent at first, but after a couple of GKE releases it quickly became an issue.

Cloud Logging billed $220/day for 2 days!

This happened over a New Year weekend, which was unfortunate, because it went unnoticed. By the time we took action, we had been billed over $400 more than we should have been.
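A jump like this is easy to catch if something compares each day's cost against a recent baseline. The following is only a minimal sketch of that idea, not something we had running: the function name `find_cost_spikes`, the 3x threshold, and the 7-day window are all assumptions, and the quiet-day figures in the usage example are made up for illustration. Only the two $220 days come from the actual bill.

```python
from statistics import median


def find_cost_spikes(daily_costs, factor=3.0, window=7):
    """Return indices of days whose cost exceeds `factor` times the
    median cost of the preceding `window` days.

    `factor` and `window` are illustrative defaults, not tuned values.
    """
    spikes = []
    for i, cost in enumerate(daily_costs):
        # Look only at the days before day i, at most `window` of them.
        history = daily_costs[max(0, i - window):i]
        if not history:
            continue  # no baseline yet for the first day
        baseline = median(history)
        if baseline > 0 and cost > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical quiet days followed by the two $220 days from the bill.
costs = [5.0, 6.0, 5.5, 220.0, 220.0]
print(find_cost_spikes(costs))  # [3, 4]
```

Using the median rather than the mean keeps the baseline from being dragged up by the first spike day, so the second day is still flagged. In practice the daily figures would have to come from a billing export, which this sketch does not cover.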

After this fiasco, I moved our cluster back to the original Standard node pool with big nodes. Even after adjusting for over-provisioning, it was still more cost-effective than Autopilot. It is sad, because Autopilot might have been the more environmentally friendly choice, but the way its pricing and services are tied together makes for a worse experience. My personal takeaway is a reinforced belief that it is best to stay away from managed Kubernetes, as it is not really required in most use cases.