FinOps · 2026-07-03

Reducing cloud costs on AWS, Azure and GCP: what actually works

Most cloud cost problems are not exotic. Across AWS, Azure and GCP, the same handful of patterns produce most of the waste: compute sized for a launch-day estimate nobody revisited, commitment discounts left unused, storage that grows forever, and a bill nobody can attribute to a team. Fixing them is unglamorous work, but it follows a clear sequence and is highly repeatable.

Start with allocation, not savings

Before optimising anything, make the bill attributable. Enforce a small, mandatory tag or label set - team, service, environment - via policy (AWS tag policies and SCPs, Azure Policy, GCP organization policies), and route shared costs deliberately. The point is not accounting cosmetics: engineers only change behaviour when the number they see is theirs.
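
As a rough illustration of what "attributable" means in practice, the sketch below measures how much of a bill sits on resources missing a mandatory tag. The record shape (a dict with "cost" and "tags") and the function names are assumptions made for this example, not any provider's export format.

```python
# Hypothetical record shape: {"cost": float, "tags": {key: value}}.
REQUIRED_TAGS = ("team", "service", "environment")  # the mandatory set named above


def missing_tags(resource_tags: dict[str, str]) -> list[str]:
    """Return required tag keys that are absent or blank (keys compared case-insensitively)."""
    present = {key.lower(): value for key, value in resource_tags.items()}
    return [tag for tag in REQUIRED_TAGS if not present.get(tag, "").strip()]


def untagged_share(resources: list[dict]) -> float:
    """Fraction of total cost on resources that miss at least one required tag."""
    total = sum(r["cost"] for r in resources)
    if total == 0:
        return 0.0
    unattributed = sum(r["cost"] for r in resources if missing_tags(r["tags"]))
    return unattributed / total
```

Tracking this one ratio week over week is usually enough to show whether the tag policy is actually being enforced.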

One week of tagging discipline typically reclassifies the majority of an untagged bill, and it converts every later optimisation from an argument into a dashboard.

Rightsizing and shutdown: the boring majority of savings

Utilisation data tells you which instances, databases and node pools are oversized. Every provider ships the tooling: AWS Compute Optimizer and Cost Explorer rightsizing recommendations, Azure Advisor, GCP recommender. The pattern that actually saves money is making rightsizing a recurring engineering task with an owner rather than a one-off exercise that decays within two quarters.

  • Downsize or terminate instances below meaningful utilisation after checking headroom with the owning team.
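  • Schedule non-production environments off outside working hours. An environment that runs 12 hours on weekdays is up for 60 of the week's 168 hours, which removes roughly 64% of its compute hours (storage and other always-on charges continue).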
  • Hunt orphans: unattached volumes, idle load balancers, old snapshots, stopped-but-billed resources, forgotten NAT gateways.
  • Fix autoscaling floors: a minimum node count set during an incident two years ago is a permanent tax.
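
The first bullet can be sketched as a simple triage over utilisation statistics. The thresholds and the InstanceStats shape below are hypothetical, and CPU alone is not sufficient evidence: memory, I/O and the owning team's knowledge of peaks all belong in the real decision.

```python
from dataclasses import dataclass


@dataclass
class InstanceStats:
    name: str
    cpu_p95: float  # 95th-percentile CPU utilisation over the review window, 0-100
    cpu_max: float  # peak CPU utilisation over the same window, 0-100


# Hypothetical thresholds; tune them per workload class.
IDLE_MAX_CPU = 5.0
DOWNSIZE_P95_CPU = 40.0


def classify(stats: InstanceStats) -> str:
    """Label an instance for review; the result is a candidate list, not an action."""
    if stats.cpu_max < IDLE_MAX_CPU:
        return "terminate-candidate"
    if stats.cpu_p95 < DOWNSIZE_P95_CPU:
        return "downsize-candidate"
    return "keep"
```

The output is deliberately a list of candidates: the headroom check with the owning team still happens before anything is resized or removed.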

Commitment discounts, in the right order

Commitments - AWS Savings Plans and Reserved Instances, Azure Reservations and Savings Plans, GCP committed use discounts - offer large discounts against on-demand pricing in exchange for a one- or three-year commitment. The order matters: rightsize first, then commit, otherwise you lock in the waste.
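
Why the order matters comes down to simple arithmetic: a commitment is paid for whether or not it is used, so it only beats on-demand while its utilisation stays above one minus the discount. A minimal sketch, with the discount expressed as a fraction and no real price list assumed:

```python
def break_even_utilisation(discount: float) -> float:
    """Utilisation below which a commitment costs more than on-demand would have."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be a fraction in [0, 1)")
    return 1.0 - discount


def effective_saving(discount: float, utilisation: float) -> float:
    """Saving relative to paying on-demand for the used hours only.

    A negative result means the commitment is losing money.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be a fraction in (0, 1]")
    return 1.0 - (1.0 - discount) / utilisation
```

If rightsizing later halves the workload a commitment was bought for, utilisation halves with it, and a discount of 50% or less stops saving anything at all.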

Cover the stable baseline conservatively and keep coverage and utilisation under monthly review. Then remember the provider-specific levers: Azure Hybrid Benefit if you hold eligible Windows Server or SQL Server licences, GCP sustained-use discounts that apply automatically to eligible Compute Engine usage, and, on every provider, the trade-off between flexible commitments that follow your compute spend and instance-scoped commitments that are narrower but usually more deeply discounted.
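
The two numbers in that monthly review are easy to confuse, so here is a sketch of both. Providers define them with slightly different inputs; the function and parameter names here are assumptions for illustration, with all amounts expressed in on-demand-equivalent spend.

```python
def coverage(covered_spend: float, on_demand_spend: float) -> float:
    """Share of eligible usage that ran under a commitment."""
    total = covered_spend + on_demand_spend
    return covered_spend / total if total else 0.0


def utilisation(used_commitment: float, purchased_commitment: float) -> float:
    """Share of the purchased commitment that was actually consumed."""
    return used_commitment / purchased_commitment if purchased_commitment else 0.0
```

Low coverage with high utilisation suggests room to commit more; high coverage with falling utilisation suggests the baseline has shrunk underneath the commitment.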

Storage and data transfer: the silent growers

Storage rarely spikes - it creeps. Lifecycle policies (S3 tiering and expiry, Azure Blob access tiers, GCS storage classes) turn the creep into a curve that bends down. Deleting redundant snapshots and stale logs is usually the fastest single win in an older account.
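
For S3, a lifecycle policy is a small piece of configuration. The helper below builds one rule in the shape the S3 lifecycle configuration uses (for example via boto3's put_bucket_lifecycle_configuration); the helper itself, its name and its parameters are hypothetical, and the rule should be checked against current S3 documentation before it is applied to a real bucket.

```python
def lifecycle_rule(rule_id: str, prefix: str, infrequent_after: int,
                   archive_after: int, expire_after: int) -> dict:
    """Build one tier-then-expire rule; all thresholds are object ages in days."""
    if not 0 < infrequent_after < archive_after < expire_after:
        raise ValueError("day thresholds must be positive and strictly increasing")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},  # applies only to keys under this prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": infrequent_after, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after},
    }


# Assumed usage: pass {"Rules": [rule]} as the LifecycleConfiguration argument.
```

Azure Blob lifecycle management and GCS lifecycle rules express the same idea with their own schemas, so the thresholds carry over even though the syntax does not.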

Data transfer deserves one focused review: cross-AZ and cross-region traffic, NAT gateway processing charges and egress patterns are invisible in architecture diagrams but very visible on the bill. Sometimes one service moved to the right zone pays for the whole exercise.
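
A focused review can start with nothing more than volumes per traffic path multiplied by the applicable rate, ranked by cost. The path names and the idea of a flat per-GB rate are simplifying assumptions; real rates vary by provider, region and direction, so take them from your own bill.

```python
def rank_transfer_paths(gb_per_month: dict[str, float],
                        rate_per_gb: dict[str, float]) -> list[tuple[str, float]]:
    """Return (path, monthly cost) pairs, most expensive first."""
    costs = {path: gb * rate_per_gb[path] for path, gb in gb_per_month.items()}
    return sorted(costs.items(), key=lambda item: item[1], reverse=True)
```

The path at the top of this list is the one worth tracing back to an architecture decision.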

Kubernetes and managed services: where waste hides now

In estates that have adopted Kubernetes, the waste has often just moved a layer up. Requests and limits set defensively at launch mean nodes run at low real utilisation while the autoscaler happily adds more; the fix is measuring actual container usage and right-sizing requests, then letting bin-packing do its work. Spot or preemptible capacity for stateless and batch workloads is the single biggest lever most clusters have not pulled. Modern schedulers and node-pool mixes make the interruption risk manageable for the right workload classes.
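
Right-sizing requests from measured usage can be sketched as "a high percentile plus headroom". The percentile, the headroom and the function names are hypothetical choices for this example, and the samples are assumed to be CPU usage in millicores exported from whatever metrics system the cluster already has.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample list."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def recommended_request(usage_millicores: list[float], pct: float = 95,
                        headroom: float = 0.2) -> int:
    """Suggested CPU request: the chosen percentile of usage plus a safety margin."""
    return math.ceil(percentile(usage_millicores, pct) * (1 + headroom))
```

Comparing this figure with the request currently set on the container shows how much of each node is reserved but never used.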

Managed services deserve the same scrutiny as compute: database instances provisioned for a migration peak and never revisited, premium tiers where standard would do, high-availability configurations on non-production environments, and provisioned throughput (IOPS, RU/s, DTUs) far above observed demand. The provider's own utilisation metrics make these visible in an afternoon. They just need someone assigned to look.
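
The afternoon's work amounts to comparing what is provisioned with what was observed. In the sketch below the input shape, the resource names and the 2x cut-off are assumptions for illustration; the unit (IOPS, RU/s, DTUs) does not matter as long as both numbers use the same one.

```python
def overprovisioned(resources: dict[str, tuple[float, float]],
                    max_ratio: float = 2.0) -> list[str]:
    """Names whose provisioned throughput exceeds max_ratio times the observed peak.

    resources maps name -> (provisioned, observed_peak) in the same unit.
    A resource with no observed demand at all is always flagged.
    """
    flagged = []
    for name, (provisioned, observed_peak) in sorted(resources.items()):
        if observed_peak <= 0 or provisioned / observed_peak > max_ratio:
            flagged.append(name)
    return flagged
```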

The failure modes that eat savings

Three patterns undo more cost work than any technical mistake. First, savings without an owner: a cleanup succeeds, nobody owns the follow-up cadence, and eighteen months later the estate is back where it started. Second, optimising the unit price while ignoring the architecture: a workload that should be event-driven or scheduled will out-waste any discount you negotiate for keeping it always-on. Third, treating the engineering team as the enemy of the bill: cost work framed as an audit of engineers produces hiding; framed as giving teams their own numbers and headroom, it produces engagement.

The countermeasure to all three is the same: cost visibility per team, a recurring forum where the numbers are looked at, and the authority to act sitting with the people who see them.

Making it stick: FinOps as a habit

The difference between a cost project and a cost capability is cadence: a monthly review with the top movers, one owner per anomaly, unit metrics that engineering respects (cost per environment, per tenant, per request), and budget alerts wired to the teams that can act. Savings from a one-off cleanup decay; savings from a cadence compound.
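
The agenda for that monthly review can be generated rather than debated. A minimal sketch, assuming per-team (or per-service) monthly totals are already available as plain dictionaries; both function names are hypothetical:

```python
from typing import Optional


def top_movers(previous: dict[str, float], current: dict[str, float],
               n: int = 3) -> list[tuple[str, float]]:
    """Largest month-over-month changes by absolute size, with ties broken by name."""
    keys = set(previous) | set(current)
    deltas = {k: current.get(k, 0.0) - previous.get(k, 0.0) for k in keys}
    return sorted(deltas.items(), key=lambda item: (-abs(item[1]), item[0]))[:n]


def cost_per_request(cost: float, requests: int) -> Optional[float]:
    """Unit cost, or None when there was no traffic to divide by."""
    return cost / requests if requests else None
```

Each entry returned by top_movers then gets exactly one owner, which is the part no script can do.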

FAQ

How much can we realistically save?

It depends on how long the estate has grown unreviewed. There is no honest universal percentage. The reliable statement: an estate that has never had a rightsizing pass, commitment strategy or storage lifecycle review almost always contains material, quickly recoverable waste.

Should we buy one-year or three-year commitments?

Three-year commitments carry much deeper discounts but assume architectural stability. A common pattern is a conservative three-year layer for the proven baseline, with one-year or flexible commitments above it. Never commit to cover a peak.
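
One way to picture the layering is to split a usage history into three bands. The policy below (the observed minimum as the three-year layer, everything up to a chosen percentile as the one-year or flexible layer, the rest left on demand) is a hypothetical illustration rather than a sizing rule, and it assumes usage is given per hour in on-demand-equivalent units.

```python
def commitment_layers(hourly_usage: list[float], flexible_pct: int = 50) -> dict[str, float]:
    """Split usage into a three-year base, a one-year layer and an uncommitted peak."""
    if not hourly_usage:
        raise ValueError("need at least one usage sample")
    ordered = sorted(hourly_usage)
    baseline = ordered[0]  # the level usage never dropped below
    flexible_top = ordered[(len(ordered) - 1) * flexible_pct // 100]
    return {
        "three_year": baseline,
        "one_year": flexible_top - baseline,
        "on_demand_peak": ordered[-1] - flexible_top,  # never covered by commitment
    }
```

A lower flexible_pct is the conservative choice: it leaves more usage on demand but makes it less likely that a commitment sits idle.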