Platform · 2026-07-03
Cloud landing zone best practices for 2026
A landing zone is the pre-built foundation a workload lands in: account structure, identity, network, guardrails and logging, decided once and enforced everywhere. Teams that skip it do not avoid the decisions: they make them implicitly, one console click at a time, and excavate the result years later. These are the practices that hold up in regulated environments in 2026.
Multi-account structure is non-negotiable
The account (AWS), subscription (Azure) or project (GCP) is the strongest isolation boundary the provider gives you: blast radius, IAM scope and cost boundary in one. A workable baseline: management, security/audit, log-archive and shared-services accounts, then separate production and non-production accounts per workload, organised into organizational units (AWS), management groups (Azure) or folders (GCP) that mirror how policy differs rather than the org chart.
Resist both extremes. One account 'to keep it simple' is how governance dies; an account per microservice is operational overhead without a matching risk boundary.
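A minimal sketch of that baseline as data, in Python. The OU paths and account naming scheme are illustrative assumptions, not a prescribed layout; the point is that the structure follows policy tiers and that every workload gets exactly one production and one non-production account.

```python
from __future__ import annotations

from dataclasses import dataclass

# Foundation accounts from the baseline; the OU names are illustrative assumptions.
FOUNDATION = {
    "management": "Root",
    "security-audit": "Security",
    "log-archive": "Security",
    "shared-services": "Infrastructure",
}


@dataclass(frozen=True)
class Account:
    name: str
    ou: str  # organizational unit, management group or folder path


def plan_accounts(workloads: list[str]) -> list[Account]:
    """Return the foundation accounts plus a prod and non-prod account per workload."""
    accounts = [Account(name, ou) for name, ou in FOUNDATION.items()]
    for workload in sorted(workloads):
        # OUs split by policy tier (prod vs non-prod), not by team or org chart.
        accounts.append(Account(f"{workload}-prod", "Workloads/Prod"))
        accounts.append(Account(f"{workload}-nonprod", "Workloads/NonProd"))
    return accounts
```

Calling plan_accounts(["payments", "billing"]) yields eight accounts: the four foundation accounts and two per workload.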
Identity before anything else
Every long-lived credential in a CI pipeline or on a laptop is a breach waiting for a leak. Federate humans through SSO with MFA into short-lived sessions (AWS IAM Identity Center, Microsoft Entra ID, Google Cloud Identity), and use workload identity federation (OIDC) for CI/CD instead of stored keys. Define a small set of permission roles per account tier rather than bespoke policies per person.
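As one illustration of keyless CI/CD, the sketch below builds an AWS IAM role trust policy for GitHub Actions OIDC. It assumes GitHub Actions as the CI system and AWS as the provider; the account ID and repository name are placeholders, and the function name is hypothetical. The condition on the sub claim is what limits the role to one repository and branch.

```python
def github_oidc_trust_policy(account_id: str, repo: str, branch: str = "main") -> dict:
    """Trust policy letting one GitHub repo/branch assume a role without stored keys."""
    provider = "token.actions.githubusercontent.com"  # GitHub's OIDC issuer host
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience used when GitHub Actions federates into AWS STS.
                        f"{provider}:aud": "sts.amazonaws.com",
                        # Pin the role to one repository and branch.
                        f"{provider}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }
```

Leaving the sub condition out, or widening it with a wildcard across an organisation, turns the role into something any repository in scope can assume, so it is worth reviewing as carefully as the permissions themselves.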
Create break-glass access with hardware-token protection and alerting on use, then verify the alert actually fires. An unmonitored break-glass account is just a backdoor with paperwork.
Guardrails: preventive first, detective always
Preventive controls (AWS SCPs, Azure Policy deny effects, GCP organization policy constraints) stop entire failure classes: deploying outside approved regions, disabling audit logging, making storage public, deleting the log archive. Keep the preventive set small and non-negotiable: it defines what is impossible, not what is discouraged.
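A region restriction is a typical member of that small set. The sketch below generates an AWS service control policy that denies requests outside approved regions, assuming AWS Organizations. The exemption list for global services is illustrative and incomplete, and the function name is hypothetical; a real policy needs the exemptions checked against the services actually in use.

```python
import json

# Global services that are not tied to a workload region.
# Illustrative, not exhaustive: verify against the services you run.
DEFAULT_EXEMPT = ("iam:*", "organizations:*", "sts:*", "support:*",
                  "route53:*", "cloudfront:*")


def region_deny_scp(allowed_regions, exempt_actions=DEFAULT_EXEMPT) -> dict:
    """Build an SCP that denies every non-exempt action outside the allowed regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                # NotAction: the deny applies to everything except these actions.
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(region_deny_scp({"eu-west-1", "eu-central-1"}), indent=2))
```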
Detective controls (AWS Config, Security Hub, Microsoft Defender for Cloud, GCP Security Command Center) catch the rest. The practice that matters: every detective finding has an owner and a routing path. A compliance dashboard nobody triages is decoration.
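The owner-and-routing rule can be sketched in a few lines. Everything here is hypothetical: the Finding shape, the owner map, the fallback queue and the severity threshold stand in for whatever your findings tool and ticketing system actually provide. What matters is that no finding can come out of the function without an owner.

```python
from __future__ import annotations

from dataclasses import dataclass

FALLBACK_OWNER = "platform-triage"  # hypothetical catch-all queue


@dataclass(frozen=True)
class Finding:
    account: str
    control: str
    severity: str  # "LOW", "MEDIUM", "HIGH" or "CRITICAL"


def route(finding: Finding, owners: dict[str, str]) -> tuple[str, str]:
    """Return (owner, channel) for a finding; unowned accounts go to the fallback."""
    owner = owners.get(finding.account, FALLBACK_OWNER)
    # Assumed threshold: high-severity findings page someone, the rest become tickets.
    channel = "page" if finding.severity in {"HIGH", "CRITICAL"} else "ticket"
    return owner, channel
```

A finding that lands on the fallback owner is itself a signal: the account exists without a registered owner, which the vending process should have prevented.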
Network and logging defaults
Hub-and-spoke remains the sane default: centralized egress and inspection, non-overlapping IP space planned up front (painful to retrofit), private connectivity to provider services by default, and DNS resolution decided centrally. Most workloads never need a public subnet: make private the default and public the exception that requires review.
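Planning non-overlapping address space is easy to automate with Python's standard ipaddress module. The sketch below carves equal-sized blocks out of one supernet and checks a plan for overlaps. The supernet, prefix length and account names in the usage note are assumptions for illustration.

```python
from __future__ import annotations

import ipaddress
from itertools import combinations


def allocate_spokes(supernet: str, prefix_len: int, names: list[str]) -> dict:
    """Assign each name the next free block of the given prefix length."""
    blocks = ipaddress.ip_network(supernet).subnets(new_prefix=prefix_len)
    plan = {}
    for name in names:
        try:
            plan[name] = next(blocks)
        except StopIteration:
            raise ValueError(f"{supernet} has no /{prefix_len} left for {name}") from None
    return plan


def overlapping(plan: dict) -> list[tuple[str, str]]:
    """Return every pair of names whose networks overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(plan.items(), 2)
        if net_a.overlaps(net_b)
    ]
```

For example, allocate_spokes("10.0.0.0/8", 16, ["hub", "payments-prod", "payments-nonprod"]) hands out 10.0.0.0/16, 10.1.0.0/16 and 10.2.0.0/16 in that order. Running overlapping() over a plan that also includes legacy or on-premises ranges is a cheap check to put in the platform repo's CI.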
Send all control-plane and platform logs - CloudTrail, activity logs, audit logs, flow logs where proportionate - to the log-archive account with retention matching your regulatory requirements and object-lock or immutability enabled. Centralized, tamper-resistant logging is the single artefact every framework from ISO 27001 to DORA asks about.
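For the immutability part, a minimal sketch assuming the log archive is an S3 bucket with Object Lock. The helper name is hypothetical and the retention period is a placeholder to be replaced with your regulatory requirement. The returned dictionary has the shape that the S3 PutObjectLockConfiguration API accepts.

```python
def object_lock_config(retention_days: int, mode: str = "COMPLIANCE") -> dict:
    """Build a default-retention Object Lock configuration for a log-archive bucket."""
    if mode not in {"GOVERNANCE", "COMPLIANCE"}:
        raise ValueError(f"unknown Object Lock mode: {mode}")
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": retention_days}},
    }


# Possible use with boto3 (not executed here; bucket name is a placeholder):
#   s3.put_object_lock_configuration(
#       Bucket="example-log-archive",
#       ObjectLockConfiguration=object_lock_config(365),
#   )
```

Compliance mode cannot be shortened or removed by any user for the retention period, which is the property auditors ask about; governance mode allows principals with a specific permission to bypass it, so choose deliberately.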
Cost governance belongs in the foundation
The cheapest moment to enforce cost allocation is account creation. Bake the mandatory tag or label set into the vending process so an account cannot exist without an owner, a team and a cost centre; attach a default budget with alerts wired to that owner from day one. Retrofitting allocation onto a grown estate is a quarter of archaeology - enforcing it in the landing zone is a form field.
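The "form field" can be as simple as a validation step in the vending pipeline. This sketch assumes a hypothetical request shape (a tag dictionary plus a monthly budget) and uses the three mandatory keys named above; the exact tag names are illustrative.

```python
from __future__ import annotations

# Mandatory allocation tags; the exact key names are an assumption.
REQUIRED_TAGS = ("owner", "team", "cost-centre")


def validate_request(tags: dict, monthly_budget: float | None) -> list[str]:
    """Return a list of problems; an empty list means the account may be vended."""
    errors = [
        f"missing tag: {key}"
        for key in REQUIRED_TAGS
        if not str(tags.get(key, "")).strip()  # absent or blank both count as missing
    ]
    if monthly_budget is None or monthly_budget <= 0:
        errors.append("missing default budget")
    return errors
```

A request with only an owner tag and no budget comes back with three errors, and the pipeline refuses to create the account until they are fixed.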
The same applies to the guardrails with direct cost impact: restricting expensive instance families and regions in non-production, requiring lifecycle policies on storage by default, and blocking the resource types nobody should be creating manually. None of this constrains a legitimate workload - it constrains the accidents.
A realistic build order for the first 30 days
Landing zone projects stall when they try to design everything before building anything. A sequence that works in practice:
- Days 1-5: organisation structure, management and log-archive accounts, SSO federation and break-glass - the decisions that are hardest to reverse, made deliberately.
- Days 6-12: baseline Terraform modules (logging, IAM roles, network skeleton), state backend and CI pipeline for the platform repo itself.
- Days 13-20: preventive guardrail set (regions, public storage, audit-log protection), centralized log delivery verified end-to-end.
- Days 21-30: account vending automated, first real workload account created through it, detective controls routed to a triage owner.
- After day 30: migrate workloads in waves by coupling, never big-bang; each wave hardens the modules for the next.
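One possible reading of "waves by coupling" is dependency order: a workload moves only after everything it depends on has moved. Under that assumption, the standard-library graphlib module can derive the waves. The workload names and dependency map below are invented for illustration.

```python
from __future__ import annotations

from graphlib import TopologicalSorter


def plan_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group workloads into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical estate: checkout needs payments and catalog, payments needs ledger.
EXAMPLE = {
    "checkout": {"payments", "catalog"},
    "payments": {"ledger"},
    "catalog": set(),
    "ledger": set(),
}
```

With the example map, plan_waves(EXAMPLE) gives three waves: catalog and ledger first, then payments, then checkout. A cycle error is useful output too: it identifies workloads so tightly coupled that they have to move together.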
Everything in Terraform, accounts included
The landing zone itself must be code: org structure, policies, network, IAM roles, logging - reviewed in pull requests, applied by pipeline, drift-detected. Account creation should be a vending process (a module call plus an approval), so a new team gets a compliant account in hours with baselines pre-applied.
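Drift detection can be a scheduled pipeline job built on terraform plan -detailed-exitcode, which exits 0 when there are no changes, 1 on error and 2 when the plan contains changes. A minimal sketch follows; the function names are hypothetical, and it assumes terraform is on the PATH and the working directory is already initialised.

```python
import subprocess


def classify_plan_exit(code: int) -> str:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    # 0 = no changes, 2 = changes present; anything else is treated as an error.
    return {0: "in-sync", 2: "drift"}.get(code, "error")


def check_drift(workdir: str) -> str:
    """Run a plan in workdir and report whether state and reality still match."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

Note that exit code 2 also fires for code changes that have not been applied yet, so run the check against the last applied commit if you want it to flag only out-of-band changes.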
Two patterns pay for themselves: a small set of versioned baseline modules every account consumes, and state isolation per account so one bad apply cannot take out the estate. If migration onto the landing zone is part of the job, plan workload moves per wave: foundation first, then workloads in order of coupling.
FAQ
Should we use the provider's own landing zone product?
AWS Control Tower, Azure landing zone accelerators and Google's setup tooling are reasonable starting points, but treat them as scaffolding. You still own guardrail content, identity design and everything they generate; keeping it in Terraform keeps it reviewable.
We already have 40 resources in one account. Too late?
No. This is the most common starting point. Build the landing zone beside the legacy account, move workloads in planned waves, and keep the old account as a quarantined tenant until it is empty. Retrofitting in place is usually slower than migrating out.