Behind the Gateway: Building a Centralised API Layer for a Mission-Critical Platform

Twenty-seven web applications. All of them sitting directly behind application gateways. No centralised API management. No security controls at the API layer. No visibility into who was calling what, how often, or whether any of it was behaving the way it should.

That was the starting point when I got pulled into this engagement. A large organisation running mission-critical systems at national scale needed to modernise how their APIs were managed — or more accurately, they needed to start managing them at all.

The Problem Nobody Wanted to Talk About

Here's the thing about exposed APIs: they work. Traffic comes in, hits the application gateway, gets routed to the backend, and a response goes out. Everything looks fine. Until it isn't.

The 2025 State of API Security Report from Traceable AI and the Ponemon Institute found that 57% of organisations had suffered API-related breaches in the past two years. Of those, 73% experienced three or more incidents. These aren't edge cases. This is the norm.

And the detection story is even worse. Only 21% of organisations report a high ability to detect attacks at the API layer. That means nearly four out of five organisations are essentially flying blind when it comes to API-level threats.

The organisation I was working with wasn't breached. But they had no way of knowing if they had been. No rate limiting. No subscription management. No analytics on API usage patterns. No centralised policy enforcement. Every application was its own island, and the only thing standing between the public internet and the backend was an application gateway doing basic routing.

Why Azure APIM Was the Right Call

There were a few options on the table. Build custom middleware. Use a third-party API gateway. Or go with Azure API Management, which was already native to their cloud environment.

APIM won for three reasons. First, it integrated directly with their existing Azure infrastructure — VNets, application gateways, managed identities, Key Vault. No stitching together disparate systems. Second, it supported the private networking model they needed: all API traffic flowing through private endpoints, never touching the public internet after the initial gateway. Third, the policy engine is genuinely powerful. Rate limiting, CORS, security headers, JWT validation, URL rewriting, backend authentication via managed identity — all configurable per-API or globally, with inheritance that lets you set baseline policies and override where needed.

The target architecture was clean:

Internet → Application Gateway (WAF) → APIM (Private) → Backend Services (Private)

Every API request now passes through a centralised management layer that authenticates, authorises, rate-limits, logs, and routes. The backend services themselves are completely isolated — no public access, no direct exposure. If you're not coming through the gateway, you're not getting in.

Four Environments, Zero Downtime

One of the non-negotiable constraints was zero downtime during migration. This wasn't a greenfield deployment where you could take things offline for a weekend. These systems were live, serving real users, and any interruption was unacceptable.

The environment strategy split into production and non-production. Development, test, and UAT shared a single APIM instance for cost efficiency; production got its own dedicated instance. Each environment maintained strict segregation — separate VNets, separate deployment pipelines, separate security boundaries.

The migration itself was phased. Start with non-production. Prove the architecture works end-to-end. Validate that traffic flows correctly through every layer. Then move to production with confidence, not hope.

Each phase followed the same pattern: deploy APIM infrastructure, configure VNet integration and private endpoints, set up NSG rules to lock down traffic paths, import APIs with their policies, validate connectivity, and test. Methodical. Boring. Exactly how infrastructure migration should be.

The Build: Policies, Private Endpoints, and Managed Identity

The real value of APIM isn't just routing traffic. It's what happens to that traffic along the way.

Every API got a layered policy stack. At the global level: correlation IDs for request tracing, standard security headers, baseline rate limiting. At the API level: specific authentication requirements, backend URL configuration, path rewriting. At the operation level: fine-grained controls where individual endpoints needed different treatment.
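As a sketch of what the global scope looked like (the specific values here are illustrative, not the production configuration):

```xml
<!-- Global scope (All APIs): runs before any API-level policy -->
<inbound>
    <!-- Attach a correlation ID for request tracing if the caller didn't supply one -->
    <set-header name="X-Correlation-Id" exists-action="skip">
        <value>@(Guid.NewGuid().ToString())</value>
    </set-header>
    <!-- Baseline rate limit keyed per subscription (falls back to caller IP);
         individual APIs tighten this at their own scope -->
    <rate-limit-by-key calls="1000" renewal-period="60"
        counter-key="@(context.Subscription?.Id ?? context.Request.IpAddress)" />
</inbound>
<outbound>
    <!-- Standard security headers on every response -->
    <set-header name="X-Content-Type-Options" exists-action="override">
        <value>nosniff</value>
    </set-header>
    <set-header name="Strict-Transport-Security" exists-action="override">
        <value>max-age=31536000</value>
    </set-header>
</outbound>
```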

<!-- API-level policy: strip path prefix before forwarding to backend -->
<inbound>
    <base />
    <set-backend-service base-url="https://backend-service.azurewebsites.net" />
    <rewrite-uri template="/" />
    <rate-limit calls="100" renewal-period="60" />
</inbound>

The policy inheritance model takes some getting used to. The <base /> tag pulls in policies from the parent scope. Remove it, and you break inheritance — useful when you need an operation to skip API-level JWT validation, but easy to get wrong if you don't understand the hierarchy.

For backend services that required authentication, managed identity was the answer. APIM acquires a token using its system-assigned managed identity, attaches it to the outgoing request, and the backend validates it. No secrets stored in configuration. No API keys floating around in environment variables. The identity is tied to the APIM instance itself, and permissions are granted through role assignments — granular, auditable, and revocable.
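In policy terms, that token acquisition is a single element at the API scope. A minimal sketch, assuming the backend expects a Bearer token and using an illustrative audience value:

```xml
<!-- API scope: acquire a token with APIM's system-assigned managed identity
     and attach it as a Bearer token on the forwarded request.
     The resource (audience) value below is illustrative. -->
<inbound>
    <base />
    <authentication-managed-identity resource="api://backend-app-id" />
</inbound>
```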

Private endpoints were the final piece. Backend services had their public access disabled entirely. The only path in was through the VNet, through APIM, through the application gateway. Three layers of controlled access before a request touches application code.

What Got Tricky

Not everything was smooth.

The first surprise was DNS resolution in private networks. Custom domains that worked fine when accessed externally didn't resolve from within the APIM subnet. The fix was straightforward — use the Azure default domain for backend URLs instead of custom domains — but it cost a few hours of head-scratching before the penny dropped. When you're inside the VNet, you play by VNet rules.
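The fix amounted to a one-line change in the backend configuration (hostname illustrative):

```xml
<!-- Inside the VNet the custom domain didn't resolve, so the backend URL
     points at the Azure default domain instead -->
<set-backend-service base-url="https://app-backend-prod.azurewebsites.net" />
```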

The second challenge was policy inheritance. APIM's policy hierarchy runs Global → API → All Operations → Individual Operation. JWT validation configured at the API level executes before operation-level policies. If you need one specific operation to bypass authentication (say, a health check endpoint), you can't just add a skip at the operation level. You need to understand how <base /> works and selectively break inheritance. The documentation explains this, but the mental model only clicks once you've debugged it hands-on.
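A sketch of that selective break, for a hypothetical health-check operation: omitting <base /> from the operation's <inbound> section means the API-level inbound policies (including JWT validation) never run for that one operation, while the other sections keep inheriting normally.

```xml
<!-- Operation scope, health-check endpoint only -->
<inbound>
    <!-- Deliberately no <base /> here: API-level inbound policies
         (including validate-jwt) are skipped for this operation -->
    <rate-limit-by-key calls="30" renewal-period="60"
        counter-key="@(context.Request.IpAddress)" />
</inbound>
<backend>
    <base />
</backend>
<outbound>
    <base />
</outbound>
```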

The third was coordination. This wasn't a solo effort. Backend teams needed to provide endpoint documentation. Infrastructure teams needed to configure application gateway routing rules. Identity teams needed to grant workspace permissions for managed identity access to downstream services. Every dependency was a potential blocker, and some of them were. The technical architecture was ready days before the permissions and configurations from other teams caught up.

By the Numbers

The engagement centralised 27 web applications behind a managed API layer across four environments, with private networking, subscription-based access control, and comprehensive monitoring.

  • Dev hours avoided: 320 hours — the equivalent of a traditional consulting engagement spanning discovery, architecture design, infrastructure deployment, policy development, testing, and documentation
  • Ongoing role replaced: 1 Cloud Architect / API Platform Engineer ($160,000/year based on 2025-2026 Australian market rates via SEEK and Glassdoor) who would otherwise manage API security policies, gateway routing, subscription management, and platform operations on a recurring basis
  • Annual operational savings: $160,000

The API Management market itself is projected to grow from $6.51 billion in 2025 to $30.81 billion by 2033, according to Polaris Market Research. Organisations aren't just adopting API management because it's trendy. They're adopting it because the cost of not managing APIs — in breaches, in lost visibility, in manual operations — is getting harder to ignore.

The Pattern Worth Remembering

If you're running any kind of platform with multiple backend services exposed to the internet, here's the question that matters: do you know what's happening at your API layer right now?

Not yesterday's logs. Not a monthly security report. Right now.

Centralised API management isn't about adding complexity. It's about replacing invisible risk with visible control. Every request authenticated. Every endpoint rate-limited. Every call logged and traceable. Every backend service hidden behind private networking where it belongs.

The organisation I worked with went from having no API-layer visibility across 27 applications to having a single pane of glass showing every request, every response time, every failed authentication attempt. The architecture didn't just improve their security posture — it gave them something they never had before: certainty about what was actually happening in their own infrastructure.

That's the real value behind the gateway. Not just blocking the bad stuff. Knowing what's there in the first place.
