NHI Governance in Identity and Access Management

Sumit Kumar Tiwari

Introduction

For most of IAM's history, the word "identity" implicitly meant a person — an employee, a contractor, a customer. That assumption no longer holds. Modern enterprise environments are dominated by Non-Human Identities (NHIs): service accounts, API keys, OAuth tokens, machine certificates, cloud workload identities, secrets used by CI/CD pipelines, bots, and increasingly, autonomous AI agents.

Industry reports commonly estimate that NHIs outnumber human identities by 40:1 to 100:1 in cloud-native environments. Yet most organizations apply only a fraction of the governance rigor to NHIs that they routinely apply to humans. The result is a sprawling, under-managed identity surface that has become a leading attack vector in recent high-profile breaches, including the 2023 Okta support-system compromise (a service-account credential) and the Microsoft Midnight Blizzard incident disclosed in January 2024 (a legacy test OAuth application), both of which involved poorly governed non-human credentials.

This article lays out what NHI governance is, why it's hard, and how to build a program that actually works.


1. What Counts as a Non-Human Identity?

An NHI is any digital identity that authenticates and acts without a human at the keyboard. NHIs fall into several broad categories:

| Category | Examples |
|---|---|
| Service accounts | Linux/AD service accounts, Kubernetes service accounts, GCP/Azure/AWS service principals |
| Machine credentials | API keys, OAuth client credentials, PATs (personal access tokens used by automation), SSH keys |
| Workload identities | AWS IAM roles assumed by EC2/Lambda, Azure managed identities, GCP workload identity federation, SPIFFE/SPIRE SVIDs |
| Certificates | mTLS certs, code-signing certs, X.509 client certs |
| Secrets | Database connection strings, encryption keys, webhook signing secrets |
| Bots & RPA | UiPath/Automation Anywhere bots, Slack/Teams bots, scripted automations |
| AI agents | LLM-powered agents that call tools, MCP servers, autonomous coding agents |

The taxonomy matters because each class has different lifecycle, ownership, and risk characteristics. An AWS IAM role assumed via STS, which yields short-lived credentials, is a fundamentally different governance problem from a 10-year-old service account with a static password sitting in Active Directory.


2. Why NHI Governance Is Harder Than Human IAM

Human IAM benefits from a few structural advantages that don't exist for NHIs:

  1. Humans have HR records. Joiners, movers, and leavers are tracked by an authoritative source of truth (HRIS). NHIs have no equivalent — they're created ad-hoc by developers, platform teams, vendors, or other automation.

  2. Humans authenticate interactively. MFA, step-up auth, and behavioral signals are all available. NHIs typically present a static secret or a signed assertion — and that's it.

  3. Humans have accountability. A person owns their account. NHIs are often orphaned: the engineer who created them has left the company, the project is sunset, but the credential lives on with production access.

  4. Humans rotate. Passwords expire, sessions end, employment terminates. NHI secrets famously don't rotate — surveys consistently find that 50%+ of service account credentials in enterprises are over a year old, and a meaningful fraction are over five years old.

  5. NHIs are over-privileged by default. When provisioning under time pressure, the path of least resistance is to grant broad permissions ("just give it admin so it works"). Privilege review almost never happens.

  6. Visibility is fragmented. A typical enterprise has NHIs scattered across AD, Okta, AWS IAM, Microsoft Entra ID (formerly Azure AD), GCP, GitHub, GitLab, HashiCorp Vault, Kubernetes, Snowflake, Salesforce, and dozens of SaaS apps. No single pane of glass exists out of the box.


3. The NHI Lifecycle

Effective governance treats NHIs the way mature IAM treats humans: as a lifecycle, not a point-in-time grant. The stages:

3.1 Discovery & Inventory

You cannot govern what you cannot see. The first job is to enumerate every NHI across every system and link it to:

  • Owner (a named human or team — never a distribution list)

  • Purpose (what business function it serves)

  • Consuming workload (which application/pipeline/script uses it)

  • Permissions (what it can do)

  • Last-used timestamp (the single most useful signal for cleanup)

Discovery is continuous, not one-time. New NHIs appear daily; the inventory must be living.
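As a minimal sketch of what one inventory record might carry, the dataclass below mirrors the five attributes listed above. The class name `NhiRecord`, its field names, and the `missing_fields` helper are illustrative assumptions, not a schema from any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class NhiRecord:
    """One inventory entry for a non-human identity (hypothetical schema)."""
    identity_id: str                          # e.g. a service-account name or key ID
    source_system: str                        # where the identity lives, e.g. "aws-iam"
    owner: Optional[str] = None               # named human or team, never a distribution list
    purpose: Optional[str] = None             # business function it serves
    consuming_workload: Optional[str] = None  # application/pipeline/script that uses it
    permissions: list[str] = field(default_factory=list)
    last_used: Optional[datetime] = None      # None means "never observed in use"


def missing_fields(record: NhiRecord) -> list[str]:
    """Return the governance attributes that are still unknown for this record."""
    gaps = []
    if not record.owner:
        gaps.append("owner")
    if not record.purpose:
        gaps.append("purpose")
    if not record.consuming_workload:
        gaps.append("consuming_workload")
    if not record.permissions:
        gaps.append("permissions")
    if record.last_used is None:
        gaps.append("last_used")
    return gaps
```

Running `missing_fields` over every record on each discovery pass gives a simple, repeatable measure of how complete the inventory is.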

3.2 Provisioning & Onboarding

NHIs should be created through a controlled path:

  • Request-based: a ticket or self-service portal, with required justification

  • Owner-attributed: a real person on the hook for the identity

  • Scoped at creation: least privilege baked in from day one, not added later

  • Tagged: with environment, application, cost center, and data-classification labels

Anti-pattern: developers minting their own PATs or service principals with console clicks and no audit trail.
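One way to enforce the controlled path is a validation step in the request workflow. The sketch below is illustrative only: the request shape, the `known_people` directory lookup, and the tag names are assumptions chosen to match the four requirements above.

```python
# Tag keys assumed to correspond to the labels named in the text.
REQUIRED_TAGS = {"environment", "application", "cost_center", "data_classification"}


def validate_request(request: dict, known_people: set[str]) -> list[str]:
    """Return a list of policy violations; an empty list means the request may proceed."""
    problems = []
    if not request.get("justification", "").strip():
        problems.append("missing justification")
    # An owner must resolve to an individual in the directory, not a mailing list.
    if request.get("owner") not in known_people:
        problems.append("owner is not a named person")
    permissions = request.get("permissions") or []
    if not permissions:
        problems.append("no permissions scoped at creation")
    elif "*" in permissions:
        problems.append("wildcard permission requested")
    missing = REQUIRED_TAGS - set(request.get("tags", {}))
    if missing:
        problems.append("missing tags: " + ", ".join(sorted(missing)))
    return problems
```

A portal or ticket automation could refuse to create the identity until the returned list is empty.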

3.3 Authentication & Credential Management

Where possible, eliminate static secrets entirely:

  • Workload identity federation (AWS STS, Azure Managed Identity, GCP Workload Identity, OIDC trust from GitHub Actions)

  • Short-lived certificates via SPIFFE/SPIRE or Vault PKI

  • mTLS with automated cert rotation

Where static secrets are unavoidable:

  • Store in a centralized secrets manager (Vault, AWS Secrets Manager, Azure Key Vault, GCP Secret Manager)

  • Rotate on a defined cadence (30/60/90 days depending on risk)

  • Never commit to source control — enforce with pre-commit hooks and secret scanning
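To illustrate the secret-scanning idea, here is a deliberately small scanner that checks text for two well-known credential shapes. It is a sketch, not a replacement for a dedicated scanner: the rule names and the `scan_text` function are hypothetical, and real tools ship far larger rule sets plus entropy checks.

```python
import re

# Two widely documented credential shapes; production scanners cover many more.
PATTERNS = {
    # Long-term AWS access key IDs start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header that precedes an unencrypted private key.
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings
```

A pre-commit hook could run `scan_text` over each staged file and exit non-zero when any finding is returned, which blocks the commit.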

3.4 Authorization & Least Privilege

NHIs are the single biggest source of privilege creep in cloud environments. Governance requires:

  • Just-enough access: only the permissions actually needed

  • Just-in-time access: elevation on demand for sensitive operations, not standing privilege

  • Conditional access: bind tokens to source IP, source workload identity, time window, and resource ARN — not just to the secret value

  • Permission boundaries / SCPs: cap what an NHI could ever be granted, even if a misconfigured policy slips through
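The conditional-access bullet can be made concrete with an AWS IAM policy document that allows an action only from a given network range, within a time window, on one resource. The helper below is a sketch; the function name and the example ARN and CIDR in the usage note are assumptions, while `IpAddress`, `DateGreaterThan`, `DateLessThan`, `aws:SourceIp`, and `aws:CurrentTime` are standard IAM condition operators and keys.

```python
import json


def scoped_policy(actions, resource_arn, source_cidr, not_before, not_after):
    """Build an IAM policy document that allows `actions` only when all conditions hold."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": list(actions),
            "Resource": resource_arn,            # one resource, not "*"
            "Condition": {
                "IpAddress": {"aws:SourceIp": source_cidr},
                "DateGreaterThan": {"aws:CurrentTime": not_before},
                "DateLessThan": {"aws:CurrentTime": not_after},
            },
        }],
    }


def to_json(policy: dict) -> str:
    """Serialize the policy for attachment through whatever tooling is in use."""
    return json.dumps(policy, indent=2)
```

For example, `scoped_policy(["s3:GetObject"], "arn:aws:s3:::example-reports-bucket/*", "203.0.113.0/24", "2025-01-01T00:00:00Z", "2025-01-02T00:00:00Z")` yields a grant that a stolen secret alone cannot exercise from another network or after the window closes.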

3.5 Monitoring & Anomaly Detection

NHIs should have continuous behavioral baselines. Anomalies worth alerting on:

  • Authentication from a new geography or ASN

  • Spike in API call volume

  • Use of an API never used before by this identity

  • Access outside the workload's normal time window

  • Credential use from outside the workload's expected network

Tooling: CIEM (Cloud Infrastructure Entitlement Management) and dedicated NHI-security products sit in this layer. Vendors in this space include Sonrai, Permiso, Astrix, Oasis, Entro, Clutch, and Aembit.
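The anomaly checks listed above can be sketched as a comparison between one activity summary and a per-identity baseline. Everything here is an assumption made for illustration: the `Baseline` fields, the event dictionary keys, and the default spike factor of 5. Commercial tools build baselines statistically rather than from fixed sets.

```python
from dataclasses import dataclass, field


@dataclass
class Baseline:
    """Expected behavior for one NHI (hypothetical structure)."""
    known_asns: set[int] = field(default_factory=set)
    known_apis: set[str] = field(default_factory=set)
    typical_hourly_calls: float = 0.0
    active_hours_utc: range = range(0, 24)


def detect_anomalies(baseline: Baseline, event: dict, spike_factor: float = 5.0) -> list[str]:
    """Compare one hourly activity summary against the baseline; return alert labels."""
    alerts = []
    if event["asn"] not in baseline.known_asns:
        alerts.append("new_asn")
    new_apis = set(event["apis"]) - baseline.known_apis
    if new_apis:
        alerts.append("new_api:" + ",".join(sorted(new_apis)))
    # Only flag volume when a baseline exists; otherwise every call would be a spike.
    if baseline.typical_hourly_calls and event["call_count"] > spike_factor * baseline.typical_hourly_calls:
        alerts.append("volume_spike")
    if event["hour_utc"] not in baseline.active_hours_utc:
        alerts.append("outside_time_window")
    return alerts
```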

3.6 Periodic Access Review

Every NHI should be reviewed on a cadence proportional to its risk:

  • High-risk (production data access, admin scope): quarterly

  • Medium-risk: semi-annually

  • Low-risk (read-only dev): annually

Reviews must be assigned to the owner, not to a generic security mailbox. If the owner cannot justify continued access, the identity is revoked. The "stale last-used" signal makes this review largely automatable: anything unused for 90+ days is a presumptive deletion candidate.
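The cadence and the stale-credential rule combine into a small decision function. This is a sketch under stated assumptions: quarterly, semi-annual, and annual are approximated as 90, 180, and 365 days, and the function name and return labels are hypothetical.

```python
from datetime import datetime, timedelta

# Review intervals approximating quarterly / semi-annual / annual cadences.
REVIEW_INTERVAL_DAYS = {"high": 90, "medium": 180, "low": 365}
STALE_AFTER = timedelta(days=90)


def review_action(risk, last_reviewed, last_used, now):
    """Return "deletion_candidate", "review_due", or "ok" for one NHI."""
    # Unused for 90+ days (or never used) outranks the review calendar.
    if last_used is None or now - last_used >= STALE_AFTER:
        return "deletion_candidate"
    if now - last_reviewed >= timedelta(days=REVIEW_INTERVAL_DAYS[risk]):
        return "review_due"
    return "ok"
```

Identities returned as `review_due` would be routed to their named owner; `deletion_candidate` results feed the decommissioning process rather than triggering an immediate delete.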

3.7 Decommissioning

The hardest stage, because nobody wants to be the person who breaks production by deleting the wrong service account. Best practices:

  • Disable before delete: deactivate first, watch for failures for 14–30 days, then delete

  • Sunset-on-leave: when a human leaves, audit every NHI they own and reassign or revoke

  • Project-end sweep: when a project is sunset, its NHIs are sunset with it

  • Hard rotate on suspicion: if compromise is suspected, rotate immediately and investigate after
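Disable-before-delete is essentially a three-state machine with an observation window. The sketch below assumes a 30-day window by default (the upper end of the range above) and a boolean `failures_reported` signal; both names, and the `State` enum, are illustrative.

```python
from datetime import datetime, timedelta
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    DISABLED = "disabled"
    DELETED = "deleted"


def next_state(state, disabled_at, failures_reported, now, observe_days=30):
    """Advance an NHI that has been marked for decommissioning by one step."""
    if state is State.ACTIVE:
        return State.DISABLED            # never jump straight to deletion
    if state is State.DISABLED:
        if failures_reported:
            return State.ACTIVE          # something still depended on it: restore, then investigate
        if now - disabled_at >= timedelta(days=observe_days):
            return State.DELETED         # quiet for the whole window
    return state                         # still observing, or already deleted
```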


4. The Five Pillars of an NHI Governance Program

To operationalize the lifecycle above, organizations typically organize their program around five pillars:

Pillar 1 — Visibility

A unified inventory of NHIs across every identity provider, cloud platform, SaaS app, and code repository. This is the foundation; everything else fails without it.

Pillar 2 — Ownership

Every NHI has a named human owner and a named business application. Ownership is recorded, periodically re-affirmed, and re-assigned on termination.

Pillar 3 — Least Privilege

Permissions are scoped narrowly, granted just-in-time where possible, and continuously right-sized based on actual usage telemetry.

Pillar 4 — Lifecycle Automation

Creation, rotation, review, and decommissioning are all driven by automated workflows tied to authoritative sources (HRIS for owner status, CMDB for application status, code repo for workload existence).
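A reconciliation job is one plausible shape for this automation: compare the inventory against the authoritative sources and emit actions. The sketch assumes the HRIS and CMDB exports have already been reduced to sets of active employee and application names; the dictionary keys and action labels are hypothetical.

```python
def reconcile(nhis, active_employees, active_applications):
    """Return (identity_id, action) pairs for NHIs whose owner or application is gone."""
    actions = []
    for nhi in nhis:
        if nhi["application"] not in active_applications:
            # The application is retired in the CMDB, so the identity has no purpose.
            actions.append((nhi["id"], "decommission"))
        elif nhi["owner"] not in active_employees:
            # The workload is still live but its owner has left: reassign or revoke.
            actions.append((nhi["id"], "reassign_owner"))
    return actions
```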

Pillar 5 — Threat Detection & Response

Behavioral baselines, anomaly detection, and an incident-response playbook specific to compromised NHIs, including rapid rotation, blast-radius mapping, and tracing of downstream credentials that the compromised identity could read or issue.


5. The AI Agent Wrinkle

The rise of autonomous AI agents — LLMs that call tools, write code, and act on behalf of users — adds a new and uncomfortable dimension to NHI governance:

  • Agents are NHIs that reason. Unlike a script with fixed behavior, an agent may chain unexpected tool calls. Its blast radius is harder to predict.

  • Agents inherit credentials. When a developer hands an agent a GitHub token, an AWS key, and a database password to "go fix the bug," the agent now holds an identity bundle that no human review process has scrutinized.

  • Prompt injection becomes a credential-exfiltration vector. An agent that reads untrusted input may be coerced into exposing its own credentials.

  • Action-on-behalf-of-user semantics are unclear. When an agent acts, is the audit trail attributed to the agent's identity, the human who launched it, or both? Most systems today log only one — usually the wrong one.

Emerging patterns worth tracking:

  • MCP server scopes — narrowly scoped tools rather than blanket API access

  • Agent identity standards — early work on issuing distinct identities to agent instances, with delegated authority from a human principal

  • Tool-call audit logs — recording every action an agent took, with the prompt context that led to it

This is a frontier area. The right posture for governance teams is to treat AI agents as high-risk NHIs until the tooling matures: short-lived credentials, narrow scopes, mandatory logging, and human-in-the-loop for any production-mutating action.
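The posture described above (mandatory logging, dual attribution, human-in-the-loop for production-mutating actions) could be sketched as a gate in front of every tool call. All names here are assumptions: the tool names in `MUTATING_TOOLS`, the `prompt_ref` pointer to stored prompt context, and the record fields do not come from any agent framework or standard.

```python
import json
from datetime import datetime, timezone

# Hypothetical names of tools that change state.
MUTATING_TOOLS = {"deploy", "db_write", "delete_resource"}


def authorize_tool_call(agent_id, human_principal, tool, environment, prompt_ref, approved_by=None):
    """Decide whether an agent tool call may run and build its audit record."""
    needs_approval = environment == "production" and tool in MUTATING_TOOLS
    allowed = (not needs_approval) or approved_by is not None
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,             # the agent's own identity
        "on_behalf_of": human_principal,  # the human who launched it
        "tool": tool,
        "environment": environment,
        "prompt_ref": prompt_ref,         # pointer to the prompt context behind the call
        "approved_by": approved_by,
        "allowed": allowed,
    }
    return allowed, json.dumps(record)
```

Recording both `agent_id` and `on_behalf_of` on every call addresses the attribution question raised above, and denied calls are logged just like permitted ones.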


6. Common Anti-Patterns

A short list of patterns to actively eliminate:

  • Shared service accounts used by multiple applications. Ownership is impossible to attribute; rotation breaks everyone.

  • Owner = ops-team@company.com. A distribution list is not an owner.

  • Static, never-rotating API keys in production environments.

  • Personal Access Tokens used for automation. When the employee leaves, the automation breaks — or worse, doesn't, and continues running with a credential that should have been revoked.

  • Permission-by-copy-paste. "Just give it the same permissions as that other service" — and now both have permissions neither needs.

  • Secrets passed as environment variables and echoed into CI/CD logs. A surprisingly common leak source.

  • Treating discovery as a one-time project. It must be continuous.


7. Metrics That Matter

A governance program is only as good as the metrics that drive it. The most useful ones:

| Metric | What it tells you |
|---|---|
| % of NHIs with a named human owner | Coverage of the ownership pillar |
| % of NHIs with credentials rotated in the last 90 days | Hygiene of the credential layer |
| # of NHIs with no usage in 90+ days | Cleanup backlog |
| Mean time to revoke after owner termination | Effectiveness of the leaver workflow |
| % of NHIs using federated/short-lived credentials vs. static | Maturity of the auth layer |
| # of over-privileged NHIs (privilege granted but never used) | Right-sizing opportunity |
| # of NHIs flagged for anomalous behavior in the last 30 days | Detection coverage |

Track these monthly, report quarterly to leadership, and use them to prioritize remediation.
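Several of these metrics fall straight out of the inventory. The sketch below computes five of them from a list of dictionaries; the key names (`owner`, `last_rotated`, `last_used`, `credential_type`, `granted`, `used`) are assumed for illustration and would map to whatever the real inventory stores.

```python
from datetime import datetime, timedelta


def governance_metrics(nhis, now):
    """Compute a subset of the metrics above from a list of inventory dicts."""
    total = len(nhis)
    if total == 0:
        return {}
    cutoff = now - timedelta(days=90)
    owned = sum(1 for n in nhis if n.get("owner"))
    rotated = sum(1 for n in nhis if n.get("last_rotated") and n["last_rotated"] >= cutoff)
    stale = sum(1 for n in nhis if n.get("last_used") is None or n["last_used"] < cutoff)
    federated = sum(1 for n in nhis if n.get("credential_type") == "federated")
    # Over-privileged: at least one granted permission that was never exercised.
    over_privileged = sum(1 for n in nhis if set(n.get("granted", [])) - set(n.get("used", [])))
    return {
        "pct_with_owner": 100 * owned / total,
        "pct_rotated_90d": 100 * rotated / total,
        "stale_90d_count": stale,
        "pct_federated": 100 * federated / total,
        "over_privileged_count": over_privileged,
    }
```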


8. Where to Start

For organizations beginning an NHI governance program, the practical sequence is:

  1. Inventory. Pick the three highest-risk environments (typically: production cloud, source-code platform, identity provider) and enumerate every NHI. Don't try to boil the ocean.

  2. Assign ownership. Every NHI gets a named owner. Anything you cannot attribute is a candidate for immediate disablement.

  3. Identify the worst offenders. Stale, over-privileged, or shared NHIs in production. Fix the top 20 before designing the perfect program for the next 2,000.

  4. Enforce a creation policy. All new NHIs must come through the controlled path. Stop the bleeding before draining the lake.

  5. Eliminate static secrets in greenfield work. New workloads use workload identity federation by default.

  6. Instrument the lifecycle. Automate review, rotation, and decommissioning workflows. Manual processes do not scale.

  7. Iterate. Expand coverage, tighten policy, and increase automation. Governance is a program, not a project.
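Step 3 amounts to a ranking problem. A simple way to produce the "top 20" list, sketched here with equal weights as an assumption, is to count how many of the three risk flags each production NHI carries; the flag names and dictionary keys are hypothetical.

```python
RISK_FLAGS = ("stale", "over_privileged", "shared")


def worst_offenders(nhis, top_n=20):
    """Return IDs of the top_n production NHIs, ranked by number of risk flags set."""
    scored = [
        (sum(bool(n.get(flag)) for flag in RISK_FLAGS), n["id"])
        for n in nhis
        if n.get("environment") == "production"
    ]
    flagged = [item for item in scored if item[0] > 0]
    # Highest score first; ties broken by ID so the output is deterministic.
    flagged.sort(key=lambda item: (-item[0], item[1]))
    return [identity_id for _, identity_id in flagged[:top_n]]
```

A real program would likely weight the flags by privilege level or data sensitivity, but even an unweighted count yields a defensible first remediation queue.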

Conclusion

NHI governance is no longer optional. The volume of non-human identities, their privilege levels, and the rate at which they're created have outstripped the manual processes that worked for human IAM. Many of the breaches making headlines today are NHI-driven: a stolen API key, a leaked service-account credential, an OAuth token that should have been revoked years ago.

The good news: the playbook is known. Visibility, ownership, least privilege, lifecycle automation, and threat detection — applied with the same discipline mature IAM programs already apply to humans — close the gap. The organizations that build this muscle now will spend the next decade ahead of the regulatory curve, the attacker curve, and the AI-agent curve. The ones that don't will keep paying the cost in incident reports.

NHIs are the silent majority of your identity estate. Treat them that way.
