The 5 Azure Managed Identity Mistakes I See in Every Client Environment
Stop making these common Azure managed identity mistakes — learn the real-world consequences of over-scoped RBAC, shared identities, and lifecycle misunderstandings with practical CLI fixes.
I've spent the last few years auditing, architecting, and fixing Azure environments for clients ranging from startups to enterprises. Managed identity is one of those features that Microsoft rightfully pushes hard — no secrets to rotate, no credentials in code, tight integration with Entra ID. Super important stuff.
But honestly? The implementation is where things fall apart. I keep seeing the same five mistakes. Over and over. In environments that have passed security reviews. In environments built by senior engineers.
Let me walk through each one, show you what goes wrong in production, and give you the CLI commands to fix it.
Mistake 1: Using User-Assigned Identity When System-Assigned Would Be Simpler and More Secure
What People Do Wrong
A team creates a user-assigned managed identity, assigns it to a single Azure Function, gives it permissions to one storage account, and moves on. They never share that identity with anything else. They never move it between resources.
So why did they make it user-assigned in the first place?
Why It Happens
The docs talk about user-assigned identities first in a lot of places. People read "you can reuse it across resources" and think that sounds like good architecture. Future-proofing. Flexibility.
What Actually Happens in Production
You now have an identity whose lifecycle is completely decoupled from the resource it serves. Delete the Function App? The identity and its RBAC assignments are still sitting there. Orphaned. Still has permissions. Nobody cleans it up because nobody remembers it exists.
I've seen client environments with 40+ orphaned user-assigned identities, each with active role assignments. That's 40 attack surfaces that serve zero purpose.
The Fix
If the identity only needs to belong to one resource, use system-assigned. It gets created with the resource and deleted with the resource.
# Enable system-assigned identity on an Azure Function
az functionapp identity assign \
--name my-function-app \
--resource-group rg-production
# The output gives you the principalId — use it for RBAC
# {
# "principalId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
# "tenantId": "...",
# "type": "SystemAssigned"
# }
Compare that to the user-assigned approach:
# Creating a user-assigned identity (only do this if you need it on multiple resources)
az identity create \
--name id-shared-storage-reader \
--resource-group rg-identity
# Assigning it to a resource
az functionapp identity assign \
--name my-function-app \
--resource-group rg-production \
--identities /subscriptions/<sub-id>/resourceGroups/rg-identity/providers/Microsoft.ManagedIdentity/userAssignedIdentities/id-shared-storage-reader
Docs vs Reality: The docs present both types as equally valid starting points. In reality, system-assigned should be your default. Only reach for user-assigned when you have a concrete reason, like multiple resources needing the same identity.
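To make the "use the principalId for RBAC" step concrete, here is a minimal sketch that enables a system-assigned identity and grants it one role in a single pass. The function name and its arguments are hypothetical; the az commands are the same ones used elsewhere in this post, with --query added to capture the principal ID directly.

```shell
# Sketch: enable a system-assigned identity on a Function App and grant it
# exactly one role at one scope. Function name and arguments are hypothetical.
assign_system_identity_role() {
  local app_name="$1" resource_group="$2" role="$3" scope="$4"
  local principal_id

  # Enabling the identity returns its principalId; capture it directly
  principal_id=$(az functionapp identity assign \
    --name "$app_name" \
    --resource-group "$resource_group" \
    --query principalId -o tsv) || return 1

  # Bail out rather than create an assignment for an empty principal
  [ -n "$principal_id" ] || { echo "no principalId returned" >&2; return 1; }

  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "$role" \
    --scope "$scope"
}
```

You would call it along the lines of assign_system_identity_role my-function-app rg-production "Storage Blob Data Reader" followed by the storage account's resource ID as the scope.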
Mistake 2: Sharing One Managed Identity Across Resources That Shouldn't Trust Each Other
What People Do Wrong
A team creates one user-assigned managed identity and assigns it to their API app, their background processor, their admin portal, and their reporting service. One identity to rule them all. Sounds efficient, right?
Why It Happens
Fewer identities means fewer role assignments to manage. It feels cleaner. "They all need access to the same storage account anyway."
What Actually Happens in Production
The blast radius explodes. If the background processor gets compromised — maybe through a deserialization vulnerability in a queue message — the attacker now has every permission you gave to the API, the admin portal, and the reporting service. Because they're all the same identity.
I've seen this break in client environments where a low-priority batch job shared an identity with a customer-facing API. The batch job needed write access to a blob container. The API only needed read. But because they shared an identity, the API effectively had write access too. A single SSRF vulnerability could have turned into data modification.
The Fix
One identity per trust boundary. Resources that serve different purposes or have different security profiles get their own identity.
# Create separate identities for separate trust boundaries
az identity create --name id-api-prod --resource-group rg-identity
az identity create --name id-batch-prod --resource-group rg-identity
az identity create --name id-admin-prod --resource-group rg-identity
# Assign minimal permissions to each
# API: read-only to blob storage
az role assignment create \
--assignee-object-id $(az identity show --name id-api-prod --resource-group rg-identity --query principalId -o tsv) \
--assignee-principal-type ServicePrincipal \
--role "Storage Blob Data Reader" \
--scope /subscriptions/<sub-id>/resourceGroups/rg-data/providers/Microsoft.Storage/storageAccounts/stproddata
# Batch: write access to a single container (built-in role here; Mistake 3 tightens this with a custom role)
az role assignment create \
--assignee-object-id $(az identity show --name id-batch-prod --resource-group rg-identity --query principalId -o tsv) \
--assignee-principal-type ServicePrincipal \
--role "Storage Blob Data Contributor" \
--scope /subscriptions/<sub-id>/resourceGroups/rg-data/providers/Microsoft.Storage/storageAccounts/stproddata/blobServices/default/containers/batch-output
Notice the scope difference. The API gets read on the whole storage account. The batch job gets write on a single container. Big difference.
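Those long scope strings are easy to get wrong by hand. As a rough sketch, a small helper can build either the account-level or the container-level scope from its parts. The function name storage_scope is hypothetical; the path segments are the ones used in the commands above.

```shell
# Sketch: build an RBAC scope for a storage account, or for one container
# inside it when a fourth argument is given. Helper name is hypothetical.
storage_scope() {
  local sub="$1" rg="$2" account="$3" container="${4:-}"
  local scope="/subscriptions/${sub}/resourceGroups/${rg}/providers/Microsoft.Storage/storageAccounts/${account}"

  # Narrow to a single container only when one was requested
  if [ -n "$container" ]; then
    scope="${scope}/blobServices/default/containers/${container}"
  fi
  printf '%s\n' "$scope"
}
```

Passing the result to --scope keeps the narrow-versus-wide decision visible at the call site: three arguments means the whole account, four means one container.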
Mistake 3: Over-Scoping RBAC — Giving Contributor When a Custom Role With 3 Permissions Would Do
What People Do Wrong
Someone needs their managed identity to read blobs from a storage account. They assign the Contributor role at the subscription level. Done.
That identity can now create VMs, delete databases, modify networking rules, and deploy whatever it wants. For reading blobs.
Why It Happens
Built-in roles are easy. Contributor is the "it just works" role. Custom roles feel like overhead. And honestly, the Azure portal makes it really tempting — there's a big dropdown, you pick Contributor, you move on with your day.
What Actually Happens in Production
You've given a single application identity the keys to your entire subscription. If that identity gets abused — through a code vulnerability, token theft, or a misconfigured endpoint — the attacker has Contributor access to everything.
I've audited environments where 15+ managed identities all had Contributor at subscription scope. The actual permissions each one needed? Typically 3-5 specific actions on 1-2 resources.
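If you want to check whether your own subscription looks like this, a query along these lines is a reasonable starting point. It is a sketch, not a complete audit: the function name is hypothetical, it only looks at assignments made directly at the subscription scope, and the principalType filter matches regular service principals as well as managed identities.

```shell
# Sketch: list Contributor/Owner assignments held by service principals
# directly at subscription scope. Function name is hypothetical.
find_broad_assignments() {
  local subscription_id="$1"

  az role assignment list \
    --scope "/subscriptions/${subscription_id}" \
    --query "[?principalType=='ServicePrincipal' && (roleDefinitionName=='Contributor' || roleDefinitionName=='Owner')].{Principal:principalId, Role:roleDefinitionName, Scope:scope}" \
    -o table
}
```

Anything this returns is worth a second look, since few workloads genuinely need that much.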
The Fix
Step 1: Figure out the minimum permissions you actually need.
# List what a built-in role actually grants (control-plane actions and data-plane dataActions)
az role definition list --name "Storage Blob Data Contributor" --output json \
| jq '.[0].permissions[0] | {actions, dataActions}'
Step 2: Create a custom role with only those permissions.
# Create a custom role definition
az role definition create --role-definition '{
"Name": "Blob Writer - Batch Output Only",
"Description": "Can read and write blobs; intended for assignment at the batch-output container scope",
"Actions": [],
"NotActions": [],
"DataActions": [
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/add/action",
"Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
],
"NotDataActions": [],
"AssignableScopes": [
"/subscriptions/<sub-id>/resourceGroups/rg-data"
]
}'
Step 3: Assign the custom role at the narrowest possible scope.
# Assign to the managed identity at container scope
az role assignment create \
--assignee-object-id $(az identity show --name id-batch-prod --resource-group rg-identity --query principalId -o tsv) \
--assignee-principal-type ServicePrincipal \
--role "Blob Writer - Batch Output Only" \
--scope /subscriptions/<sub-id>/resourceGroups/rg-data/providers/Microsoft.Storage/storageAccounts/stproddata/blobServices/default/containers/batch-output
Pro tip: Use az role assignment list --assignee <principal-id> --all to audit what a managed identity can actually do. Run this regularly. You'll be surprised.
# Audit all role assignments for a managed identity
az role assignment list \
--assignee a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
--all \
--output table
Mistake 4: Assuming Managed Identity Works Everywhere
What People Do Wrong
A team designs their entire authentication architecture around managed identity. CI/CD pipelines, on-prem agents, third-party integrations, multi-cloud workloads — everything will use managed identity. Clean. No secrets anywhere.
Then they start building and realize half their workloads can't use it.
Why It Happens
Microsoft's marketing (and honestly, their docs) push managed identity as the answer to service authentication. And for workloads running inside Azure, it mostly is. But the moment you step outside the Azure compute boundary, it doesn't work.
What Actually Happens in Production
Teams hit walls and start improvising. I've seen GitHub Actions workflows using hardcoded service principal secrets in repository settings because someone assumed managed identity would "just work" with external CI/CD. Azure DevOps self-hosted agents running on-prem can't get managed identity tokens from IMDS. Third-party SaaS platforms integrating with Azure APIs need service principals with certificates.
The worst case I've seen: a team spent two weeks trying to make managed identity work in a Kubernetes cluster running in AWS that needed to talk to Azure Key Vault. The answer was workload identity federation with a service principal — but they burned days because the architecture diagram said "managed identity everywhere."
The Fix
Know where managed identity works and where it doesn't.
Where managed identity works:
- Azure VMs and VM Scale Sets
- Azure App Service and Function Apps
- Azure Container Apps and Azure Container Instances
- Azure Kubernetes Service (AKS) via workload identity
- Azure Logic Apps, Data Factory, API Management
- Azure Arc-enabled servers (extends to on-prem/multi-cloud)
Where you still need service principals or workload identity federation:
- GitHub Actions (use OIDC federation with the azure/login action)
- On-premises servers (without Azure Arc)
- Multi-cloud workloads (AWS, GCP compute)
- Third-party SaaS integrations
- Azure Cloud Services (classic) — no managed identity support at all
- Azure DevOps self-hosted agents not running on Azure compute
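When you're not sure which list a machine falls into, you can probe for a token endpoint before designing around it. On Azure VMs, token requests go to the Instance Metadata Service (IMDS) at 169.254.169.254; App Service and Functions expose a different local endpoint through the IDENTITY_ENDPOINT variable, so this sketch only covers the VM-style case. The function names are hypothetical, and a failure can also mean the VM simply has no identity enabled.

```shell
# Sketch: check whether IMDS will hand out a managed identity token.
# Function names are hypothetical; the default resource is ARM, URL-encoded.
imds_token_url() {
  local resource="${1:-https%3A%2F%2Fmanagement.azure.com%2F}"
  printf 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=%s' "$resource"
}

has_managed_identity() {
  # IMDS requires the Metadata header; the short timeout keeps this from
  # hanging on machines where the address is unreachable
  curl -s -f --max-time 3 -H "Metadata: true" "$(imds_token_url "${1:-}")" >/dev/null 2>&1
}
```

A check like has_managed_identity || echo "no IMDS token here, plan for federation or a service principal" at the top of a bootstrap script turns a two-week surprise into a one-line message.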
# For GitHub Actions: set up OIDC federation instead of secrets
# Step 1: Create an app registration and service principal
# Capture the appId straight from the create call; display names are not unique
APP_ID=$(az ad app create --display-name "github-actions-deploy" --query appId -o tsv)
az ad sp create --id "$APP_ID"
# Step 2: Add federated credential for GitHub OIDC
az ad app federated-credential create --id $APP_ID --parameters '{
"name": "github-actions-main",
"issuer": "https://token.actions.githubusercontent.com",
"subject": "repo:your-org/your-repo:ref:refs/heads/main",
"audiences": ["api://AzureADTokenExchange"]
}'
# Step 3: Assign roles to the service principal (not the app)
SP_OBJECT_ID=$(az ad sp show --id $APP_ID --query id -o tsv)
az role assignment create \
--assignee-object-id $SP_OBJECT_ID \
--assignee-principal-type ServicePrincipal \
--role "Contributor" \
--scope /subscriptions/<sub-id>/resourceGroups/rg-deploy
Docs vs Reality: The docs say "use managed identity when possible." True. But they don't always make it obvious where the boundary is. If your workload runs outside Azure compute, stop and evaluate before assuming managed identity will work.
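One detail that trips people up with GitHub OIDC federation: the subject in the federated credential has to match what the workflow's token presents, and that string changes depending on whether the job runs from a branch, a tag, an environment, or a pull request. A small helper can build it, as a sketch; the function name is hypothetical, and the subject formats are the ones GitHub documents for its OIDC tokens.

```shell
# Sketch: build the "subject" value for a GitHub Actions federated credential.
# Function name is hypothetical.
github_oidc_subject() {
  local repo="$1" kind="$2" value="${3:-}"

  case "$kind" in
    branch)       printf 'repo:%s:ref:refs/heads/%s\n' "$repo" "$value" ;;
    tag)          printf 'repo:%s:ref:refs/tags/%s\n' "$repo" "$value" ;;
    environment)  printf 'repo:%s:environment:%s\n' "$repo" "$value" ;;
    pull_request) printf 'repo:%s:pull_request\n' "$repo" ;;
    *)            echo "unknown kind: $kind" >&2; return 1 ;;
  esac
}
```

For the main-branch credential in this post, github_oidc_subject your-org/your-repo branch main produces the same subject string used in the federated credential above. If a workflow targets a GitHub environment instead, the branch-style subject won't match and the login step fails.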
Mistake 5: Not Understanding System-Assigned vs User-Assigned Lifecycle Behavior
What People Do Wrong
Teams use managed identities without understanding what happens when resources are deleted, moved, or recreated. They get surprised when permissions disappear or when orphaned identities pile up.
Why It Happens
The lifecycle behavior is documented, but it's buried. People learn "system = one resource, user = many resources" and stop there. The critical difference is about what happens during the lifecycle of the resource, not just during creation.
What Actually Happens in Production
Scenario 1: A team uses system-assigned identity on a VM. They delete and recreate the VM for maintenance. The new VM gets a new system-assigned identity with a new principal ID. The old role assignments still reference the deleted principal, so none of them apply to the new identity. The app starts throwing 403s and nobody knows why.
Scenario 2: A team uses user-assigned identity everywhere. They decommission services but never delete the identities. Six months later, there are dozens of identities with active permissions that aren't assigned to any resource.
The Fix
Understand the lifecycle contract:
| Behavior | System-Assigned | User-Assigned |
|---|---|---|
| Created | When the resource is created (or identity enabled) | Independently, before resource creation |
| Deleted | When the resource is deleted | Manually, independent of resources |
| Shared across resources | No — 1:1 relationship | Yes — 1:many relationship |
| Resource recreation | New identity, new principal ID, old role assignments don't carry over | Same identity survives, RBAC intact |
| Orphan risk | None — dies with the resource | High — must be manually cleaned up |
| RBAC management | Tied to resource lifecycle | Decoupled — must manage separately |
For system-assigned, automate the RBAC re-assignment:
# Script to re-assign roles after VM recreation
RESOURCE_GROUP="rg-production"
VM_NAME="vm-api-server"
# Get the new principal ID after recreation
PRINCIPAL_ID=$(az vm identity show \
--resource-group $RESOURCE_GROUP \
--name $VM_NAME \
--query principalId -o tsv)
# Re-assign the required roles
az role assignment create \
--assignee-object-id $PRINCIPAL_ID \
--assignee-principal-type ServicePrincipal \
--role "Storage Blob Data Reader" \
--scope /subscriptions/<sub-id>/resourceGroups/rg-data/providers/Microsoft.Storage/storageAccounts/stproddata
az role assignment create \
--assignee-object-id $PRINCIPAL_ID \
--assignee-principal-type ServicePrincipal \
--role "Key Vault Secrets User" \
--scope /subscriptions/<sub-id>/resourceGroups/rg-security/providers/Microsoft.KeyVault/vaults/kv-prod
For user-assigned, run regular cleanup audits:
# Find all user-assigned identities
az identity list --query "[].{Name:name, ResourceGroup:resourceGroup, PrincipalId:principalId}" -o table
# Check if an identity is actually assigned to any resource
# (If this returns empty, the identity is orphaned)
# The identity IDs are object keys, so match them with keys(); the comparison is
# case-sensitive, so adjust the ID if your resource IDs come back with different casing
IDENTITY_ID="/subscriptions/<sub-id>/resourceGroups/rg-identity/providers/Microsoft.ManagedIdentity/userAssignedIdentities/id-batch-prod"
az resource list \
--query "[?identity.userAssignedIdentities && contains(keys(identity.userAssignedIdentities), '$IDENTITY_ID')].name" \
-o tsv
Pro tip: If you're using Infrastructure as Code (Terraform, Bicep), system-assigned identities add an ordering constraint: the resource has to exist before its principal ID does, and the role assignment needs that principal ID. Referencing the resource's principal ID in the role assignment usually gives the tool the dependency it needs. If it doesn't, add an explicit depends_on in Terraform or move the role assignment into its own module in Bicep. User-assigned identities avoid this problem entirely since you create them first.
The Decision Framework: Which Identity Type Should You Use?
Stop overthinking it. Here's the decision tree I use with every client:
Use System-Assigned when:
- The identity serves exactly one resource
- You want automatic cleanup when the resource is deleted
- The resource won't be frequently deleted and recreated
- You're managing infrastructure manually or with simple IaC
Use User-Assigned when:
- Multiple resources need the same set of permissions
- Resources are frequently deleted and recreated (e.g., ephemeral VMs, scale sets)
- You need to pre-configure RBAC before the target resource exists
- You want to separate identity management from resource management
Use a Service Principal when:
- The workload runs outside Azure (CI/CD, on-prem, multi-cloud)
- The service genuinely doesn't support managed identity
- You need certificate-based authentication for regulatory compliance
- Prefer OIDC federation over client secrets whenever possible
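The "pre-configure RBAC before the target resource exists" case from the user-assigned list is worth seeing end to end. Here is a minimal sketch, assuming the identity lives in its own resource group; the function name and arguments are hypothetical, and the az commands are the same ones used earlier in this post.

```shell
# Sketch: create a user-assigned identity, grant its role up front, and print
# the identity's resource ID so it can be attached to a resource later.
# Function name and arguments are hypothetical.
provision_identity_first() {
  local identity_name="$1" identity_rg="$2" role="$3" scope="$4"
  local principal_id

  principal_id=$(az identity create \
    --name "$identity_name" \
    --resource-group "$identity_rg" \
    --query principalId -o tsv) || return 1

  # RBAC is in place before any compute resource exists
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "$role" \
    --scope "$scope" >/dev/null || return 1

  # The resource ID is what --identities expects when attaching later
  az identity show \
    --name "$identity_name" \
    --resource-group "$identity_rg" \
    --query id -o tsv
}
```

When the Function App or VM is finally created, pass the printed ID to az functionapp identity assign --identities (or the equivalent for your resource type) and the permissions are already waiting.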
# Quick audit: see all managed identities and their assignments in your subscription
echo "=== System-Assigned Identities ==="
# contains() also catches resources whose type is "SystemAssigned, UserAssigned"
az resource list --query "[?identity.type && contains(identity.type, 'SystemAssigned')].{Name:name, Type:type, PrincipalId:identity.principalId}" -o table
echo "=== User-Assigned Identities ==="
az identity list -o table
echo "=== Role Assignments for Service Principals (managed identities included) ==="
az role assignment list --all \
--query "[?principalType=='ServicePrincipal'].{Principal:principalId, Role:roleDefinitionName, Scope:scope}" \
-o table
Wrapping Up
Managed identity is genuinely one of the best security features in Azure. No secrets. No rotation schedules. Tight Entra ID integration. But the implementation details matter.
The truth is, most of these mistakes come from reasonable-sounding decisions. "Use one identity for simplicity." "Give Contributor so it works." "Managed identity everywhere." They all sound right until production teaches you otherwise.
Audit your environment. Check your scoping. Understand your lifecycle behavior. And stop giving Contributor at subscription scope. Please.
Got questions about managed identity in your environment? Drop them in the comments — I read every one.