We spent six months building the “perfect” FinOps program at a previous company. Comprehensive tagging policy. Beautiful Grafana dashboards. A 14-page runbook for monthly cost reviews.
Nobody used any of it.
The dashboards went stale. The tagging policy had 40% compliance after three months. The monthly cost review became a meeting where one person from finance read numbers off a spreadsheet while engineers checked Slack on their phones.
Here’s what we learned after scrapping that approach and rebuilding it from scratch — this time starting with engineers instead of spreadsheets.
Cost Ownership Belongs to Engineers, Not Finance
The single biggest mistake in FinOps adoption is treating cloud costs as a finance problem. Finance can set budgets. Finance can flag anomalies. But finance cannot fix a Kubernetes cluster running at 30% utilization or an engineer’s decision to use a db.r6g.2xlarge when a db.r6g.large handles the load fine.
Cost ownership means the team that deploys it is responsible for what it costs.
This sounds obvious written down. In practice, most engineering orgs resist it because:
- Engineers were never trained to think about cost
- There’s no feedback loop — you deploy something and never see the bill
- Incentives are misaligned — shipping fast gets rewarded, spending efficiently doesn’t
To make cost ownership stick, you need three things:
Visibility at the team level. Every team should see their own cloud spend, broken down by service, updated daily. Not a monthly PDF from finance. A live dashboard they can check whenever they want. If an engineer can’t answer “how much does my team’s infrastructure cost per month?” within 60 seconds, you don’t have cost ownership.
Budgets with consequences. Not punitive consequences — informational ones. Each team gets a monthly budget based on their actual spend plus a reasonable growth margin. When they exceed it, the team lead gets a notification and is expected to explain why at the next review. Most of the time the answer is legitimate (“we onboarded 3 new customers”). Sometimes it surfaces waste nobody noticed.
Cost as a PR review consideration. The most effective FinOps teams we’ve seen treat cost implications the same way they treat security implications in code review. Not a hard gate, but a consideration. “This PR adds a new RDS instance — have we checked if the existing one has capacity?” becomes a normal question.
A Tagging Strategy That Actually Works
We’ve seen dozens of tagging strategies. The ones that fail share a common pattern: they’re designed by someone who doesn’t deploy infrastructure, ratified in a meeting, published on Confluence, and never enforced.
The ones that work have four properties:
Keep Required Tags to Five or Fewer
Every additional required tag reduces compliance. We’ve measured this. At 3 required tags, most teams maintain 90%+ compliance. At 8 required tags, compliance drops below 60% within two months.
Start here:
| Tag | Example | Purpose |
|---|---|---|
| `team` | platform, payments, ml-infra | Cost allocation |
| `environment` | prod, staging, dev | Environment-level cost tracking |
| `service` | api-gateway, user-service | Service-level cost attribution |
| `managed-by` | terraform, manual, pulumi | Identifies how to modify the resource |
| `cost-center` | ENG-001, DATA-003 | Maps to finance’s chart of accounts |
That’s it. Five tags. Everything else is optional.
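The required-tag rule is simple enough to encode directly. A minimal Python sketch (tag names from the table above; the flat `dict` shape for a resource's tags is an assumption for illustration):

```python
# The five required tag keys from the table above.
REQUIRED_TAGS = {"team", "environment", "service", "managed-by", "cost-center"}

def missing_tags(resource_tags: dict) -> set:
    """Return the required tag keys absent from a resource's tag map."""
    return REQUIRED_TAGS - resource_tags.keys()

# Example: a resource tagged by Terraform but missing cost attribution keys.
tags = {"team": "payments", "environment": "prod", "managed-by": "terraform"}
print(sorted(missing_tags(tags)))  # ['cost-center', 'service']
```

The same check can run in CI against a Terraform plan or in a weekly audit script; the logic doesn't change.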
Enforce Tags at Deploy Time, Not After
Tagging compliance programs that rely on “please go back and tag your resources” are doomed. You need enforcement at the point of creation.
AWS: Use Service Control Policies (SCPs) to deny resource creation without required tags. Here’s the pattern:
```json
{
  "Effect": "Deny",
  "Action": ["ec2:RunInstances", "rds:CreateDBInstance"],
  "Resource": "*",
  "Condition": {
    "Null": {
      "aws:RequestTag/team": "true",
      "aws:RequestTag/environment": "true",
      "aws:RequestTag/service": "true"
    }
  }
}
```
Azure: Use Azure Policy with deny effect for resources missing required tags. Apply it at the management group level so it covers every subscription.
GCP: Use Organization Policy constraints combined with a CI/CD check on Terraform plans. GCP’s native label enforcement is weaker than AWS/Azure, so you’ll rely more on pipeline-level checks.
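A pipeline-level check can parse the output of `terraform show -json <planfile>` and fail the build when a to-be-created resource is missing required labels. A sketch, assuming GCP resources expose labels under the `labels` attribute in the plan JSON:

```python
import json

REQUIRED_LABELS = {"team", "environment", "service"}

def unlabeled_creates(plan_json: str) -> list:
    """Return addresses of to-be-created resources missing required labels.

    Parses `terraform show -json` output; only inspects "create" actions.
    """
    plan = json.loads(plan_json)
    failures = []
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        if "create" not in change.get("actions", []):
            continue
        labels = (change.get("after") or {}).get("labels") or {}
        if REQUIRED_LABELS - labels.keys():
            failures.append(rc["address"])
    return failures

# Example plan fragment: one instance labeled only with `team`.
plan = {
    "resource_changes": [
        {
            "address": "google_compute_instance.api",
            "change": {"actions": ["create"], "after": {"labels": {"team": "platform"}}},
        }
    ]
}
print(unlabeled_creates(json.dumps(plan)))  # ['google_compute_instance.api']
```

Fail the pipeline if the returned list is non-empty and print the offending addresses so the fix is obvious.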
Standardize Tag Values
`production`, `prod`, `Production`, `PROD` — if all four exist in your account, your cost allocation is broken. Enforce allowed values. SCPs and Azure Policy both support enumerated allowed values. For Terraform shops, use a shared module that validates tag values against a central list.
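The allow-list check itself is a few lines. A sketch, with an illustrative (not prescriptive) central list:

```python
# Hypothetical central allow-list: only these values are valid per tag key,
# so "prod", "Production", and "PROD" can't coexist in cost data.
ALLOWED_VALUES = {
    "environment": {"prod", "staging", "dev"},
    "managed-by": {"terraform", "pulumi", "manual"},
}

def invalid_tag_values(tags: dict) -> dict:
    """Return the {tag: value} pairs whose value is not in the allow-list."""
    return {
        key: value
        for key, value in tags.items()
        if key in ALLOWED_VALUES and value not in ALLOWED_VALUES[key]
    }

print(invalid_tag_values({"environment": "Production", "managed-by": "terraform"}))
# {'environment': 'Production'}
```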
Audit Weekly, Not Monthly
Run an automated weekly report that shows untagged resources by team. Send it directly to team leads, not to a shared Slack channel where it gets buried. Our untagged resource count went from 340 to 12 within six weeks once we started sending personalized weekly reports to the responsible team lead.
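The core of that weekly report is just grouping untagged resources by the responsible lead. A sketch; the `(team, resource_id)` pairs and the team-to-lead map are assumed inputs, not any provider's API:

```python
from collections import defaultdict

def untagged_by_lead(untagged: list, leads: dict) -> dict:
    """Map each team lead's address to the untagged resources their team owns.

    Resources with no known lead fall back to a hypothetical platform alias
    so nothing silently disappears from the report.
    """
    report = defaultdict(list)
    for team, resource_id in untagged:
        report[leads.get(team, "platform-lead@example.com")].append(resource_id)
    return dict(report)

print(untagged_by_lead(
    [("payments", "i-0abc"), ("payments", "db-1"), ("ml-infra", "i-0def")],
    {"payments": "dana@example.com", "ml-infra": "lee@example.com"},
))
# {'dana@example.com': ['i-0abc', 'db-1'], 'lee@example.com': ['i-0def']}
```

Send each lead only their own slice; the personalization is what makes the report land.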
Cost Visibility Dashboards: What to Show and Who Sees What
Most cost dashboards fail because they try to show everything to everyone. An executive doesn’t need to see per-pod Kubernetes costs. An engineer doesn’t need to see the company’s total cloud spend trend.
Executive Dashboard (CFO, VP Eng)
- Total cloud spend: monthly trend, 6-month rolling
- Spend by business unit or product line
- Month-over-month change with annotation (why did it go up?)
- Unit economics: cost per customer, cost per transaction, cost per 1,000 API calls
- Budget vs. actual by department
Keep it to one screen. No scrolling. If the CFO has to scroll, they won’t look at it.
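The unit-economics line on that dashboard is a single division, but it is the number executives remember. A sketch with illustrative figures (not from any real bill):

```python
def unit_cost(total_spend: float, units: int) -> float:
    """Cost per unit (customer, transaction, 1,000 API calls), rounded to cents."""
    return round(total_spend / units, 2)

# Illustrative: $47,000 monthly spend across 390,000 active users.
print(unit_cost(47_000, 390_000))  # 0.12 per active user
```

Pick the unit that matches how the business already talks about growth, and keep it stable month to month so the trend is comparable.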
Team Dashboard (Engineering Managers, Team Leads)
- Team’s total spend: daily granularity, 30-day view
- Top 5 services by cost within the team
- Cost anomalies (spikes or unexpected changes)
- Untagged resource count
- Week-over-week delta
This is the dashboard that drives action. It should update daily and be accessible without asking anyone for access.
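The anomaly row on that dashboard can start as a simple day-over-day threshold before you reach for anything statistical. A minimal sketch; the 30% threshold is an assumption to tune:

```python
def flag_spikes(daily_costs: list, threshold: float = 0.3) -> list:
    """Return indices of days whose cost jumped more than `threshold`
    (default 30%) over the previous day."""
    return [
        i for i in range(1, len(daily_costs))
        if daily_costs[i - 1] > 0
        and (daily_costs[i] - daily_costs[i - 1]) / daily_costs[i - 1] > threshold
    ]

# A flat week with one jump on day 3.
print(flag_spikes([100, 102, 101, 150, 148]))  # [3]
```

Crude, but it catches the "someone left a GPU node pool running" class of problem the day it happens instead of at month end.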
Engineering Dashboard (Individual Contributors)
- Service-level cost breakdown for their team’s services
- Resource-level detail: which specific instances, databases, or clusters cost the most
- Idle resource detection: things running that might not need to be
- Recommendations: right-sizing suggestions, unused reservations
Engineers want specifics. Don’t give them pie charts. Give them a table sorted by cost with actionable recommendations next to each row.
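The shape of that table is worth pinning down: resources sorted by cost, descending, each paired with a recommendation. A sketch with hypothetical resource names:

```python
def cost_table(monthly_costs: dict, recommendations: dict) -> list:
    """Rows of (resource, monthly cost, recommendation), sorted by cost
    descending; '-' where there is no recommendation."""
    return [
        (name, cost, recommendations.get(name, "-"))
        for name, cost in sorted(monthly_costs.items(), key=lambda kv: -kv[1])
    ]

rows = cost_table(
    {"rds-primary": 1400, "staging-eks": 980, "nat-gateway": 620},
    {"staging-eks": "scale to 25% of prod", "nat-gateway": "evaluate VPC endpoints"},
)
print(rows[0])  # ('rds-primary', 1400, '-')
```

The empty recommendation column is informative too: it tells an engineer the expensive thing is, as far as the tooling knows, already right-sized.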
The Dashboard Trap
Here’s the mistake we made: we built all three dashboards, announced them in an all-hands, and assumed people would look at them.
They didn’t.
Dashboards are passive. Alerts are active. The dashboard is the reference material. The alert is what drives behavior. Every dashboard should have corresponding alerts: budget threshold exceeded, anomaly detected, new untagged resources found. The alert sends you to the dashboard. The dashboard gives you context. Without the alert, the dashboard is just a pretty page nobody visits.
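The budget-threshold alert described above reduces to a per-team comparison. A sketch; the 80% warning level is an assumption, not a rule:

```python
def budget_alerts(spend: dict, budgets: dict, warn_at: float = 0.8) -> dict:
    """Per-team alert status: 'exceeded' when spend passes the budget,
    'warning' when it passes warn_at * budget. Teams under the warning
    level, or with no budget set, produce no alert."""
    status = {}
    for team, amount in spend.items():
        budget = budgets.get(team)
        if budget is None:
            continue
        if amount > budget:
            status[team] = "exceeded"
        elif amount > warn_at * budget:
            status[team] = "warning"
    return status

print(budget_alerts({"payments": 110, "ml-infra": 85, "platform": 50},
                    {"payments": 100, "ml-infra": 100, "platform": 100}))
# {'payments': 'exceeded', 'ml-infra': 'warning'}
```

Each alert should link straight to the team dashboard so the recipient lands on context, not on a raw number.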
The Monthly Cost Review: Who, What, and How
A monthly cost review that works has a strict format. Here’s what we settled on after a year of iteration:
Who’s in the room:
- FinOps lead or whoever owns cost visibility (facilitator)
- One representative from each engineering team (rotating — not always the manager)
- Finance partner (for budget context, not to present)
- Optional: infrastructure/platform team lead
Duration: 45 minutes. Hard stop. If it goes longer, you’re covering too much.
Agenda:

1. Total spend summary (5 min) — Month-over-month change, budget vs. actual, any anomalies. Just the headline numbers.
2. Top 3 cost changes (15 min) — The three biggest absolute-dollar changes from last month. For each: what changed, why, and whether it’s expected. This is where 80% of the value is. A $4,000 increase in NAT Gateway costs tells you more than reviewing every line item.
3. Team spotlights (15 min) — Rotate through teams. Each team gets 5 minutes once every 2–3 months to present: what they spent, what they optimized, what they’re planning. This creates accountability and knowledge sharing. When one team shares how they cut $800/month by switching from NAT Gateway to VPC endpoints, other teams learn.
4. Action items (10 min) — Every meeting ends with named action items. “Platform team will evaluate VPC endpoints for the data pipeline subnets by March 30” is an action item. “We should look into cost optimization” is not.
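Finding the top 3 absolute-dollar changes for the agenda is mechanical once you have two months of per-service totals. A sketch with illustrative numbers:

```python
def top_changes(last_month: dict, this_month: dict, n: int = 3) -> list:
    """Top-n services by absolute dollar change between two monthly bills.
    Services present in only one month count from/to zero."""
    services = last_month.keys() | this_month.keys()
    deltas = {s: this_month.get(s, 0) - last_month.get(s, 0) for s in services}
    return sorted(deltas.items(), key=lambda kv: -abs(kv[1]))[:n]

print(top_changes(
    {"NAT Gateway": 1200, "RDS": 8000, "S3": 900},
    {"NAT Gateway": 5200, "RDS": 8300, "S3": 880},
))
# [('NAT Gateway', 4000), ('RDS', 300), ('S3', -20)]
```

Sorting by absolute change, not percentage, keeps the meeting focused on dollars: a 400% jump on a $10 line item doesn't deserve agenda time.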
What kills cost reviews:
- Making them longer than 45 minutes
- Presenting 50 slides of charts nobody asked for
- No action items — just information sharing with no follow-up
- Only finance presenting — engineers tune out
Where Should FinOps Sit in Your Org?
This depends on your company size, but here’s what we’ve seen work:
Under 100 engineers: Platform Team
You don’t need a dedicated FinOps person. Add cost visibility to the platform team’s responsibilities. They already own infrastructure tooling, so they’re the natural owners of cost dashboards, tagging enforcement, and optimization recommendations.
The risk: platform teams are busy. Cost work gets deprioritized when there’s a Kubernetes upgrade or a security incident. Mitigate this by making cost review a standing meeting, not a “when we have time” activity.
100–500 engineers: Dedicated FinOps Engineer (Embedded in Platform)
One person whose primary job is cloud cost optimization, sitting on the platform team. They own dashboards, run the monthly review, partner with finance on budgeting, and work with individual teams on optimization projects.
This person should be an engineer, not a finance analyst. They need to understand Kubernetes resource requests, database indexing strategies, and CDN caching patterns to have credible conversations with engineering teams.
500+ engineers: FinOps Team
At this scale you need a team: a FinOps lead, 2–3 FinOps engineers, and ideally a data analyst. They operate as an internal consultancy — each FinOps engineer is embedded with a set of product teams, understands their architecture, and proactively identifies optimization opportunities.
The team reports to the VP of Engineering or CTO, not to Finance. Reporting to Finance creates a dynamic where FinOps is seen as cost police rather than engineering partners. The best FinOps teams feel like they’re on your side, not auditing you.
Common Mistakes We Made (So You Don’t Have To)
Mistake 1: Tagging everything but not acting on the data. We achieved 95% tagging compliance and felt great about it. Then we realized nobody was using the tags to make decisions. Tags without dashboards and dashboards without alerts are just metadata.
Mistake 2: Optimizing before understanding. We jumped straight to buying Reserved Instances because “that’s what you’re supposed to do.” We hadn’t even baselined our spend or understood our usage patterns. We ended up with reservations for instance types we stopped using four months later.
Mistake 3: Making cost optimization a quarterly project. Cost optimization isn’t a project with a start and end date. It’s a practice. The moment you stop paying attention, costs drift back up. We saw a 15% cost increase within two months of “finishing” our optimization project.
Mistake 4: Not connecting cost to business metrics. “We spent $47,000 on AWS this month” means nothing without context. “We spent $47,000 on AWS this month, which is $0.12 per active user, down from $0.15 last quarter” tells a story. Unit economics make cost conversations meaningful for executives and engineers alike.
Mistake 5: Treating all environments equally. Your production environment needs reliability-focused decisions. Your dev and staging environments need cost-focused decisions. We were running staging at the same scale as production “just in case.” Cutting staging to 25% of prod capacity saved $6,200/month with zero impact on developer productivity.
Getting Started This Week
If you’re starting from zero, don’t try to build the whole program at once. Start here:
1. This week: Enable AWS Cost Explorer (or equivalent) and identify your top 5 cost line items. Just know what you’re spending.
2. Next week: Implement 3 required tags and enforce them via SCP or Azure Policy for new resources. Don’t try to backfill yet.
3. Week 3: Build one team-level dashboard showing daily spend by service. Send it to one team lead and ask if it’s useful.
4. Week 4: Run your first cost review meeting. Keep it to 30 minutes. Focus on the top 3 cost changes from the previous month.
That’s four weeks to a functioning FinOps practice. It won’t be perfect. It doesn’t need to be. It needs to create a feedback loop between engineering decisions and their cost impact.
Everything else — comprehensive tagging, executive dashboards, unit economics, optimization projects — builds on that foundation.
Tools like Xplorr can accelerate this by giving you multi-cloud cost visibility, anomaly detection, and team-level dashboards out of the box. But the tool doesn’t matter if the practice isn’t there. Get the practice right first. The tooling follows.