Start Tracking Your Unit Costs to Avoid Surprise Cloud Bills

Two key practices for managing cloud infrastructure costs, with real examples from building and scaling an advertising exchange

Dec 26, 2024

The current crop of cloud infrastructure is really good at a couple things: getting you up and running quickly, and getting you to spend a lot of money.

It’s a great boost to productivity when you can get your AWS or Snowflake instance humming in minutes, but it’s easy to forget that there are actual computers – with nebulous pricing and processes – behind the scenes. The costs of managing data and infrastructure have left many a finance team scratching their heads.

There are horror stories of surprise bills (a $90,000 AWS bill, an $8,000 Snowflake charge) that were dramatically higher than people expected. It’s easy to blame the data engineer or team who incurred the cost, but part of the issue is the design of the tools themselves. They’re not incentivized to safeguard your organization against wasteful spending, but there are a couple of best practices you can implement to prevent it.

There are two things you should do to get ahead of any unpleasant surprises:

Tag everything. What is the environment of this resource? What is the service or application that is consuming the resource? If it’s a query, which API or users are triggering the request?
Track your unit costs. Most services you run should be correlated to some unit of your business. If you run an ecommerce business, your unit cost may be orders or customers. If it’s video streaming, your costs are going to be correlated with hours streamed and active users. If it’s healthtech, it might be a function of patient records or generated reports.
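As a minimal sketch of the "tag everything" practice, the snippet below checks a resource inventory for missing tags. The required tag keys (environment, service) and the sample resources are assumptions made up for illustration; in practice the inventory would come from your cloud provider's billing or resource APIs.

```python
# Hypothetical required tag keys; adjust to your own tagging policy.
REQUIRED_TAGS = ("environment", "service")


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank."""
    return [key for key in required if not resource_tags.get(key, "").strip()]


# Illustrative inventory: resource name -> tags (not real data).
resources = {
    "spark-cluster-1": {"environment": "prod", "service": "data-pipeline"},
    "tmp-bucket": {"environment": "dev"},
}

# Keep only the resources that violate the tagging policy.
untagged = {}
for name, tags in resources.items():
    gaps = missing_tags(tags)
    if gaps:
        untagged[name] = gaps

print(untagged)
```

Running a check like this on a schedule makes untagged spend visible before it shows up as an unexplained line on the bill.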

The key is to understand the primary driver of your service costs and start tracking the unit costs over time. As things get more complex, you’ll need different unit costs for different services and systems. The ultimate goal is to get to apples-to-apples comparisons on a recurring basis, to see if there are any significant changes. If anything looks off, you can then use the tagged costs to dig into those items.
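One way to make the recurring comparison concrete is to divide cost by the business unit for each period and flag large period-over-period swings. This is a rough sketch under stated assumptions: the monthly figures are invented, and the 20% threshold is an arbitrary starting point rather than a recommendation.

```python
def unit_costs(costs, units):
    """Cost per unit for each period that has a non-zero unit count."""
    return {p: costs[p] / units[p] for p in costs if units.get(p)}


def flag_changes(series, threshold=0.2):
    """Return (period, relative_change) where unit cost moved more than threshold."""
    periods = sorted(series)  # "YYYY-MM" strings sort chronologically
    flagged = []
    for prev, cur in zip(periods, periods[1:]):
        if series[prev] == 0:
            continue  # cannot compute a relative change from zero
        change = (series[cur] - series[prev]) / series[prev]
        if abs(change) > threshold:
            flagged.append((cur, round(change, 3)))
    return flagged


# Illustrative numbers only: monthly spend and order counts.
monthly_cost = {"2024-01": 1000.0, "2024-02": 1100.0, "2024-03": 1800.0}
monthly_orders = {"2024-01": 10_000, "2024-02": 11_000, "2024-03": 12_000}

per_order = unit_costs(monthly_cost, monthly_orders)
print(flag_changes(per_order))
```

In this made-up data, total spend rises every month, but only March is flagged, because February's increase is matched by order growth. That is the apples-to-apples view that raw totals hide.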

At TripleLift, we refined a cost tracking system over multiple years that became increasingly automated over time. Since we were running an ad exchange, we knew that the majority of our infrastructure costs were a function of the number of bid requests we sent when running an ad auction. Our data pipeline costs, on the other hand, were correlated with the number of auctions run. By breaking these down and looking at trends over time, we could track patterns and dig into anomalies.
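To illustrate the idea of pairing each system with its own driver, here is a hypothetical sketch that maps tagged service costs to different denominators. The service names, the driver mapping, and all figures are assumptions for illustration, not TripleLift's actual system or numbers.

```python
# Hypothetical mapping from service tag to the business metric that drives its cost.
DRIVERS = {"exchange": "bid_requests", "data-pipeline": "auctions"}


def cost_per_million(cost_by_service, volumes, drivers=DRIVERS):
    """Cost per one million driver units, for services with a known, non-zero driver."""
    result = {}
    for service, cost in cost_by_service.items():
        driver = drivers.get(service)
        volume = volumes.get(driver, 0)
        if driver is None or volume <= 0:
            continue  # skip services without a usable denominator
        result[service] = cost * 1_000_000 / volume
    return result


# Made-up monthly figures, grouped by the "service" tag.
costs = {"exchange": 50_000.0, "data-pipeline": 12_000.0}
volumes = {"bid_requests": 200_000_000_000, "auctions": 4_000_000_000}

for service, value in cost_per_million(costs, volumes).items():
    print(f"{service}: ${value:.2f} per million {DRIVERS[service]}")
```

Keeping the driver mapping explicit means a new service has to declare what it scales with before it can appear in the report.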

For example, when we introduced a new type of data job or report, we would see an increase in unit costs for our data pipeline. On the other hand, when we optimized our Spark clusters to use Spot instances, we saw a decrease in unit costs. Without this visibility, it would have been incredibly difficult to understand what was happening.

The benefits of this sort of tracking compound over time. You can start accurately forecasting your infrastructure costs. You’ll instill a culture of discipline and efficiency within your organization. And, no promises, but you may even score some hard-won brownie points with your finance team.

Here’s the LinkedIn thread that inspired this post.

Twing Data
