aadhish.

How to reduce New Relic data usage without going blind

February 10, 2025

Most teams discover New Relic costs the same way: a bill arrives that's twice what was budgeted, somebody blames "the AI logging stuff," and then there's a panic-driven retention reduction that takes useful data with it.

There's a better way. We've done this exercise a dozen times for customers in the last two years. Here's the playbook.

Where the money actually goes

In our experience, three line items account for most of a typical bill:

  1. High-cardinality custom events. Especially events with user_id or request_id as dimensions.
  2. Verbose logs from a few noisy services. Usually 3 services produce 80% of the log volume.
  3. APM detail that nobody looks at. Distributed traces from low-traffic background jobs at full sampling.

Cut these three thoughtfully and you'll often halve the bill while losing nothing you actually use.

A starter recipe

Here's the rough order of operations we follow:

1. Find the noisy services

SELECT bytecountestimate()/1e9 AS gb
FROM Log
FACET service.name
SINCE 7 days ago
LIMIT 20

You will almost always find that one or two services produce wildly more logs than the rest. Investigate them first.
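Once you have the per-service numbers, it helps to turn them into a short "who do we talk to first" list. Here is a minimal sketch, assuming you have copied the query results into a plain dictionary of service name to GB. The service names and volumes are made up for illustration, and the function name is ours, not a New Relic API.

```python
from typing import Dict, List, Tuple


def services_covering(
    volume_gb: Dict[str, float], target_share: float = 0.8
) -> List[Tuple[str, float]]:
    """Return the largest services, in order, until they cover target_share of volume.

    Each entry is (service_name, share_of_total).
    """
    total = sum(volume_gb.values())
    if total <= 0:
        return []
    picked: List[Tuple[str, float]] = []
    running = 0.0
    ranked = sorted(volume_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, gb in ranked:
        picked.append((name, gb / total))
        running += gb
        if running / total >= target_share:
            break
    return picked


# Hypothetical 7-day log volumes in GB, as if copied from the query results.
sample = {
    "checkout": 420.0,
    "search": 260.0,
    "auth": 130.0,
    "billing": 60.0,
    "email": 40.0,
    "cron": 30.0,
    "admin": 10.0,
}

for name, share in services_covering(sample):
    print(f"{name}: {share:.0%} of log volume")
```

With these made-up numbers, three services cover a little over 80% of the volume, which is the pattern described earlier.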

2. Drop debug-level logs at the source

Don't rely on filtering debug logs once they reach New Relic. Cut them at the agent or shipper; otherwise you're still paying to produce and transmit them.
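"At the source" can be as simple as the application's own log level. A minimal sketch using Python's standard logging module, assuming the level comes from an environment variable; the variable name LOG_LEVEL and the logger name "app" are placeholders, and your agent or shipper will have its own equivalent setting that isn't shown here.

```python
import logging
import os


def configure_logging(env_var: str = "LOG_LEVEL", default: str = "INFO") -> logging.Logger:
    """Set the app logger's threshold from an environment variable."""
    name = os.environ.get(env_var, default).upper()
    level = getattr(logging, name, None)
    if not isinstance(level, int):
        level = logging.INFO  # unknown value: fall back to INFO
    logger = logging.getLogger("app")  # "app" is a placeholder logger name
    logger.setLevel(level)
    return logger


logger = configure_logging()

# Lazy %-style arguments: the message is only formatted if DEBUG is enabled,
# so at INFO this line produces no record and nothing is shipped.
logger.debug("cart contents: %s", {"sku": "A1", "qty": 2})

# Guard genuinely expensive debug payloads explicitly.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("full request dump: %r", "...")
```

Keeping the level in the environment means you can turn debug logging back on for one service during an incident without a code change.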

3. Cap custom event cardinality

If you're sending events with user_id, request_id, or anything else unique-per-request as a dimension, replace those with aggregates. You almost never need per-request facets in dashboards.
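To make "replace those with aggregates" concrete, here is a rough sketch of rolling per-request events up by low-cardinality dimensions before anything is sent. The field names (endpoint, status, duration_ms) and the function are assumptions for illustration; how you actually submit the aggregated rows depends on your setup and is left out.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Sequence


def aggregate_events(
    events: Iterable[Dict[str, Any]],
    dimensions: Sequence[str] = ("endpoint", "status"),
) -> List[Dict[str, Any]]:
    """Collapse per-request events into one row per combination of dimensions.

    Unique-per-request fields such as user_id and request_id are simply
    not carried over, so they never become facets.
    """
    counts: Counter = Counter()
    total_ms: Counter = Counter()
    for event in events:
        key = tuple(event[d] for d in dimensions)
        counts[key] += 1
        total_ms[key] += event["duration_ms"]
    rows = []
    for key, n in sorted(counts.items()):
        row = dict(zip(dimensions, key))
        row["count"] = n
        row["avg_duration_ms"] = total_ms[key] / n
        rows.append(row)
    return rows


# Hypothetical raw events, one per request.
raw = [
    {"user_id": "u1", "request_id": "r1", "endpoint": "/cart", "status": 200, "duration_ms": 100},
    {"user_id": "u2", "request_id": "r2", "endpoint": "/cart", "status": 200, "duration_ms": 300},
    {"user_id": "u3", "request_id": "r3", "endpoint": "/cart", "status": 500, "duration_ms": 50},
]

for row in aggregate_events(raw):
    print(row)
```

Three raw events become two aggregated rows here. The number of rows you send is then bounded by the number of endpoint and status combinations rather than by request volume.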

4. Sample distributed traces

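Full-sample your top 3 critical paths. Sample everything else at 1–5%. You'll keep the traces you actually look at, and as long as the critical paths are a small share of total trace volume, you'll often do it for less than a tenth of the cost.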
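The sampling rule itself is small enough to sketch. This is a generic illustration of the decision, not New Relic agent configuration: the path names, the 2% default, and the function are all assumptions, and in practice you would express the same rule through whatever sampling settings your agent or collector exposes.

```python
import hashlib

# Hypothetical critical paths that are always kept.
CRITICAL_PATHS = {"checkout", "login", "payment"}
DEFAULT_RATE = 0.02  # 2%, inside the 1-5% range suggested above


def keep_trace(trace_id: str, path: str, rate: float = DEFAULT_RATE) -> bool:
    """Decide whether to keep a trace.

    Critical paths are always kept. Everything else is kept when a hash of
    the trace ID falls below the sampling rate, so every service that sees
    the same trace ID makes the same decision.
    """
    if path in CRITICAL_PATHS:
        return True
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rate


def kept_fraction(critical_share: float, rate: float = DEFAULT_RATE) -> float:
    """Expected fraction of traces kept, given the critical paths' share of volume."""
    return critical_share + (1.0 - critical_share) * rate


kept = sum(keep_trace(f"trace-{i}", "nightly-report") for i in range(10_000))
print(f"kept {kept} of 10000 background traces")
print(f"expected overall fraction at 5% critical share: {kept_fraction(0.05):.3f}")
```

Hashing the trace ID instead of rolling a random number keeps traces whole across services. The second helper is a quick sanity check on the savings: what you keep is the critical share plus the sampled remainder, so the result depends heavily on how much of your traffic the critical paths represent.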

5. Set retention by data type

Logs: 7–14 days is enough for most teams. Metrics: 30–90 days. Events: 7 days for high-volume, 30 days for low-volume business events.

What not to do

A few things that have bitten us:

  • Don't blanket-cut retention. Audit by data type. Aggregated metrics are cheap to keep; per-event detail is not.
  • Don't drop production error logs. Whatever you save on the bill, you'll lose tenfold the first time you can't debug a production incident.
  • Don't roll out changes without a baseline. Save your dashboards as JSON before you cut, so you can verify afterwards that nothing critical broke.

We've taken bills from $14k/month down to $4k/month doing exactly this on a customer's fleet, with zero loss of operational visibility. The work isn't glamorous, but it pays back fast.

This piece was written by the Adhish team.