Most teams discover New Relic costs the same way: a bill arrives that's twice what was budgeted, somebody blames "the AI logging stuff," and then there's a panic-driven retention reduction that takes useful data with it.
There's a better way. We've done this exercise a dozen times for customers in the last two years. Here's the playbook.
Where the money actually goes
In our experience, three line items account for most of a typical bill:
- High-cardinality custom events. Especially events with user_id or request_id as dimensions.
- Verbose logs from a few noisy services. Usually 3 services produce 80% of the log volume.
- APM detail that nobody looks at. Distributed traces from low-traffic background jobs at full sampling.
Cut these three thoughtfully and you'll often halve the bill while losing nothing you actually use.
A starter recipe
Here's the rough order of operations we follow:
1. Find the noisy services
SELECT bytecountestimate()/1e9 AS gb
FROM Log
FACET service.name
SINCE 7 days ago
LIMIT 20
You will almost always find that one or two services produce wildly more logs than the rest. Investigate them first.
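Once you have the query results, it helps to turn them into a cumulative share so the "few noisy services" are obvious. Here is a minimal Python sketch of that step. The service names and GB figures are made up for illustration, and `services_covering` is our own helper, not part of any New Relic tooling.

```python
from typing import List, Tuple


def services_covering(rows: List[Tuple[str, float]], share: float = 0.8) -> List[Tuple[str, float]]:
    """Return the largest services, in order, until their combined volume
    reaches `share` of the total. Each entry carries the cumulative share."""
    total = sum(gb for _, gb in rows)
    if total <= 0:
        return []
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    picked = []
    running = 0.0
    for name, gb in ranked:
        running += gb
        picked.append((name, running / total))
        if running / total >= share:
            break
    return picked


# Made-up numbers standing in for the (service.name, gb) rows from the query.
sample_rows = [
    ("checkout-api", 410.0),
    ("auth", 35.0),
    ("search-indexer", 290.0),
    ("billing-worker", 20.0),
    ("gateway", 180.0),
    ("notifications", 65.0),
]

for name, cumulative in services_covering(sample_rows):
    print(f"{name:16s} cumulative share {cumulative:.0%}")
```

With these sample numbers, three of the six services cross the 80% line, and those are the ones to investigate first.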
2. Drop debug-level logs at the source
Don't try to filter debug logs in New Relic. Cut them at the agent or shipper. You're paying to transmit them otherwise.
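What "at the source" looks like depends on your stack. As one illustration, assuming a Python service that uses the standard `logging` module, you can leave the logger at DEBUG for local use and set the level on the handler that ships logs. `ListHandler` below is a stand-in we wrote for whatever handler or shipper actually forwards to New Relic.

```python
import logging
from typing import List, Tuple


class ListHandler(logging.Handler):
    """Stand-in for a shipping handler: collects records instead of sending them."""

    def __init__(self) -> None:
        super().__init__()
        self.shipped: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.shipped.append(f"{record.levelname}:{record.getMessage()}")


def build_logger(name: str, ship_level: int = logging.INFO) -> Tuple[logging.Logger, ListHandler]:
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)   # the app can still emit DEBUG to other handlers
    logger.propagate = False
    handler = ListHandler()
    handler.setLevel(ship_level)     # only ship_level and above reaches this handler
    logger.addHandler(handler)
    return logger, handler


logger, handler = build_logger("checkout-api")
logger.debug("cache miss for key %s", "abc")   # never leaves the process
logger.info("order placed")
logger.error("payment gateway timeout")
print(handler.shipped)
```

The same idea applies to log shippers: filter by level in the shipper's own configuration so debug lines never leave the host.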
3. Cap custom event cardinality
If you're sending events with user_id, request_id, or anything else unique-per-request as a dimension, replace those with aggregates. You almost never need per-request facets in dashboards.
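As a rough sketch of "replace those with aggregates": buffer events in the service, drop the per-request attributes, and send one row per remaining combination with a count. The attribute names (`endpoint`, `status`, `duration_ms`) and the `aggregate_events` helper are hypothetical. How you flush the aggregated rows is up to your instrumentation.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List

# Attributes that are unique (or nearly so) per request.
HIGH_CARDINALITY = {"user_id", "request_id"}


def aggregate_events(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collapse per-request events into one row per low-cardinality combination."""
    counts: Counter = Counter()
    total_duration: Counter = Counter()
    for event in events:
        # The grouping key is every attribute except the unique ones and the measure.
        key = tuple(sorted(
            (k, v) for k, v in event.items()
            if k not in HIGH_CARDINALITY and k != "duration_ms"
        ))
        counts[key] += 1
        total_duration[key] += event.get("duration_ms", 0)
    return [
        {**dict(key), "count": n, "total_duration_ms": total_duration[key]}
        for key, n in sorted(counts.items())
    ]


raw_events = [
    {"user_id": "u1", "request_id": "r1", "endpoint": "/cart", "status": "200", "duration_ms": 120},
    {"user_id": "u2", "request_id": "r2", "endpoint": "/cart", "status": "200", "duration_ms": 80},
    {"user_id": "u1", "request_id": "r3", "endpoint": "/cart", "status": "500", "duration_ms": 300},
    {"user_id": "u3", "request_id": "r4", "endpoint": "/pay", "status": "200", "duration_ms": 200},
]

for row in aggregate_events(raw_events):
    print(row)
```

Four per-request events become three aggregated rows here. In a real service, where thousands of requests share the same endpoint and status, the reduction is far larger.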
4. Sample distributed traces
Full-sample your top 3 critical paths. Sample everything else at 1–5%. As long as those critical paths are a small share of total traffic, you'll keep the traces you actually look at for around a tenth of the cost.
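To make the sampling rule concrete, here is a small Python sketch of a deterministic, hash-based keep/drop decision plus the arithmetic for how much you would expect to keep. The path names, the 5% default, and the traffic shares are assumptions for illustration. Where you apply the decision (agent config, an OpenTelemetry sampler, a collector) depends on your setup, and this is not New Relic's own sampling algorithm.

```python
import hashlib
from typing import Dict

# Hypothetical critical paths; everything else falls back to DEFAULT_RATE.
FULL_SAMPLE_PATHS = {"checkout", "login", "payment"}
DEFAULT_RATE = 0.05


def sample_rate(path: str) -> float:
    return 1.0 if path in FULL_SAMPLE_PATHS else DEFAULT_RATE


def keep_trace(trace_id: str, path: str) -> bool:
    """Same trace_id and path always give the same answer, so every service
    that sees the trace agrees on whether to keep it."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64   # value in [0, 1)
    return bucket < sample_rate(path)


def expected_kept_fraction(traffic_share: Dict[str, float]) -> float:
    """Expected share of all traces kept, given each path's share of traffic."""
    return sum(share * sample_rate(path) for path, share in traffic_share.items())


# Made-up traffic mix: critical paths are 5% of traces, background jobs the rest.
mix = {"checkout": 0.02, "login": 0.02, "payment": 0.01, "background-jobs": 0.95}
print(f"expected kept fraction: {expected_kept_fraction(mix):.4f}")
```

With this mix you keep just under a tenth of all traces. If your critical paths carry a much larger share of traffic, the savings shrink accordingly, so it is worth running the numbers on your own mix before committing to a rate.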
5. Set retention by data type
Logs: 7–14 days is enough for most teams. Metrics: 30–90 days. Events: 7 days for high-volume, 30 days for low-volume business events.
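If you want to audit your current settings against these ranges, a lookup table is enough. The sketch below encodes the numbers above and flags anything outside them. The data-type labels and the `review_retention` helper are our own invention; you would fill in `current` from whatever your account actually has configured.

```python
from typing import Dict, Tuple

# (min_days, max_days) taken from the guidance above; labels are hypothetical.
RETENTION_GUIDE: Dict[str, Tuple[int, int]] = {
    "logs": (7, 14),
    "metrics": (30, 90),
    "events_high_volume": (7, 7),
    "events_low_volume": (30, 30),
}


def review_retention(current_days: Dict[str, int]) -> Dict[str, str]:
    """Flag data types whose configured retention falls outside the guide range."""
    findings: Dict[str, str] = {}
    for data_type, days in current_days.items():
        if data_type not in RETENTION_GUIDE:
            findings[data_type] = "no guidance"
            continue
        low, high = RETENTION_GUIDE[data_type]
        if days > high:
            findings[data_type] = f"over: {days}d vs {high}d"
        elif days < low:
            findings[data_type] = f"under: {days}d vs {low}d"
        else:
            findings[data_type] = "ok"
    return findings


# Example input with made-up values.
current = {"logs": 30, "metrics": 60, "events_high_volume": 7, "traces": 8}
for data_type, finding in review_retention(current).items():
    print(f"{data_type:20s} {finding}")
```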
What not to do
A few things that have bitten us:
- Don't blanket-cut retention. Audit by data type. Aggregated metrics are cheap to keep; per-event detail is not.
- Don't drop production error logs. Whatever you save on the bill, you'll lose tenfold the first time you can't debug a production incident.
- Don't roll out changes without a baseline. Save your dashboards as JSON before you cut, so you can verify nothing critical broke.
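A saved dashboard is also a cheap way to check a planned cut before you make it. The Python sketch below walks a dashboard JSON export, pulls out every string stored under a `query` key, and lists the queries that mention an attribute you plan to drop. The nested layout in the sample (`pages`, `widgets`, `rawConfiguration`, `nrqlQueries`) is our assumption about what an export looks like, so the walker deliberately does not depend on it. The substring match is crude and will produce false positives.

```python
import json
from typing import Any, Iterator, List, Set


def iter_queries(node: Any) -> Iterator[str]:
    """Yield every string stored under a "query" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "query" and isinstance(value, str):
                yield value
            else:
                yield from iter_queries(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_queries(item)


def queries_at_risk(dashboard_json: str, attributes_to_drop: Set[str]) -> List[str]:
    """Return saved queries that mention an attribute you plan to drop."""
    dashboard = json.loads(dashboard_json)
    return [
        query for query in iter_queries(dashboard)
        if any(attr in query for attr in attributes_to_drop)
    ]


# Made-up baseline export for illustration.
baseline = json.dumps({
    "name": "Checkout overview",
    "pages": [{"widgets": [
        {"title": "Errors by endpoint",
         "rawConfiguration": {"nrqlQueries": [
             {"query": "SELECT count(*) FROM Transaction FACET endpoint"}]}},
        {"title": "Slowest users",
         "rawConfiguration": {"nrqlQueries": [
             {"query": "SELECT max(duration) FROM Transaction FACET user_id"}]}},
    ]}],
})

for query in queries_at_risk(baseline, {"user_id", "request_id"}):
    print("would break:", query)
```

Anything this flags is a widget to rewrite, or a reason to keep the attribute, before the change ships.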
We've taken bills from $14k/month down to $4k/month doing exactly this on a customer's fleet, with zero loss of operational visibility. The work isn't glamorous, but it pays back fast.
This piece was written by the Adhish team. We build small, sharp products that solve real problems.