The Sales Execution Gap

User Engagement Metrics: The Ones That Lie and the Ones That Don't

Most user engagement metrics count motion, not progress. A rep forced to log in daily scores high and does nothing. Here is which metrics survive contact with Goodhart's Law.

Picture two reps and one dashboard. The first logs in every morning, leaves the tool open all day, and racks up an enviable session count. The second opens the tool for four minutes, updates the one field that moves the deal, and closes it. On every user engagement metric the dashboard tracks, the first rep is winning. On the only thing the company is paying for, the second one is. The dashboard is not lying, exactly. It is answering a question nobody should have asked.

User engagement metrics are the measures of how much people use a product, and they split into surface metrics that count motion (logins, sessions, clicks) and outcome metrics that count progress (did the key behavior happen, did the work move forward), with only the second resisting gaming. Keep that split in front of you, because almost every misleading dashboard is built entirely from the first kind.

The distinction is not new, and the person who sharpened it for software was Eric Ries. In The Lean Startup he drew the line between vanity metrics, the numbers that go up and to the right and make everyone feel good, and actionable metrics, the ones that tell you whether the thing you care about is actually happening (Eric Ries, The Lean Startup). His test was simple and brutal: if a metric cannot change a decision, it is vanity, no matter how impressive the chart. Total registered users only ever rises, so it flatters and informs nothing. Engagement dashboards are full of these. They are popular precisely because they are reassuring, and reassurance is the opposite of the job a metric is supposed to do.

Which user engagement metrics count motion, and which count progress?

The two families look similar on a slide and mean opposite things in practice.

  • Surface metrics. Logins, sessions, time-in-app, daily and monthly active users, clicks, page views. Easy to collect, easy to chart, and easy to fake. They tell you a user was present, not that anything happened.
  • Outcome metrics. Did the behavior the product exists to enable happen, and did the work move forward because of it. Harder to instrument and far harder to game, because gaming them means doing the work.

[Figure: Two user engagement metrics, two stories. Surface metrics count motion (logins, sessions, time-in-app, clicks, page views, DAU/MAU) and are easy to game. Outcome metrics count progress (did the key behavior happen, did the work move forward) and are harder to fake.]
The same login count can mean opposite things. Surface metrics record presence; outcome metrics record whether the work moved.

The same split runs through adoption metrics generally: the easy ones count presence and the hard ones count whether the work got done. The trap is that the surface metrics are the easy ones, so they are the ones that get tracked, charted, and reported up. A team optimizing for product engagement on those numbers can post a beautiful curve while the behavior that pays the bills sits flat.

Why do login and session metrics mislead?

Because motion is not progress, and the dashboard cannot tell the difference. A rep required to keep the tool open all day looks maximally engaged and may do nothing of value. A rep who uses it briefly but correctly looks lightly engaged and does the real work. The same login count, two opposite realities. This is the odometer problem: it counts the miles without knowing whether anyone arrived, and a car circling the block can run up the odometer all day.
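The two-rep picture can be sketched in a few lines of Python. This is a minimal illustration, not a real instrumentation schema: the event names (`login`, `session_minutes`, `qualifying_step_done`), the rep labels, and the numbers are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical one-day event log: (rep, event_kind, value).
# Event names and numbers are illustrative assumptions, not a real schema.
EVENTS = [
    ("rep_a", "login", 1),
    ("rep_a", "session_minutes", 480),   # tool left open all day
    ("rep_b", "login", 1),
    ("rep_b", "session_minutes", 4),     # four focused minutes
    ("rep_b", "qualifying_step_done", 1),
]

def summarize(events):
    """Total each event kind per rep."""
    totals = defaultdict(lambda: defaultdict(int))
    for rep, kind, value in events:
        totals[rep][kind] += value
    return totals

totals = summarize(EVENTS)

# Rank by a surface metric, then by the outcome metric.
top_by_minutes = max(totals, key=lambda rep: totals[rep]["session_minutes"])
top_by_outcome = max(totals, key=lambda rep: totals[rep]["qualifying_step_done"])

for rep, row in totals.items():
    print(rep, dict(row))
print("most session minutes:", top_by_minutes)
print("most qualifying steps done:", top_by_outcome)
```

Both reps show the same login count, the session-minutes ranking favors the rep who left the tool open, and only the outcome count picks out the rep who moved the deal.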

How does Goodhart’s Law kill an engagement metric?

The moment you turn a surface metric into a target, it stops measuring what you wanted. The economist Charles Goodhart’s observation, in the popular phrasing now treated as a law, is that when a measure becomes a target, it ceases to be a good measure (Goodhart’s Law). Make daily logins the goal and you will get daily logins: people log in, do nothing, and log out, and the metric now records empty logins. You optimized the proxy and lost the thing the proxy stood for.

The cleanest picture of this is an often-told story, the cobra effect. As the story goes, a colonial government in Delhi wanted fewer venomous cobras in the city, so it paid a bounty for every dead cobra turned in. The metric, dead cobras collected, climbed beautifully. The goal, fewer cobras, did not, because enterprising residents started breeding cobras to kill for the bounty. When the government noticed and scrapped the program, the breeders released their now-worthless snakes, and the city ended up with more cobras than when it began. The number went up the whole time. The thing the number was supposed to track went the other way.

[Figure: The cobra effect as a picture of Goodhart's Law. A bounty per dead cobra leads people to breed cobras; when the program ends, the breeders release their snakes and the cobra population rises. Below it, a surface metric as target (daily logins) gets gamed into empty logins, while an outcome metric as target (the deal-qualifying step done right) cannot be gamed because doing the step is the goal.]
Reward the proxy and you get the proxy: the bounty grew the cobra population, and a login target grows empty logins. Outcome metrics survive because the proxy and the goal are the same thing.

This is not a quirk of bounties or of British India. Donald Campbell, a social scientist, stated the same law for human systems in the same decade as Goodhart: the more any quantitative indicator is used for decision-making, the more it will be gamed and the more it will distort the process it was meant to monitor (Campbell’s Law). Two independent thinkers, an economist and a social scientist, arrived at the same conclusion from different directions, which is a strong sign the finding is real. A login target on a sales tool is a small cobra bounty. The reps are not villains for clicking through it; the system rewarded presence, so it got presence. Blaming them misreads the problem, which is the metric, not the people.

[Figure: Goodhart's Law in a user engagement dashboard. Pick a metric (daily logins), make it the target (everyone logs in daily), and people game it (log in, do nothing, log out). The metric now measures empty logins, not engagement, and the signal is dead.]
When a measure becomes a target, it stops measuring. Outcome metrics resist this because you cannot fake the work moving forward.

Outcome metrics survive Goodhart because gaming them requires doing the work. If your metric is “the deal-qualifying step was completed correctly,” the only way to score is to complete the step correctly, which is the behavior you wanted. The measure and the goal are the same thing, so there is no proxy to game.
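A toy simulation makes the mechanism concrete. The sketch below is an assumption-laden model, not data: it supposes a rep logs in on each day they actually run the qualifying step and, once a login target exists, pads the count with empty logins until the target is met. The function name `simulate_month`, the rep labels, and the day counts are hypothetical.

```python
def simulate_month(real_work_days, working_days=20, login_target=None):
    """Toy model (an assumption, not data): a rep logs in on every day they
    run the qualifying step, and pads with empty logins to reach any target."""
    productive = min(real_work_days, working_days)
    empty = 0
    if login_target is not None:
        empty = max(0, min(login_target, working_days) - productive)
    return {
        "logins": productive + empty,
        "empty_logins": empty,
        "steps_done": productive,
    }

# Hypothetical days of real work per month for two reps.
REPS = {"rep_a": 3, "rep_b": 15}

before = {name: simulate_month(days) for name, days in REPS.items()}
after = {name: simulate_month(days, login_target=20) for name, days in REPS.items()}

for name in REPS:
    print(name, "before:", before[name], "after:", after[name])
```

Before the target, logins track the work. After it, both reps report the same login count and the number no longer distinguishes them, while `steps_done` does not move, because in this model the only way to raise it is to do the step.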

This is the same reason activity tracking is good but never sufficient on its own: capturing what a user did is necessary and honest, and it must be paired with whether the work moved forward, a pairing laid out in pipeline hygiene and user adoption. The error is not tracking activity. The error is mistaking the activity for the outcome and stopping there.

How do you pick an outcome metric that does not lie?

Find the single behavior the product exists to produce, and measure whether it happened. Sean Ellis, who coined the term growth hacking, built the now-standard North Star Metric practice on exactly this: pick the one number that captures the core value the customer gets, and let the vanity metrics orbit it as diagnostics, never as goals (Sean Ellis on the North Star Metric). For a messaging app the North Star is messages sent, not signups. For a sales tool it is the deciding step run correctly on a real deal, not minutes in the app. The test for a good outcome metric is the gaming test: imagine someone trying hard to make the number go up while delivering no value, and if they cannot, the metric is sound.

A few rules keep an engagement dashboard honest:

  • One outcome metric on top. Lead with the single behavior the product exists to produce, and judge everything against it. A dashboard with twelve coequal numbers has no North Star and will drift toward the easiest one.
  • Surface metrics as diagnostics only. Logins and sessions earn a place as leading sanity checks, the canary that tells you something upstream broke, but never as a goal anyone is rewarded for hitting.
  • The gaming test on every target. Before you set a target on any metric, ask whether someone could hit it while delivering nothing. If yes, it is a surface metric in disguise, and the day it becomes a target is the day it stops telling the truth.
  • Pair activity with progress. Track what the user did, because that is honest and necessary, and pair it with whether the work moved, because activity alone advances nothing.
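The rules above can be expressed as a small check on a dashboard definition. This is one possible sketch under stated assumptions: the `Metric` fields, the `check_dashboard` function, and the metric names are hypothetical, and the `gameable` flag is simply a human's recorded answer to the gaming test, not something the code can work out for itself.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Metric:
    name: str
    gameable: bool                  # gaming test: can it rise while delivering nothing?
    target: Optional[float] = None  # None means diagnostic only, no target
    north_star: bool = False

def check_dashboard(metrics):
    """Return a list of rule violations; an empty list means the dashboard passes."""
    problems = []
    stars = [m for m in metrics if m.north_star]
    if len(stars) != 1:
        problems.append(f"expected exactly one north-star metric, found {len(stars)}")
    for m in stars:
        if m.gameable:
            problems.append(f"north star '{m.name}' fails the gaming test")
    for m in metrics:
        if m.gameable and m.target is not None:
            problems.append(f"'{m.name}' is gameable but has a target")
    return problems

# Hypothetical dashboards for illustration.
good = [
    Metric("qualifying_step_done_correctly", gameable=False, target=10, north_star=True),
    Metric("daily_logins", gameable=True),   # diagnostic only
    Metric("sessions", gameable=True),       # diagnostic only
]
bad = [
    Metric("daily_logins", gameable=True, target=1, north_star=True),
    Metric("sessions", gameable=True),
]

print(check_dashboard(good))
for problem in check_dashboard(bad):
    print(problem)
```

The check does not decide which metrics are gameable; it only keeps the dashboard consistent with the answers someone gave, so the judgment in the gaming test still has to be made by a person.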

What we recommend

Build the dashboard outcome-first. Lead with the metrics that count progress, the key behavior happening and the work moving forward, and report the surface metrics (logins, sessions) only as leading sanity checks, never as goals. Never set a target on a surface metric you cannot defend against Goodhart, because the day it becomes a target is the day it stops telling you the truth. The point of measuring engagement was always to know whether people are doing the work the tool exists for. Measure that directly, and the vanity numbers become what they should have been all along: a footnote.

From here: the system view in user adoption, the activity-plus-outcome balance in pipeline hygiene, and why training does not move the number in technology adoption.

Frequently asked questions

What are user engagement metrics?
User engagement metrics are the measures of how much and how deeply people use a product. They divide into surface metrics that count motion (logins, sessions, time-in-app, daily and monthly active users, clicks) and outcome metrics that count progress (whether the key behavior happened, whether the work moved forward). Both are useful, but only the outcome metrics survive contact with incentives.
Which user engagement metrics actually matter?
The outcome ones: did the behavior the product exists to enable actually happen, and did the work move forward as a result. Surface metrics like logins and session counts are easy to instrument and easy to game, so a user pressured to use a tool can score high while doing nothing valuable. Outcome metrics are harder to fake because you cannot fake the work moving forward.
Why are login and session metrics misleading?
Because motion is not progress. A rep required to log in daily looks fully engaged on the dashboard and may accomplish nothing in the tool, while a rep who uses it briefly but correctly looks less engaged and does the real work. The same login count can mean opposite things, so a metric that counts logins measures presence, not value.
What is Goodhart's Law and how does it affect engagement metrics?
Goodhart's Law holds that when a measure becomes a target, it stops being a good measure. Make daily logins the goal and people log in, do nothing, and log out, so the metric now records gaming rather than engagement. Outcome metrics resist this because gaming them would require actually doing the work, which is the behavior you wanted in the first place.
