Measures, Metrics, and Signals

#guide

One of the biggest requests I hear from clients is for more data and metrics. I get it, but the underlying reason is pretty telling, and in my opinion, illuminating. The reason for the request is that they need to know what is going on and what decisions to make.

This is an overall guide to how I approach this topic, along with some specific techniques.

Principles

Less is More

It is addictive to want more and more data, but it rarely helps and, in fact, overwhelms most people. In my experience, there are just a few concrete cases where you need data, and knowing that helps eliminate the feeling that you need more and more of it.

Make a specific decision or intervention
Monitor health and trends
Investigation

I'll get into these further on, but these three categories nicely limit what you actually need. If you can describe the core decisions or interventions you need to make, you can develop signals. If you know what matters in terms of health, you know what metrics need to exist. I should note that "health" can be operational or a goal, but either way, you don't need much. As for investigation, this is the broad category when something anomalous comes up. You may need to act like an archaeologist to understand something, but that should be rare and prompt you to examine how this event factors into gaps in the other categories.

Balance

Balance is one of the more challenging topics as it requires some careful thought and planning. I've written about that in an article called How I Set Goals. The main idea is that you need to balance your measures across two dimensions. Leading and lagging as well as tradeoffs. If you use one singular unbalanced metric, you'll experience tunnel vision where the small thing you see looks good even if everything else is falling apart. Balancing your metrics helps avoid this.

Target Your Efforts

It's tempting to try to make each important measure or goal move forward at once, but I'm going to advise against it. Instead, recognize that each metric you have and each goal associated with it is acting as a constraint against the others. They often behave in unintuitive ways.

I recommend making your efforts focus on improving one metric while holding the others steady. This approach will help you better understand the relationships within the work without letting things spiral into chaos.

For example, improving delivery rates without diminishing quality is seldom explicit and almost always the opposite of what happens.

Get Started

What Goals Are Important

Goals are a great place to start, as many leaders are working toward goals set by higher-ups in the organization. While goal setting and framing are a bit of an art, you can cheat a little by answering a few questions like:

Who?
What?
By how much?
When?

Let's take a typical business outcome like revenue. We can apply this in several ways using these three questions. Here are two examples:

The business will have revenue improve by 20% by Q3.
Customers will sign up for subscriptions 15% more by Q2.

Each statement answers those four questions and leads to a concrete goal with clear metrics.

Even if you feel like you cannot possibly impact these, they are your guiding light and must be ever-present with you and your groups. They are what everything else is about and help make a lot of decisions easier when you realize you're working on things that nobody can justify.

Leading Indicators

Ok, we now need leading indicators for the goals which are lagging indicators. Creating leading indicators is an art as well, but these are often what are easier for us to change.

Good-looking leading indicators are pointless if your goals aren't moving.

Leading indicators measure what we are doing right now and tend to focus on operational excellence.

There is an implicit hypothesis you need to make explicit when developing leading indicators. The hypothesis goes like this:

"We believe that a positive leading indicator will lead to a positive lagging indicator."

Don't skip that bit. It is what protects you from getting lost in worrying about leading indicators that have no connection to what is actually important. Ignoring this hypothesis mapping is the biggest trap I see for most leaders dealing with data. Don't fall for it.

You can be the slowest team in the company and still be the most successful if you have the greatest impact.

Popular examples here are:

Velocity
Cycle time
Defect rate
Task completion
Code review time
PRs completed

Balancing

There is some balance between leading indicators and the goals, but now we're going to add more balance. Aside from the trap with leading indicators, it is very easy to simply focus on what is obvious and assume the rest will be fine.

It won't be.

For every measure you've identified, ask yourself, "What might we lose or sacrifice if we pursue this?" That will tell you what you need to add to balance with.

The single biggest fraud in software is the idea that speed costs quality. Balance measures of speed with measures of quality.

The DORA metrics are a great example of balance worth looking at, and developing your own sense of how they balance so you can do this yourself.

In DORA, there are four metrics: Lead time, Deployment frequency, Change Failure Rate, and mean time to recovery. Lead time and change failure rate balance each other, and deployment frequency and MTTR balance each other.

If you do not add these balancing measures, you'll experience degradations in your operations, teams, and other key areas that will eat up your ability to be an effective leader. You must balance your metrics.

Done

If you've made it this far, you have a few things written down. You have your business goals written, leading indicators under your control that you hope will get you there, and balancing measures to make sure things don't go sideways. That might look like this:

Goals

Improve revenue by 20% by Q3
Reduce customer complaints by 10% by Q4

Leading Metrics

Cycle time
Commit frequency
High/Critical security issues in static analysis
Defect rate
Hotfix rate
How long has the build been red

I put in a collection of metrics so you can get a sense of roughly balanced leading indicators. The idea is that with this alone, you can now always report on how you are performing and whether you're making the desired impact, without too many surprises. The scope and work you do exist to move towards your goals, and the leading indicators show you how well you're operating to that end.

Signals

Signals are a powerful tool for leaders that exist in this bucket of measures, but in a unique way. What we produced above is what we need to monitor health and trends, but we don't know yet what decisions to make or where to intervene. There's a loose sense of what we developed, but as I pointed out above, there is only a loose connection between the leading metrics and overall goals, so a change in them may not warrant your intervention.

Signals are indicators you develop to help you with that. They're not always as formal as the metrics above, and can be personal to you or operational with your groups.

The most well-known example of a signal is a "Red" or broken signal. That signal serves as a prompt for someone to investigate and fix something that's broken.

Develop Key Decisions

The hardest part of this is listing the decisions you need to make for this project and your teams. You don't have to be perfect or complete, but you do need to develop this list. Knowing which decisions you need to make or which interventions are required is the key ingredient in developing signals.

Here are some examples:

Speaking with someone who might quit soon.
Intervening because conflict has escalated too far.
Address stalemate in a technical/product decision
Manage an outage
Provide growth opportunities for my people
Manage a dependency before it's a problem
Prepare other groups for what we'll be doing
Equipping my boss in their efforts
Adding more capacity
Removing problematic individuals
Helping a team that is depressed

You can tell this list can grow infinitely, and the art is to focus on the most important few and grow it over time, only if you realize you have a blind spot. You know you have a blind spot if you got surprised by one of these things, or it went poorly last time.

How Will I Know

Time to turn these into a signal. Pair each signal to an intervention or decision from above. So you'll want to keep that pairing together so this isn't confusing in a few months.

For each decision you've started with, ask, "How will I know?" or "How might I detect the need for this?"

There is a tendency among many folks to get uneasy with this question because they believe they need a perfect combination of data. You don't. All you need is enough to make you learn more.

Don't fall into the trap of believing that without perfect data, you cannot move forward. Here are some examples I use quite often in 1:1s.

They wonder if they'd be a better fit in a different role which signals that they may be dissatisfied with their job, and I need to talk with them.

If 40% of folks in a 1:1 bring up the same basic topic, I need to investigate it now.

Is this scientific, proven, or certain? No, it isn't. It is just enough of a signal for me to act, and that is the entire point.

Developing signals for the work around you helps you build not only your internal compass for what to watch for and ask about, but also the level of instrumentation in your operations to learn about.

Most leaders I've met are shocked by an event their team dealt with that they were entirely unaware of. Those leaders had no signals and chose to hope that people would tell them when something important happened. Don't be that leader.

Done

I didn't give many examples here because signals are something that really comes with practice and reflection. Start with just a few, focusing on things that have caught you off guard in the past or things that worry you. Be honest about whether they're helpful or not and iterate.

Do take the time to write these down and keep them visible near you. These things are easy to forget in the chaos of everyday life, but if you keep them close, you'll notice how sharp your observations become about what is going on and how much useful information was slipping by before.

Visualize the Data

I want to mention a few bits about how to actually track data and make it useful, since this is another area folks tend to get wobbly.

Relative over Absolutes

There is a tendency to make metrics into absolutes, like 'reach X revenue' or 'get cycle time to Y days'. These are targets. Generally speaking, you can do this, but I am going to tell you to avoid it for one reason: you don't know if it's good or not.

Part of using metrics effectively is monitoring their change over time, and a target hides this. Absolute targets hide progress in the right direction since they are either complete or incomplete, and setting an inappropriate absolute target can cause stress and dysfunction.

To that end, use relative metrics rather than absolutes. One of the easiest ways is to describe a percentage change.

Trends over Moments

Adding trends adds some complexity to your tracking, but it makes a difference. You want to know two things about your metrics: where it is and whether it is moving in the right or wrong direction.

Trends help you assess the efforts relative to the impact on what you care about.

Your investment portfolio works the same way. It shows you what you have as a snapshot and shows you its trend over different time periods.

So let us say you wanted to reduce cycle time by 10%. You'd want to see that as your goal, then the actual cycle time, and an indicator of whether it's better or worse than the previous one or the rolling average.

10 Data Points

Folks get all bent out of shape over the idea that you need lots of data for it to be significant. This is way less true than people realize. First, having just one point of data is better than none at all, so if your concern about the data not being significant enough is preventing you from having any, shame on you.

Second, if you look into the subject of sampling, you'll learn that you need surprisingly small amounts of data to have what most people would consider an acceptable margin of error.

Third, these metrics and signals aren't meant to be proof—they're meant to act as indicators. A thermometer can't tell you you're healthy, but it can indicate you're sick. So don't get confused about the purpose of these metrics and signals with the idea of proof and correctness.

The tl;dr is: you can feel good about just 10 points of data, and if you're starting with less than that, that's ok too.

Some Measures I Use Regularly

I thought I'd share metrics I tend to start with, since they are often very applicable in most organizations.

Flow Metrics

Cycle Time

How long a part of a process takes. Generally, this metric applies to development teams from when they "Start" work until they are "Done." You have to be very specific about what it means to start work and when it is done.

People get upset about things like blocked work inflating the numbers, but that work is sitting, not getting done, so it counts. The goal is to remove the blocker.

Throughput

The rate at which work completes. Many tools measure this for you. This metric is more important than cycle time, but very few companies understand it. Little's Law helps explain this relationship.

Wait Time

How long items sit waiting to move forward is one of the biggest sources of waste. Wait time is not a metric that is useful to many people, but very illuminating when you're trying to figure out how to improve process efficiency.

Quality

Defect Rate

One of my go-to measures of quality is simply the work done compared to the bugs/defects created. It yields a percentage that I've observed between 60% and 120%. This percentage is how teams create problems as they complete work.

Note that creating problems and fixing them are two different things. Though the message still matters, we're working with a leaky bucket.

Defect Efficiency

The rate of creating bugs compared to closing them. Plenty of teams dutifully track bugs, but they don't always get fixed. Conceptually, this is fine, but in practice, most groups justify not fixing bugs because they feel it takes too long and isn't worth it.

Defect efficiency indicates the group's and the company's tolerance for shipping problems to customers.

Change Failure Rate

When I calculate defect rate, I do it in absolute terms. If you want to get more specific about something, like an issue in production, this is the more focused version. You measure the completed work that went into production compared to the issues uncovered in production.

This nuance isn't useful for many teams, but it again points out how leaky the processes are and how tolerant we are to shipping problems to customers. Be careful with QA on this one, as they are often very sensitive to the fact that they never have time or resources to test anything, and this feels like an attack on them.

MTTR

Mean Time To Recovery is another cousin of defect efficiency that measures how long it takes to fix an issue in production after it is identified. MTTR is a rolling average. Much like the cycle time issue, where blocked items sit there running the clock, small bugs that surface also run the clock here.

Again, it surfaces how tolerant we are of our customers living with issues and our own operational capacity to safely resolve them.

Product

Pirate Metrics

Acquisition, Activation, Retention, Revenue, and Referral, or AARRR. Acquisition is a user arriving at our service, activation is their entry into our service, retention is their return to the service, revenue is revenue, and referral is organic growth.

Each of these will have a specific meaning for your product, and how you instrument this will be unique depending on the goals and product maturity.

Churn Rate

Where customers leave, now this can also mean in a specific workflow within the app, but in engineering groups, I like to create a churn rate tied to customer loss related to quality. Customer support or customer success teams will know this.

Lead Time

Lead time is the classic, "Ask to get." I keep an eye on this but rarely bring it up. The reason it isn't relevant is that most companies are comfortable with their planning processes, committees, and budget cycles, so pointing out that it takes 1.5 years to get something isn't a conversation most are willing to have productively.

Still, this is useful to keep because there will be pressure to "deliver faster," and the bottleneck isn't likely where anyone thinks it is.

Leadership

1:1 Sentiment

I track sentiments across 1:1s for individuals and across groups. My measures will be simple, like X/Y or X% of folks expressing a frustration or sentiment that prompts me to investigate.

KPI/North Star

These things almost always exist somewhere but often get lost in the frenzy to track and report on "progress" and "activity." I like to put these back front and center in every report and communication so they're always part of every conversation.

When this doesn't exist, I create one, test whether it resonates with senior leaders, and use that as an excuse to develop a more appropriate one or to find the real one that has been left out.

Miscellaneous

Process Failure Rate

Everything has a process attached, and it is often shocking to folks how rarely that process is successful.

Process failure rate is the ratio of items that make it through to those that the team sends backwards at some point along the way.

It is important to note that many processes explicitly do this as they have numerous quality gates built in for this purpose. So this rate isn't always indicative of bad things happening, but rather how much waste there is.

Interruptions

Another shocking metric for many groups is how often unplanned work gets accomplished compared to planned work. This concept is also called side-loading by some folks. Side-loading occurs when someone taps someone on the shoulder to make a personal request. Also, this could be an emergency issue.

Either way, tracking the rate at which this happens is often surprising to folks. While fixing outages isn't a bad thing, having them is, and this shows the impact of poor upstream quality on downstream productivity.

It also begins to point out how hard it is for a team to reliably or predictably accomplish work when so much of it is a surprise. Be warned that almost everyone wants to put an end to this, even if they are the person side-loading work into the team.

Deployment Frequency

Part of DORA. Deployment is complicated at most companies, and deploying quickly and safely requires operational excellence. It also allows for better risk recovery.