[Note: Lately, I’ve been talking a lot about fitness for purpose and fitness criteria. Other than David Anderson and a few others, though, not much material exists — at least not applied in the software-delivery space — to point people to for further reading. So I’m jotting down some ideas here in the hopes of furthering the discussion and understanding.]
- The first step in improving is understanding what makes the service you provide fit for its purpose.
- Fitness is always defined externally, typically by the customer
- Fitness for purpose has two components: a product component and a service-delivery component
- Fitness criteria are metrics that enable us to evaluate whether our service delivery and/or product is fit for purpose
- Of the two major categories of metrics, fitness criteria are primary, whereas health or improvement metrics are derivative
- Examples of service delivery fitness criteria are delivery time, throughput and predictability
Fitness for purpose is an evaluation of how well a product or service fulfills a customer’s desires based on the organization’s goals or reason for existence. In short, it is the ability of an organization or team to fulfill its mission. The notion derives from the manufacturing industry, which purportedly assesses a product against its stated purpose. That purpose may be determined by the manufacturer or, according to marketing departments, by the needs of customers. David Anderson emphasizes that
Fitness is always defined externally. It is customers and other stakeholders such as governments or regulatory authorities that define what fitness means.
Fitness criteria then are metrics that enable us to evaluate whether our product, service or service delivery is “fit for purpose” in the eyes of a customer from a given market segment. As Anderson notes, fitness criteria metrics are effectively the Key Performance Indicators (KPIs) for each market segment, and as such are direct metrics.
As Anderson explains,
Every business or every unit of a business should know and understand its purpose … What exactly are they in business to do? And it isn’t simply to make money. If they simply wanted to make money they’d be investors and not business owners. They would spend their time managing investment portfolios and not leading a small tribe of believers who want to make something or serve someone. So why does the firm or business unit exist? If we know that we can start to explore what represents “fitness for purpose.”
For me, fitness is something that, like user stories, can be understood at varying levels of granularity. Organizations have fitness for their purpose (“are we fit to pursue this line of business?”) and teams (in particular, small software-delivery teams) also have fitness for their purpose (“are we fit to deliver this work in the way the customer expects?”).
Therefore, the first step in improving is understanding what makes the service you provide fit for its purpose. Fitness for purpose is simply an evaluation of how well an organization or team delivers what it is in the business of (its purpose). Modern knowledge-worker organizations like Asynchrony often focus on concerns like product development or technical practices, sometimes overlooking service-delivery excellence. But service delivery is a major reason why our customers choose us. That’s why we attempt to understand and define each project team’s purpose and fitness for that purpose at the project kickoff in a conversation with our customer representatives.
Two Components of Fitness
Fitness for purpose has two components: a product component and a service-delivery component. That is, the customer for your delivery team considers the product that you are building (the what) — did you build the right thing? — as well as the way in which you deliver it (the how) — how reliable were you when you said you’d deliver it? How long did it take you to deliver it? We have useful feedback mechanisms for learning about the fitness of the products we build (e.g., demos/showcases, usage analytics), but how do we learn about the fitness of our service delivery? That’s the service-delivery review feedback loop, which I will write about later.
Fitness criteria are metrics that enable us to evaluate whether our service delivery is “fit for purpose” in the eyes of a customer from a given market segment. These are usually related to but not limited to delivery time (end-to-end duration), predictability and, for certain domains, safety or regulatory concerns. When we explore and establish expectation levels for each criterion, we discover fitness-criteria thresholds. They represent the “good enough” or the point where performance is satisfactory. For example, our customer may expect us to deliver user stories within some reasonable time frame, so we could say that for user stories, our delivery-time expectation is that 85% of the time we complete them within 10 days. We might have a different expectation for urgent changes, like production bug fixes.
Although fitness-criteria categories are often common (nearly everyone cares about delivery time and predictability, for instance), the actual thresholds for them are not. While some are shared by many customers, the differences in what people want and expect allow us to define market segments and understand different business risks. Fitness criteria should be our Key Performance Indicators (KPIs), and teams should use those thresholds to drive improvements and evolutionary change.
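To make the delivery-time expectation concrete, here is a minimal sketch of how a team might check an "85% within 10 days" threshold against its own history. The function names and the sample delivery times are hypothetical, invented purely for illustration, and the nearest-rank percentile is just one reasonable way to compute the figure.

```python
def percentile_nearest_rank(values, pct):
    """Return the smallest value such that at least pct% of values are <= it."""
    if not values:
        raise ValueError("need at least one delivery time")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max((pct * len(ordered) + 99) // 100, 1)
    return ordered[rank - 1]


def meets_expectation(delivery_days, threshold_days=10, target_pct=85):
    """True when at least target_pct% of items finished within threshold_days."""
    within = sum(1 for d in delivery_days if d <= threshold_days)
    return within * 100 >= target_pct * len(delivery_days)


# Hypothetical delivery times (in days) for 20 completed user stories.
sample_delivery_days = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7,
                        7, 8, 8, 9, 9, 10, 10, 12, 14, 21]

print(percentile_nearest_rank(sample_delivery_days, 85))  # 10
print(meets_expectation(sample_delivery_days))            # True
```

With this made-up sample, 17 of 20 stories finished within 10 days, so the team sits exactly on the threshold. One more slow story would tip it below, which is the kind of signal worth raising with the customer.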
Who Defines Fitness?
As opposed to team-health metrics, like happiness or pair switches, fitness and fitness criteria are always defined externally: Customers and other stakeholders define what fitness means. That means you cannot ask the delivery team to define its fitness. They cannot know because they are not the ones buying their service or product. We should be asking customers “What would make you choose this service? What would make you come back again? What would encourage you to recommend it to others?”
These are a team’s fitness criteria and these are the criteria by which Asynchrony should be measuring the effectiveness of our teams’ service delivery. Then we’ll be improving toward the goal, the greater fitness for our purpose, both as an organization and as individual delivery teams. By integrating fitness-for-purpose thinking into everything we do, we will create an evolutionary capability that will help us sense changes in market needs and wants and what those different market segments value. As a result, Asynchrony will continue to thrive and survive in the midst of our growth and growing market complexity.
Difference Between Fitness Metrics and Health Metrics
| Fitness Metric | Health Metric |
|---|---|
| Metric that enables us to evaluate whether our product, service or service delivery is “fit for purpose” in the eyes of a customer from a given market segment. Effectively comprise the Key Performance Indicators (KPIs) for each market segment. | Metric that guides an improvement initiative or indicates the general health of your business, business or product unit or service delivery capability. |
| Examples: delivery time, functional quality, predictability, net fitness score | Examples: flow efficiency, velocity, percent complete and accurate, WIP |
| Customer-oriented and derived | Team-oriented and derived |
A Food Example
I like to use food for examples (also to eat). Is a restaurant in the product or service-delivery business? That’s a trick question, of course: The answer is “both.” As a customer, you care about the meal (product) but also about the way you have it provided (service delivery). And those always vary depending on what you want: If you want cheap and fast, like a burger and fries at McDonald’s, you may have a lower expectation for the product (sorry, Ronald) but a higher one for delivery speed. Conversely, if you’re out for fine dining, you expect the food to be of a higher quality and are willing to tolerate a longer delivery time. However, you have some thresholds of service even for four-star restaurants: For example, if you have a reservation, you expect to be seated within minutes of your arrival. And you expect a server to take your order in a timely way. If you don’t have a reservation, the maitre d’ or hostess will perhaps quote you an expected wait time; if it’s unacceptable, you’ll go elsewhere. If it’s acceptable but they don’t seat you in that time, you are dissatisfied. The service delivery was not fit for its purpose, which is to say the reason why you chose to eat there.
A Software-Delivery Example
The restaurant experience is actually not too dissimilar from software delivery. The customer expects software (product) but also expects it on certain terms or within certain thresholds (service delivery). A team works hard to deliver the right features and demonstrates them at some frequency; at the demo, the team likely will explicitly ask “is this what you wanted?” What’s often missing is the “are these the terms on which you wanted it?” Whether in the demo or a separate meeting, we need to also review service delivery. This is where we look at whether our service meets expectations: Did we deliver enough? Reliably enough? Respond to urgent needs quickly enough? The good news is that we can quantitatively manage the answers to these questions. Using delivery times, we can assess whether the throughput is within a tolerance. One team used a probabilistic forecast and found that their throughput was not likely to help them reach their deadline in time. Conversely, another realized that they were delivering too fast and could stand to reallocate people to other efforts. Also, for instance, when we set up delivery-time expectations (some people call these SLAs), like delivering standard-urgency work at a 10-day, 85% target, we can then make decisions based on data rather than feelings or intuition (which have their place in some decisions but not others). These expectations needn’t be perfect or “right” to begin; set them and begin reviewing them to see if they are satisfactory.
Having an explicit review of fitness criteria, especially for service-delivery fitness, is a vital feedback loop for improving. Rather than having the customer walk away dissatisfied for some unknown reason, we can proactively ask and manage those expectations and improve upon them. Often these are the unstated criteria that ultimately define the relationship and create (or erode) trust; discover them and quantitatively manage them.
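The probabilistic forecast mentioned in the example above can be sketched as a simple Monte Carlo simulation. This is only an illustration under stated assumptions, not the method that team actually used: the function name, the weekly-throughput history and the backlog size are all hypothetical, and the simulation assumes future weeks will resemble past ones.

```python
import random


def forecast_completion_probability(weekly_throughput_history, remaining_items,
                                    weeks_until_deadline, trials=10_000, seed=1):
    """Estimate the chance of finishing remaining_items before the deadline.

    Each trial resamples past weekly throughput (with replacement) once per
    remaining week and checks whether the total covers the remaining work.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    successes = 0
    for _ in range(trials):
        delivered = sum(rng.choice(weekly_throughput_history)
                        for _ in range(weeks_until_deadline))
        if delivered >= remaining_items:
            successes += 1
    return successes / trials


# Hypothetical history: stories finished in each of the last seven weeks.
history = [3, 5, 2, 4, 6, 3, 4]

chance = forecast_completion_probability(history, remaining_items=40,
                                         weeks_until_deadline=8)
print(f"Chance of finishing on time: {chance:.0%}")
```

A low percentage is a prompt for a conversation about scope or dates. A very high one suggests, as in the second team's case, that some capacity could go elsewhere.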
I’ve been play-testing a new simulation game that I developed, which I’m calling the NoEstimates Game. Thanks to my friends and colleagues at Universal Music Group, Asynchrony and the Lean-Kanban community (Kanban Leadership Retreat, FTW!), I’ve gotten it to a state in which I feel comfortable releasing it for others to play and hopefully improve.
The objective is to learn through experimentation what and how much different factors influence delivery time.
[Jan. 3, 2017 update: Game materials are now available on GitHub]
Download these materials in order to play:
If you’d like to modify the original game elements, here they are:
I’m releasing it under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, so please feel free to share and modify it, and if possible, let me know how I can improve it.
March has begun, and that means March Madness, when many Americans turn their attention toward basketball, with unbridled hopes in NCAA tournament brackets and the thrills of underdog upsets. Basketball offers not only the promise of nail-biting college tilts but also some helpful metaphors for software delivery. Read on for an assist for your team!
Practice Your Free Throws
I was never a star basketball player (I made it only as far as the high-school sophomore team), so I was keenly aware of my need to practice in order to build my skills. For instance, I always found that if I were having trouble making field goals (which was not infrequent!), it helped to practice free throws. With no jumping involved and no one defending me, this allowed me to simplify my form and focus on the basics. I reasoned that, if I couldn’t hit a free throw, I had no business trying longer-range shots in complex situations. Even now, I still can’t figure out why players who take most of their shots from behind the three-point line can’t seem to reliably make free throws.
The same is true in software delivery. For instance, before you can realize the goal of continuous delivery, you need to discipline yourself in automated testing and continuous integration. Be able to reliably answer in the affirmative Jez Humble’s three questions:
- Does everyone check into mainline (at least) once per day?
- Do you have a suite of tests to validate your changes?
- When the build breaks, is fixing it the team’s #1 priority?
If your team aspires to continuous delivery, you can’t keep chucking up the same code or try to do it in the midst of delivery commitments and deadlines with a bolted-on “devops team.” You need to slow down in order to speed up — take time to write tests at the proper levels and integrate continuously. If your throughput is lower to begin, so be it. It’ll be higher in the long run.
Planning Flow During the Timeout
If I had scored a point for every standup with the report-to-the-leader anti-pattern that I’ve witnessed, I’d have made varsity. I understand the accountability idea behind Scrum’s three questions, but I have rarely seen it implemented in practice in a healthy way. The standup tends to be rote, individual-oriented, low-energy and low-value, with teams sometimes abandoning them for “real work.”
Contrast this with a timeout in basketball. It’s fast, full of energy and purposeful. Why? The timeout is focused on how the team can work together in the next short period of play. That’s it. Imagine if the coach went around the circle demanding that each player describe what he had been doing:
- “Well, coach, I missed two jumpers but made a free throw.”
- “I’ve been guarding #18. Still plan to guard him after the timeout.”
- “I’ve been running up and down the court. No blockers.”
Anyone who has been watching the game knows these things! Likewise, we were all in the office yesterday; we know you’ve been working. In a timeout, individuals don’t report status; the team proactively solves its main impediments to flow:
- “They’re double-teaming me, coach. That means someone is going to be free — let’s get Christopher or James the ball more.”
- “I can’t keep up with #18 — Chike, can you drop down and help me guard him in the low post?”
In a timeout, conversation is lively and self-organizing; no one waits to be called on. When the timeout ends, the team runs back onto the court knowing the plan. Does your software-delivery team know the plan when standup ends? Treat standup (a.k.a. daily flow planning) more like a basketball timeout, and orient your standups toward the team and flow.
A Whole-Team Approach to the 7-foot Constraint
That brings us to one last metaphor from basketball: System constraints are like a defense that you have to dynamically figure out. The Theory of Constraints tells us that every system has a constraint that governs its output. In basketball, this constraint is sometimes easy to spot, whether it’s the 7-foot dude who is blocking everyone’s shots, or your point guard who keeps turning the ball over. In basketball, both on the playground and on elite NCAA courts, teams adapt to their constraints. It happens so fast in basketball that we don’t even think about it: If the 7-foot dude blocks shots from close range, a coach may deploy a lineup of better perimeter shooters or a player who is quicker and can draw fouls from the big man. Another example is a double-team situation: The team doesn’t expect a double-defended player to try to keep scoring — no, the team comes to help him, since one player is usually free. Basketball players do this almost instinctively, because they share a common goal: Score more points than the opponent.
In knowledge work, constraints are more difficult to see, and a lack of goal-orientation inhibits a whole-team approach to the constraints. For example, if a person is “free,” it’s easy for a dev to pull in new work, heedless of how busy or “double-teamed” the QA is. That’s why we use WIP limits and make our constraints visible with tools like cumulative-flow diagrams. (In basketball, the WIP limit is one: It’s called the ball. When your teammate is double-teamed and you are unguarded in the open, you don’t grab another ball from the sidelines and start playing, do you?) Just as basketball players naturally practice the art of work leveling by constantly taking a whole-team approach to constraints, we in software development can do the same. We merely need the help of simple job aids and a shared goal, which doesn’t mean staying busy as individuals, but means finishing work.
Scrum gave us the Three Questions to help structure discussion at daily standup. These questions provide some idea of micro-goal setting and accountability for each team member and can be a healthy practice:
- What have you completed since the last meeting?
- What do you plan to complete by the next meeting?
- What is getting in your way?
For teams who are increasingly focusing on optimizing flow or teams who have simply fallen into a pattern of rote repetition and are in need of a fresh approach, I offer what you might call “the new three questions,” inspired by Mike Burrows in his book Kanban from the Inside:
- How can we improve flow today?
- What is blocked and why?
- Where are bottlenecks forming?
A colleague observed that those questions sound like a mini retrospective, which is not a bad analogy insofar as they are about improvement, though perhaps not as backward facing; they focus on the present and near-future reality. They’re about making a plan to improve flow, with the scope being merely a day. I like the questions because they orient the team toward the work, rather than the worker. For teams that already follow the practice of making work visible, the new three questions are a natural complement to “walking the wall.” Furthermore, the answers to these questions over time can inform the conversation at operations review and risk review, helping the team analyze their work-in-progress limits and blocker clusters.
Like any practice, without attention to the “why” and context, they can lead to mindless repetition. But if flow is important to you (and it should be), “the new three questions” can help you improve it with a simple twist on an old reliable pattern.
Daniel Vacanti’s new book, Actionable Agile Metrics for Predictability, is a welcome addition to the growing canon of thoughtful, experience-based writing on how to improve service delivery. It joins David Anderson’s (Kanban: Successful Evolutionary Change for Your Technology Business) and Mike Burrows’s (Kanban from the Inside) books in my list of must-reads on the kanban method, complementing those works with deeper insight into how to use metrics to improve flow.
Daniel’s message about orienting metrics to promote predictable delivery and flow — which he defines as “the movement and delivery of customer value through a process” — is primarily grounded in his experience helping Siemens HS. He includes the case study (which has been published previously and is valuable reading in itself) at the end of the book, so he keeps the rest of the book free from too many customer references, though he’s drawing on the pragmatic experience.
As someone who for several years has been helping teams and organizations improve using the metrics Daniel talks about, I learned a tremendous amount. One of the reasons is that Daniel is particularly keen to clarify language, which I appreciate not only as a former English major (and not as a pedant!) but also because it helps us carefully communicate these ideas to teams and management, some of whom may be using these metrics in suboptimal ways or, worse, perverting them so as to give them a bad name and undermine their value. Some examples: the nuanced difference between control charts and scatterplots, and clear definitions of Little’s Law (and violations thereof), especially as related to projections and cumulative flow diagrams. I certainly gained a lot of new ideas, and Daniel’s explanations are so thorough that I suspect even novice coaches, managers, team leaders and team members won’t be overwhelmed.
As for weaknesses, I felt that the chapter on the Monte Carlo method lacked the same kind of depth as the other chapters. And I came away wishing that Daniel had included some diagrams showing projections using percentiles from scatterplot data. But those are minor plaints for a book that constantly had me jotting notes in my “things to try” list.
Overall, I loved how Daniel pulled together (no pun intended), for the purpose of flow, several metrics and tools that have often been independently implemented and used and whose purpose, in my experience, was not completely understood. The book unifies these and helps the reader see the bigger picture of why to use them in a way I had not seen before. If you’re interested in putting concepts and tools like Little’s Law, cumulative flow diagrams, delivery-time scatterplots and pull policies into action, this book is for you.
- The book has a very helpful and clarifying discussion of classes of service, namely the difference between using CoS to commit to work (useful) and using it to prioritize committed work (hazardous for predictability).
- It also had a particularly strong treatment of cumulative flow diagrams.
- Daniel does a lot of myth debunking, which I appreciate. Examples of the myths he takes on: that work items need to be of the same size, and that kanban doesn’t have commitments.
- The tone is firm and confident — you definitely know where Daniel stands on any issue — without being strident.
I’ve been helping a team at Asynchrony improve using blocker clustering, a technique popularized by Klaus Leopold and Troy Magennis (presentation, blog post) that leverages a kanban system to identify and quantify the things that block work from flowing. It’s premised on the idea that blockers are not isolated events but have systematic causes, and that by clustering them by cause (and quantifying their cost), we can improve our work and make delivery more predictable.
The team recently concluded a four-week period in which they collected blocker data. At the outset of the experiment, here’s what I asked a couple of the team leaders to do:
- Talk with your teammates about the experiment
- Define “block” for your team
- Minimally instrument your kanban system to gather data, including the block reason and duration
The first two were relatively simple: The team was up for it, and they defined “blocker” as anything that prevented someone from doing work if he had wanted to. “Instrumenting the system” wasn’t as easy as it could’ve been, because the team uses a poorly implemented Jira instance, so they went outside the system and used post-it notes on a physical wall. They then kept a spreadsheet with additional data (duration, reason) to tie the blockers back to their Jira cards.
Over the next four weeks, the team collected 19 blockers, placing each post-it note on the wall in either an “internal” (caused by them) or “external” (caused by something outside the team, including customer and dependent systems) column. We then gathered in a conference room to convene a blocker-analysis session to:
- further cluster the blockers into more discrete categories
- calculate how many days of delay (the enemy of flow!) the blockers caused
- root-cause the internal categories
- find out where to focus improvement efforts
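The clustering and delay-counting steps of such a session can be sketched in a few lines. The records below are hypothetical and are not the team's actual data; the "environment" category in particular is made up for illustration, as is the function name.

```python
from collections import defaultdict


def cluster_delay(blockers):
    """Sum days of delay per (source, category) cluster, largest first."""
    totals = defaultdict(int)
    for source, category, days in blockers:
        totals[(source, category)] += days
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical blocker records: (source, category, days of delay).
sample_blockers = [
    ("external", "dependency", 12),
    ("external", "waiting on info", 5),
    ("internal", "environment", 2),
    ("external", "dependency", 9),
    ("internal", "environment", 1),
]

for (source, category), days in cluster_delay(sample_blockers):
    print(f"{source:8} {category:16} {days:3d} days")
```

Sorting by total delay puts the most expensive cluster first, which is usually where the improvement conversation should start.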
The analysis session was eye-opening. We started with the two main columns (internal, external) and as we quickly discussed each blocker, sub-categories such as “dependency” and “waiting on info” emerged. Within minutes, we were able to see the team’s most egregious blockers and quantify their delay cost.
- internal blockers caused 20 days’ worth of delay
- external blockers caused 147 days’ worth of delay
- the biggest blocker cluster accounted for 86 days of delay
That biggest blocker cluster now allows the team to have a conversation with the customer that goes something like this: “Over a four-week period, we had three blockers in this area. If this continues, that means you have a 75% chance each week of creating a blocker that costs you an average of 29 days. Is this acceptable to you?”
Ultimately, it may indeed be acceptable. But the customer is now aware of the approximate cost of problems and can manage risk in an informed way.
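The figures in that conversation come from simple arithmetic on the biggest cluster: three blockers over four weeks, totaling 86 days of delay. A minimal sketch follows, with a hypothetical function name. Strictly speaking, three blockers in four weeks is an average rate of 0.75 blockers per week, which the conversation above treats as a rough 75% weekly chance.

```python
def blocker_cluster_summary(blocker_count, weeks_observed, total_delay_days):
    """Return (blockers per week, average days of delay per blocker)."""
    rate_per_week = blocker_count / weeks_observed
    average_delay_days = total_delay_days / blocker_count
    return rate_per_week, average_delay_days


# Numbers taken from the biggest blocker cluster described above.
rate, average_delay = blocker_cluster_summary(
    blocker_count=3, weeks_observed=4, total_delay_days=86)

print(f"{rate:.2f} blockers per week, "
      f"{average_delay:.0f} days of delay each on average")
# 0.75 blockers per week, 29 days of delay each on average
```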
For the internal blockers, we conducted a root-cause analysis (using fishbone technique, though I’ll admit my “fish” leaves something to be desired!). So the team can go forward to address both external blockers (through a conversation with the customer) and the internal blockers (through their decisions).
Other lessons learned:
- Some blockers turned out to be simply time spent chasing information about backlog items rather than true blockers of committed work, so the team added “… for committed work” to their blocker definition. (It’s important to understand your commitment point.)
- Depending on how you want to address blockers, you might choose to sort them differently. For example, the team considered sorting its external blockers not by source but by which customer contact was responsible.
I wrote a few weeks ago about the advocacy program, our distributed peer-to-peer continuous-improvement program. One of the important components of the program is autonomy support. But what is that? As Daniel Pink notes in his book Drive:
Researchers found greater job satisfaction among employees whose bosses offered “autonomy support.” These bosses saw issues from the employee’s point of view, gave meaningful feedback and information, provided ample choice over what to do and how to do it, and encouraged employees to take on new projects.
In the advocacy program, autonomy-support meetings are an optional opportunity for employees to meet with executive management to give feedback on how the executive leaders can help the employee realize career goals in the organization. The meeting can be scheduled by the employee’s advocate, who also can be part of the meeting, acting as an intermediary or ambassador for the employee to the manager(s). Multiple managers may be part of the meeting, depending on which ones the advocate and employee feel are vital and able to help.
The dynamic should be one in which the traditional organizational structure is flipped upside-down:
Therefore, rather than the traditional dynamic of the employee “working for” the manager, in the autonomy-support meeting the servant-leader — in this case, the role of executive leader — should have the mindset of “working for” the employee.
A good starting point for the discussion is the “autonomy-support feedback for executive leaders” section of the employee’s review. Basically, it’s whatever the employee needs executive leaders to do in order to do his or her job better or reach goals. This might be a request for a different project or role switch, more time to explore a particular skill or technology or simply a clearer vision or set of expectations. Premised on the executive leader’s commitments to the employee, the employee has the right to ask the executive leader for support in various career-development goals, including timelines for when those things would occur.
Questions that the executive leader might want to ask:
- How can I help you realize your goals in the next year?
- By when would you like me to achieve these things for you?
- In what areas have I failed to help you in the past, and how can I improve?
- What kind of things would help you feel more engaged?
- How can I help smooth your path toward mastery of certain skills?
- What does success look like for you, and how can I help you succeed?