Flight Levels and Aligned Autonomy

Flight Levels is a thinking model for organizational improvement. As Klaus Leopold says, it “helps you find out where in an organization you have to do what in order to achieve the results that you want.” Flight Levels is effective at that because it stresses the idea of leverage and coherence across the multiple strata and teams of an organization.

In doing so, it provides a way to model and marry strategy development and strategy deployment, effectively fostering a way for leadership at every level to take root and flourish. I’ve spoken about this connection previously, but, until now, I hadn’t fully connected the dots: combining Flight Levels with the “aligned autonomy” matrix gives a useful way to visualize where an organization is and to chart a path toward aligned autonomy.

Aligned autonomy is the idea that, rather than the conventional view that alignment and autonomy are opposites, they are actually separate concerns. The insight is credited to the 19th-century Prussian military leader Helmuth Von Moltke and popularized in modern times by Stephen Bungay, who wrote in The Art of Action that

… there is no choice to make (between alignment and autonomy). Far from it, [Von Moltke] demands high autonomy and high alignment at one and the same time. He breaks the compromise. He realizes quite simply that the more alignment you have, the more autonomy you can grant. The one enables the other. Instead of seeing them as the end-points of a single line, he thinks about them as defining two dimensions.

More recently, Henrik Kniberg, in his inimitably accessible style, expanded on the concept by describing the organizational culture types of each quadrant:

In the organizations with which I have worked, the elements of the Flight Levels tend to manifest themselves in particular ways that align to these quadrants:

Let’s take each one in turn.


This quadrant is where the highest-level leaders make little to no distinction between “what and why” and “how.” This is the realm of top-down decision-making: going too far into detail about implementation and, through misguided centralized authority, often preventing the people doing the work from making better decisions about execution. Meanwhile, because of this obsession with the “how,” these leaders are often derelict in their duty to work “up a level” in strategy, where their contributions are most needed. As a result, these are leader-follower organizations, in which people are trained not to take action and must ask permission for any decision of importance.

We might depict their Flight Levels as overlapping or compressed at one level; that is, people who should be developing and evolving strategy are too concerned with the day-to-day operations of teams. To use the flight metaphor, these are people who should be flying at the airplane level but can’t remove themselves from the butterfly-level details. Or, as Bungay explains: “Far from overcoming it, a mass of instructions actually creates more friction in the form of noise, and confuses subordinates because the situation may demand one thing and the instructions say another… trying to get results by directly taking charge of things at lower levels in the organizational hierarchy is dysfunctional.”

Von Moltke saw the same behavior in the military:

In any case, a leader who believes that he can make a positive difference through continual personal interventions is usually deluding himself. He thereby takes over things other people are supposed to be doing, effectively dispensing with their efforts, and multiplies his own tasks to such an extent that he can no longer carry them all out.

The demands made on a senior commander are severe enough as it is. It is far more important that the person at the top retains a clear picture of the overall situation than whether some particular thing is done this way or that.

These organizations find it difficult to scale effectively, because their leadership’s inattention to strategy and intrusive concern with implementation details creates a passive leader-follower culture.

The challenge for these organizations then is to use Flight Levels to encourage higher-level leaders to begin to distinguish between the “what and why” and “how,” and focus on setting “directionally correct” strategy while trusting teams and Level 2 Coordination to execute.


In this quadrant, the concerns of operations, coordination and strategy are variously overlapping, disconnected and/or non-existent. Here we observe:

  • Rampant and invisible WIP
  • Low employee engagement
  • No clear org vision/strategy
  • Siloed, undiscoverable tools
  • Tribal, network-based knowledge
  • Busy but unproductive people
  • Redundant, unshared work

Work in these organizations is perhaps best described by Barry O’Reilly when he says that “When people lack clarity they will optimize for what is in their control, output that is attainable to them but not necessarily the outcomes you want to produce.” To the extent that any measurements exist, activity-based metrics reign here.

The challenge for these organizations is perhaps to simply acknowledge the possible existence of Flight Levels and their relationship to each other. The simple but daunting task of making work visible is a necessary first step.


This is the realm of disconnected teams. They have broad autonomy but little awareness of their relationship to strategy and often of their relationship to each other and the wider end-to-end value stream. In some cases, they do have their own Level 3 Strategy, but they are not unified by a common organizational strategy; they function more as warring fiefdoms under a single name. Sometimes, this organizational culture is the outcome of growth, as we might see in the progress of a startup to a scale-up, in which leadership hasn’t matured commensurately with the new needs of the organization. But it can also occur in the context of a bloated Authoritarian-Conformist organization, whose strictures are too unwieldy to control and where leaders with some authority attempt to break free, making their own plans because it’s the only way they can get work done (e.g., a grey market of tools). In both cases, the work is disconnected from strategy. The organization lacks an ability to see itself from the 30,000-foot view.

People in these organizations generally make lots of decisions on their own, until the decision is somehow related to understanding strategy. Since leadership either keeps strategy closely held or, as is more often the case, doesn’t really have a strategy, this can cause tension, frustration and disengagement, as connection with higher-level purpose is missing. This often extends into career development, as well.

The challenge for organizations in this quadrant is to instantiate Level 2 Coordination and Level 3 Strategy. Starting points can be to identify desired organizational outcomes (Level 3), shift attention to end-to-end metrics (Level 2), make work visible and to use yokoten (lateral deployment) to create awareness.


This is the Leader-Leader ideal, which is fostered by clear delineation of concerns at the operational, coordination and strategy levels of Flight Levels. The lightweight but comprehensive modeling of these concerns in Flight Levels provides enough separation of “what and why” from the “how” for people to act autonomously but aligned toward the organization’s desired outcomes:

  • Intent is expressed in Level 3 Strategy in terms of what to achieve and why.
  • Autonomy in Level 1 Operational gives freedom in the actions taken to realize the intent; in other words, what to do and how.

The “just-enough” strategy ensures that empowered teams and individuals are working on the right things, not merely working on things the right way. As one neutral observer of Von Moltke’s manifestation of the Innovative-Collaboration organization noted:

Every German subordinate commander felt himself to be part of a unified whole; in taking action, each one of them therefore had the interests of the whole at the forefront of his mind; none hesitated in deciding what to do, not a man waited to be told or even reminded.

— The Art of Action


It’s important to note that many organizations aren’t monolithically characterizable in a single quadrant, nor do they always stay in one quadrant over time. That is, certain groups in an organization may be Entrepreneurial-Chaotic while others are Authoritarian-Conformist. Or an organization may be generally Authoritarian-Conformist but have moments when it exhibits Entrepreneurial-Chaotic behavior.

As a result, it’s helpful to pay attention to those specific behaviors and move in the general direction toward aligned autonomy using Flight Levels, realizing that the organization may change in fits and starts.


How to Forecast Before You Even Start

One question that people who are friendly to the probabilistic-forecasting mindset often ask is “I understand how to forecast with delivery data once the project is underway, but how do I forecast before I even start?” Assuming that you absolutely need to know that answer at that point — heed Douglas Hubbard’s advice* — a simple probabilistic way to do it is through reference-class forecasting. Conceptually.org has as good a definition of it as anyone:

Reference class forecasting is a method of predicting the future by looking at similar past situations and their outcomes. Kahneman and Tversky found that human judgment is generally optimistic due to overconfidence and biases. People tend to underestimate the costs, completion times, and risks of their actions, whereas they tend to overestimate the benefits. Such errors are caused by actors taking an “inside view”, assessing the information readily available to them, and neglecting other considerations. Taking the “outside view” — looking at events similar to ours, that have occurred in the past — helps us reduce bias and make more accurate predictions.

An easy metaphor for reference-class forecasting is home sales. We’re trying to forecast something that’s complex — lots of market dynamics involved in something that’s essentially never been done before (how much someone will pay for this particular house at this particular point in time). We use a variety of economic and housing data — zip code, square footage, construction, features — to create a reference class of “comparables.” (If you really want to geek out, see Zillow’s Forecast Methodology.)

Most organizations have delivered some number of projects — maybe not this exact project in this exact tech stack with this exact team, but with attributes that are comparable to it.

An Example

Here’s an example. A company was considering a new initiative. They needed to know approximately how long it would take (time being a proxy for cost but also market opportunity cost). They took the traditional inside-out approach — attempting to predict how long something will take by adding up all known constituent tasks — and estimated it at about a year. This inside-out approach being subject to the Planning Fallacy, we decided to also try a reference-class forecast.

  1. We took a list of the 50 most recent projects, going back a few years. We needed only a pair of dates for each one: when the business officially committed to the project (confirming this commitment during the “fuzzy front end” is often the trickiest bit) and when it went to production.
  2. We then categorized each project by meaningful traits, like project type (legacy or greenfield), team size (small, medium, large) and dependencies (many or few).
  3. We viewed the data on a scatterplot chart.

Unfiltered Forecast

You’ll always have a tension between needing enough data (small sample sizes can be distortive) and relevant data. The good thing about reference-class forecasting is that it’s inexpensive (and better) to run multiple views. First, we ran an unfiltered forecast — all 50 projects.

Unfiltered reference-class forecast

This yielded a high-level view of how long projects take overall. Half the time (50th percentile), projects finish within 383 days, or a bit more than a year. But that leaves a lot of projects — the other half! — that take longer. How much longer depends on the level of confidence we seek:

  • 50% of the time: in 383 days (a little more than a year)
  • 70% of the time: in 509 days (1.4 years)
  • 85% of the time: in 698 days (nearly 2 years)
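Reading off those confidence levels is just a matter of sorting the historical durations and taking the value at each percentile. Here is a minimal sketch; the durations are illustrative stand-ins, not the article's actual 50-project dataset:

```python
# Reference-class percentiles: sort historical durations and read off the
# value at each confidence level (nearest-rank method). The durations
# below are illustrative, not the actual 50-project history.

def percentile(durations, pct):
    """Return the duration at or below which pct% of projects finished."""
    ordered = sorted(durations)
    rank = -(-pct * len(ordered) // 100)  # ceiling division = nearest rank
    return ordered[max(rank, 1) - 1]

past_project_days = [120, 200, 250, 310, 383, 410, 460, 509, 600, 698]

for pct in (50, 70, 85):
    print(f"{pct}% of projects finished within "
          f"{percentile(past_project_days, pct)} days")
```

The nearest-rank method keeps the answer an actual observed duration, which makes the forecast easy to explain: “85% of our past projects finished within this many days.”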

Filtered Forecasts

Of course, this new project will be different (as always!), so not all of those projects are really relevant. We therefore filter based on characteristics similar to the project we’re forecasting: It’s a legacy project, with a small team and many dependencies. We had 12 such projects in that reference class. Its confidence intervals are indeed different from those of the entire set:

Filtered reference-class forecast
  • 50%ile: 607 days (1.7 years)
  • 70%ile: 698 days (1.9 years)
  • 85%ile: 776 days (a little more than 2 years)

Those numbers were larger than those for the whole set, so that was disappointing. Maybe we can look at the problem a bit differently. What about legacy projects with many dependencies that were staffed by medium-sized (rather than small) teams? (Perhaps what Troy Magennis said about reducing the effect of dependencies with slightly larger teams was right!) We had 11 such projects:

Wow! That is quite a different story. I guess Troy was right!

  • 50%ile: 303 days (less than a year)
  • 70%ile: 400 days (a little more than a year)
  • 85%ile: 509 days (1.4 years)
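Building a filtered reference class like these amounts to tagging each past project with its traits and selecting the matching subset before reading off percentiles. A sketch, where the field names and example rows are assumptions for illustration, not the real dataset:

```python
# Filtered reference classes: tag each past project with traits, then
# keep only the projects whose traits match the new initiative.
# Field names and rows here are illustrative assumptions.

projects = [
    {"days": 607, "type": "legacy", "team": "small", "deps": "many"},
    {"days": 303, "type": "legacy", "team": "medium", "deps": "many"},
    {"days": 150, "type": "greenfield", "team": "small", "deps": "few"},
    # ... the rest of the 50-project history
]

def reference_class(history, **traits):
    """Return the durations of projects whose traits all match."""
    return [p["days"] for p in history
            if all(p.get(k) == v for k, v in traits.items())]

legacy_small = reference_class(projects, type="legacy", team="small", deps="many")
legacy_medium = reference_class(projects, type="legacy", team="medium", deps="many")
print(legacy_small, legacy_medium)
```

Because each filtered view is just a list comprehension over the same history, running three or four candidate reference classes costs essentially nothing, which is what makes comparing them so cheap.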

We now have three different reference-class forecasts to use. They at least give us some options to inform our thinking (especially as regards team-sizing decisions). Knowing which reference class to use is more art than science, so I like to consider a few options rather than locking into one (especially the one that paints the rosiest picture!).

Once we do get started with the project, we will of course want to do probabilistic forecasting with actual “indigenous” delivery data. But before we even start, we have informed ourselves from the outside-in — averting the Planning Fallacy — on when we might expect this particular new initiative to be done.

* Hubbard says essentially “Of course you need to estimate development costs when making a decision about IT investments. However, you don’t usually need to reduce the uncertainty about those costs to make an informed decision. Reducing the uncertainty about the utilization rate and the likelihood of cancellation of a new system is much more important when deciding how to spend your money.”

Strangler Pattern… for Estimating?

Martin Fowler long ago popularized the metaphorical “Strangler Pattern” (since updated to “Strangler Fig Pattern”) as a more graceful and less risky way to rewrite an existing system. He wrote of the Australian strangler fig plant:

They seed in the upper branches of a tree and gradually work their way down the tree until they root in the soil. Over many years they grow into fantastic and beautiful shapes, meanwhile strangling and killing the tree that was their host. This metaphor struck me as a way of describing a way of doing a rewrite of an important system… An alternative route [to all-or-nothing big-batch replacement] is to gradually create a new system around the edges of the old, letting it grow slowly over several years until the old system is strangled.

When introducing organizations to probabilistic forecasting — which I simply describe as answering the question “When will it be done?” with less effort and more accuracy — the move from traditional estimating can often seem like a similar problem to that of Fowler’s legacy application swap: It’s a big change, fraught with risk, affects a lot of people, and we’re not entirely comfortable with or sure about how it works.

For these reasons, and because most sustainable change is best effected via gradual, safe steps, I guide teams to apply what is essentially the strangler pattern: Keep what you have in place, and simply add some lightweight apparatus around the edges. That is, continue to do your legacy estimating process — estimation meetings, story points, Fibonacci numbers, SWAG and multiply by Pi, whatever — and alongside that, let’s start to track a few data points, like commit and delivery dates.

Kanban Method encourages us to “start with what you do now,” and one of the benefits of this approach (besides being humane and not causing unnecessary emotional resistance) is that it helps us understand current processes. It’s quite possible that a team’s current estimating practice “works” — that is, it yields the results that they’re seeking from it. If that goal is to provide a reliable sense of when something will be done, doing the simple correlation of upfront estimate to actual elapsed delivery time will answer that question (spoiler: Most teams see little to no correlation). That in itself can help people see whether they need to change: It’s the Awareness step of the ADKAR model. Continuing existing practice while observing it also helps us decouple and filter out the stuff that is valuable, such as conversation that helps us understand the problem and possible solution. NoEstimates, after all, doesn’t mean stopping all of the high-bandwidth communication that happens in better estimating meetings.

Meanwhile, we’re collecting meaningful data — the no-kidding, actual dates that show the reality of our delivery system. These are the facts of our system, as opposed to how we humans feel about our work, and as one observer famously noted, “Facts don’t care about your feelings.” But facts and feelings can “peacefully” live alongside each other for a time, just as the strangler fig and host tree (before the fig kills the host, of course). You can then start running Monte Carlo-generated probabilistic forecasts in the background, which allows you to compare the two approaches. If the probabilistic forecast yields better results, keep using it and gradually increase its exposure. If, for some reason, the legacy practice yields better results, you may choose to “strangle the strangler.” Most groups I work with end up appreciating the “less effort and more accuracy” of probabilistic forecasts, and after a time start asking “Why are we still doing our legacy estimating practice?” At that point, the strangler fig has killed the host, and all that remains is to safely discard the dead husk.
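A Monte Carlo forecast of the kind described can be sketched in a few lines: repeatedly sample historical weekly throughput until a backlog is exhausted, then read confidence levels off the simulated outcomes. The throughput and backlog numbers below are illustrative assumptions, not real team data:

```python
import random

# Monte Carlo "when will it be done?": sample past weekly throughput to
# simulate finishing a backlog, many times over, then read percentiles
# from the simulated outcomes. All numbers are illustrative.

def forecast_weeks(weekly_throughput, backlog_size, trials=10_000, seed=42):
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    outcomes = []
    for _ in range(trials):
        remaining, weeks = backlog_size, 0
        while remaining > 0:
            remaining -= rng.choice(weekly_throughput)  # sample one past week
            weeks += 1
        outcomes.append(weeks)
    outcomes.sort()
    p50 = outcomes[len(outcomes) // 2]        # median outcome
    p85 = outcomes[int(len(outcomes) * 0.85)]  # 85th-percentile outcome
    return p50, p85

p50, p85 = forecast_weeks([2, 3, 3, 4, 5, 1, 3, 2], backlog_size=30)
print(f"50% confidence: {p50} weeks; 85% confidence: {p85} weeks")
```

The inputs are exactly the dates you have been quietly tracking: items completed per week falls straight out of commit and delivery dates, so the forecast runs in the background with no extra ceremony.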

So to summarize the strangler pattern for estimating:

  1. Keep doing your legacy estimating practice.
  2. As you do, track delivery dates (commit point through delivery point).
  3. Run a correlation between the two sets of numbers (upfront estimates and actual delivery times).
  4. Continuously run probabilistic forecasts alongside the legacy estimates.
  5. Check the results and either keep the new approach or revert to the legacy.
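Step 3, the correlation check, is simple arithmetic once you have the paired numbers. A sketch with a hand-computed Pearson coefficient; the estimate and delivery figures are invented for illustration, so substitute your team's real history:

```python
# Pearson correlation between upfront estimates and actual delivery times,
# computed by hand to avoid dependencies. The paired numbers are invented
# for illustration.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

estimates = [3, 5, 8, 13, 5, 8, 21, 3]         # e.g., story points
actual_days = [12, 9, 30, 14, 25, 11, 20, 28]  # commit point to delivery

r = pearson(estimates, actual_days)
print(f"Pearson r = {r:.2f}")  # a value near 0 means estimates predict little
```

A coefficient near zero is the Awareness moment: it shows, with the team's own data, that the upfront estimates are not predicting elapsed delivery time.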

As with most knowledge work in a VUCA world, whether it’s coding a new system or introducing new ways of working, reducing batch size — of which the Strangler Pattern is a type — offers more flexibility and reduced risk. If you’re interested in a better way of answering the question “When will it be done?” but need to do so incrementally and safely, the strangler pattern for estimating may be an idea to plant. (Sorry, couldn’t resist.)

Agile-Consulting Radar Update

[In the spirit of the ThoughtWorks Tech Radar, I publish my own personal realtime Agile-Consulting Radar. In addition to expanding the blips on the radar, you can find more detail on each in my editorialized glossary.]

This update has a plethora of new tools — and a few old practices — to help organizations and teams work in our current remote-first — or is it remote-only? — and ongoing VUCA world. And it contains some holds — and even avoids! — for long-held agile sacred cows, like user stories and feature-based road maps, sure to provoke some raised eyebrows.

Ubiquitous Remote Work

For the last five months, organizations have been forced into remote work whether prepared or not, and the foreseeable future will require facility with ways of working that are fit for this purpose. This has predictably led to a burst of new tools into the market, many of which are worth a try, and perhaps unpredictably to a rediscovery of venerable practices, like the Core Protocols, Team Agreements, Personal Kanban, and Pomodoro Technique, that enable remote and asynchronous work. Our remote world also requires that we sense and respond to how our teams and colleagues are doing, so competencies like anzeneering and facilitation, along with metrics like Total Motivation and Engagement, are key.

Rethinking Conventional Agile

From velocity to user stories, backlogs to points-based estimation, no agile cow is too sacred to be slapped with an avoid label. It may be time to refactor your agile work processes with the original intent of the Agile Manifesto in mind. That includes an orientation toward value, which is why traditional feature-based roadmaps are out and outcome-based roadmaps are in. The agile community has perhaps finally come to grips with our environment, namely that we typically are working in complex rather than complicated domains, which is why competencies like Experiment Design and Systems Thinking and Sensemaking are blips.

Business Agility

As Klaus Leopold writes, “Agility of an organization is not about having many teams,” but “agile interactions between the teams.” To that end, the radar includes a few blips to support organizational or business agility. These are practices, like Operations Reviews, Leadership at Every Level and Flight Levels — and tools like X Matrix — that create aligned autonomy and connect action to strategy throughout all levels of the organization. Speaking of all levels, the radar also contains a couple of items related to scaling, including guidance for unscaling or descaling.

It’s About Flow

Quite simply: Focus on practices, competencies, metrics and tools that enable flow: Flow management, Throughput/Delivery Time/WIP, iterations as checkpoints (rather than planning boxes) and blocker clustering. Avoid practices and metrics that inhibit flow, such as multitasking, high utilization and unlimited WIP.


Stop writing stories, start validating working software

Barry O’Reilly exhorts today’s leaders to “break the cycle of behaviors that were effective in the past but are no longer relevant in the current business climate, and now limit or may even stand in the way of your success.” After more than 15 years of writing, “refining,” “grooming,” estimating, documenting and coaching people about user stories, I believe it’s time to unlearn them.

I realize that’s a big statement and likely to be not a little controversial. But in a vast and sad irony, user stories have become the heavyweight documentation and process that they were meant to replace. Indeed, as Kent Beck shared, “Somebody came up to me and said, ‘We want to do software development but we just can’t stand all this ceremony and this Agile stuff. We just want to write some programs.’ Tears came into my eyes… How can it be that we’re right back where we were twenty years ago?”

Not unlike points estimation, user stories have taken on a life that I’m sure the manifesto signers never intended, and, as Beck alludes, was precisely what they were hoping to escape. As Martin Fowler has remarked, “a lot of what is being pushed is being pushed in a way that, as I said, goes really against a lot of our precepts.” My problem isn’t with user stories per se; it’s with how we use — push — them in practice. If we are dealing with an Agile Industrial Complex, then user stories are the weapons that convince us that we can’t be safe without them, that we simply need more of, and that themselves become the gauge of success rather than the things they’re meant to help us achieve.

But it’s amazing what happens when you strip away the accrued behaviors and get back to the heart of agile: I’ve been working on a side project building a web app with a friend, and I’ve been able to experience how refreshing and dynamic software creation can be when we free ourselves from the methodological fetters that we inherit and reinforce, of which stories are one (and one that I’ve been guilty of perpetuating).


Curiously, user stories are nowhere to be found in the agile manifesto or its principles:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

Let’s focus on a few of the principles in particular and how I see the current practice of user stories as actually inhibiting agility and transgressing the spirit of the principles.

Working software is the primary measure of progress.

Quick: What metrics does your software delivery team and/or organization track? I’m guessing that nine out of 10 of you will say some kind of output-related metric whose origins are based in user stories: story points, velocity or even count of stories.

In my side-project experience, my teammate (who, by the way, is remote and on another continent) and I never even countenanced the idea that any of those story-based concerns would matter, so much so that they don’t exist for us. We care chiefly about building something that people will use and enjoy using.

How might we focus on working software? To start, replace the daily standup — a Mos Eisley spaceport of anti-patterns rooted in stories* — with micro-demos (e.g., every day or every-other day). Then constantly validate that you’re building the right thing by letting people use it.

If you don’t think this is a problem for teams today, consider: How much time (meetings), effort and conversation do you and your team spend in the business of stories as compared to the business of validating whether you’re solving business problems and realizing outcomes?

Simplicity–the art of maximizing the amount of work not done–is essential.

Speaking of time, how much time in dehumanizing meetings, Jira jockeying and estimation sessions have you endured under the guise of agile? Many organizations and teams have unnecessarily complicated their delivery systems with over-engineered story-tracking tools, incessant refinement meetings and elaborate processes. Try working without stories, and you might find you have a lot fewer meetings — and more time to do value-added work.

Business people and developers must work together daily throughout the project.

User stories, far from being the linking point between business and IT, have instead become another way to pass the buck onto IT without any business responsibility or involvement, creating a collective wedge of miniature contracts. IT needs involvement from the business? We’ll give you a product owner, who will in turn spoon-feed you requirements, er, stories. I think crossing the business-IT divide is really the hardest part of agile delivery, but it’s one that will be exposed when you remove stories. It will be a forcing function to align business and IT.

The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.

I, like many agile coaches, have for years taught the “Three Cs” of user stories: Card, Confirmation, Conversation. The card is a promise for a conversation! The only problem is that in practice, the card replaces the conversation. When stories take on a life of their own (e.g., in ticket systems), they become an impersonal interface not unlike the tomes of spec docs that we used once upon a time.

Conversation in real life is spontaneous, organic and natural. It flows from trustful relationships with, and proximity (even if remote) to, other humans. It’s possible to do this in delivery teams, but it’s inhibited when conversations are reduced to scheduled sessions and transactional instructions. And these “conversations” betray a mindset that says the product owner knows best and that developers need to seek permission, rather than product people being comfortable stating the problem and the outcomes we need, and letting smart people creatively solve problems. (John Doerr captures the essence of this in his awesome expression missionaries not mercenaries, which Marty Cagan elaborates on.)

The first value of the manifesto is about Individuals and Interactions over Processes and Tools. User stories and all of their trappings have become the epitome of process and tools. Let’s free ourselves to interact as people.

If not Stories, then what?

It’s actually easier than you might expect, though when you think about it, it shouldn’t surprise you. Software makers (and I use that term broadly to mean programmers, designers, quality advocates — anyone who enjoys creating software) are usually creative problem solvers. Give them the problem, and they’ll not only discover the solution, they’ll enjoy the experience of freedom, knowing the goal they’re aiming for. And they’ll create even better solutions than what we might’ve originally conceived (and codified on a user story).

Rather than define my point of view in the negative as only something you should not do (I’m not starting the #NoStories hashtag), I’ll propose what I think teams should do. All you need are some enabling constraints:

  • clear product vision and goals
  • trust and psychological safety
  • low level of work in progress
  • tight feedback loops

In the case of my side project, all four of those were in place:

  • Vision: Vision isn’t a one-time slide in a deck. My teammate and I talked when we started about what we wanted to accomplish and have reiterated and progressively elaborated on that vision. By living the product vision each day, you really can live without user stories, but it does take clarity from the product person and enrollment from the team.
  • Trust and safety: My friend (actually he was nearly a stranger when we started, but we have become friends) and I had an initial conversation to understand each other’s personal goals and interests. Ever since, we’ve been in regular contact and have reinforced the mutual trust and psychological safety — no punishments, no incentive plans, no minimum “average velocity” — nothing other than helping each other realize our goals.
  • Low level of work in progress: My teammate sets his plan of attack for the day without needing my permission, knowing that whatever he works on, he’ll focus on building something that I can see. This is part of the creative freedom that developers need: Freed from a product owner’s central plan, chopped into too-discrete pieces that often don’t make sequential sense from a development standpoint, my friend is able to follow his creative instincts and engage in flow.
  • Feedback loops: Since my teammate is constantly building and releasing — and because he knows that I’m not expecting perfection on the first try — we casually and frequently exchange feedback, not only validating that we’re on the right track but also allowing what has been built to generate new ideas and learn from “happy accidents.” What we are creating is better than any user stories could’ve guided us toward.

I concede that it’s possible for teams to succeed with user stories (I’ve been on some of those teams), in which case this call to unlearn isn’t for you. But taking a clear-eyed view of the state of software delivery today, I’m concluding that, in general, user stories are inhibiting agility rather than enabling it.

I propose breaking out of the single-loop learning cycle that attempts to improve how we do user stories and, rather, taking a double-loop learning approach to questioning whether we should use stories in the first place. Free people to do their best work through humanizing patterns like high-bandwidth conversation, psychological safety and flow. This is how creative software building works.  I’m becoming convinced that people would enjoy their work more if we let them work in this way, and that ultimately organizations will be able to create the kinds of software products that are even better than they envision and realize the outcomes that they’re hoping to achieve in the first place.

* Story-based anti-patterns:

  • This story isn’t estimated.
  • There’s no way that’s a five.
  • Let’s schedule a meeting to refine those stories.
  • These acceptance criteria are in the wrong format.
  • The PO is out today; you’ll have to wait.
  • Can you update your cards, please?
  • Tell me what to type in the description.
  • Can you make a story for that before I do it?
  • Is this a bug or a feature?
  • This story won’t fit in the sprint.
  • I’m not sure we can commit to that.
  • I think we need a spike before we do that.
  • That’s a task, not a story.
  • We can’t work on that until it’s been refined.
  • Can you please add the details?
  • What am I going to accomplish today? Story #654 and maybe start on Story #679.
  • I can’t estimate that if I don’t know the solution.

Don’t mistake adoption patterns for maturity patterns

“Kanban is for teams that outgrow Scrum.”

“We can’t limit our WIP until we get used to how we work.”

“You can’t start using NoEstimates until your teams have stabilized.”

“Mob programming is an advanced agile practice.”

I often hear the above statements, and many like them, even from experienced agilists. Many in the agile community have a view of certain practices as “advanced” or “for mature teams only.” They seem to base that assumption on the sequence of how they’ve seen them adopted. But it’s possible that they’re mistaking adoption patterns for maturity patterns.

It’s a type of questionable-cause scenario (aka causal fallacy, correlation does not imply causation) that essentially says “Every time I see a team start using <practice X>, it’s after they’ve tried <practice Y>; therefore, it must be because they need to do <practice Y> first.” This makes about as much sense as observing someone who has been eating fast food all of his life and then switches to a healthier diet, and assuming that eating fast food is a necessary step to eating healthy. Like fast food, just because a practice is popular doesn’t mean it’s the best thing for you. Why not start healthy?

To be sure, some practices do require a level of competency before attempting, such as coaching or continuous delivery. But I encourage people to question their assumptions about supposedly advanced practices, which might actually be easier than the things they assume are necessary prior steps.

One such practice is probabilistic forecasting. Probabilistic forecasting uses your delivery data — how long it actually takes you to deliver work — to produce a range of possible outcomes, with a probability attached to each point in the range, telling you when future work might complete. It requires no special meeting, no poker cards, no debates about what a “3” is. It doesn’t require you to “same-size” your stories, to have a certain level of predictability or even to be able to deliver work in two weeks. Yet because teams tend to turn to it only after despairing of their points-based estimating practice, it has the connotation of an advanced practice.
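To make the mechanics concrete, here is a minimal Monte Carlo sketch of a throughput-based forecast. All numbers (the throughput history, the number of days) are hypothetical, and this is one simple way to do it, not the only one:

```python
import random

def forecast_items_by(days_remaining, daily_throughput_samples, trials=10000):
    """Monte Carlo forecast: simulate how many items might be finished in the
    remaining days by resampling from historical daily throughput."""
    outcomes = []
    for _ in range(trials):
        done = sum(random.choice(daily_throughput_samples)
                   for _ in range(days_remaining))
        outcomes.append(done)
    outcomes.sort()

    def at_least(confidence):
        # The item count that `confidence` percent of simulated trials
        # met or exceeded (a conservative lower bound, not a promise).
        index = int((1 - confidence) * trials)
        return outcomes[index]

    return at_least

# Hypothetical history: items finished per day over the last 20 working days.
history = [0, 2, 1, 0, 3, 1, 2, 0, 1, 4, 2, 1, 0, 2, 3, 1, 1, 0, 2, 1]
at_least = forecast_items_by(days_remaining=60, daily_throughput_samples=history)
print(f"85% confident we'll finish at least {at_least(0.85)} items")
print(f"95% confident we'll finish at least {at_least(0.95)} items")
```

Note that there is no estimation ceremony anywhere in this loop: the only input is what the team has actually delivered.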

Why? Partly for the simple reason that few Scrum trainers know what it is, much less understand how to do it. Plus, organizations that buy Scrum and SAFe training tend to want to put into practice what they’ve been taught. This brings up another logical fallacy, the sunk-cost fallacy. But I don’t need a Fibonacci sequence to estimate the number of times that I’ve seen teams get wrapped around the axle of relative estimation, fail to give their stakeholders a reasonable answer to the question “when will it be done?” and get beaten up (or beat themselves up) over how bad they are at estimation. (Spoiler alert: Everyone is bad at estimating in a complex environment.) To return to the fast-food analogy: Why wait until your arteries clog to change your diet?

Perhaps the all-time masquerader of “advanced” practices is kanban. As I’ve noted, kanban has many connotations; the “advanced” version is that of flow-based or “iteration-decoupled” delivery. The idea is that teams must first go through Scrum — with all of their replenishment, delivery, demo and retrospection cadences tightly coupled to sprint boundaries — and only then “graduate” to kanban. Quick show of hands: How many immature teams that you’ve worked with are able to build potentially shippable product in two weeks? Doing everything in a sprint doesn’t magically make all the work fit in one, especially if you’re accustomed to yearly releases.  The number of anti-patterns and prevarications that occur from starting teams this way (“QA sprint,” anyone?) is in my opinion one of the major reasons that “transformations” rarely work.  Far from being a basic practice, highly coupled sprints actually require simultaneous competency across a number of technical (e.g., real continuous integration) and business practices (e.g., thinly slicing user stories) to pull off. 

Another connotation of kanban — using work-in-progress limits to create a pull system — is perplexingly saddled with the advanced tag. Here’s how advanced WIP limits are: Agree to not start work any faster than you complete work. It’s one of the few practices in which you actually stop doing something rather than start. Again, simply because you didn’t hear about it until after a niche group started talking about it doesn’t make it any less worth trying now (See: Office Space).

Speaking of limiting WIP, how many organizations resort to big scaling frameworks before mastering that rather basic practice? (Hint: Limit WIP at the highest level of your organization and you may never have to install a big scaling framework.)

And the list goes on. Mob programming? I recently worked with a team new to agile engineering practices (they had little understanding of test-driven development) who decided that they wanted to mob program for a couple of weeks before going off on their own individual work. It worked like a charm and gave them an opportunity for fast, shared learning and the confidence to proceed. Far from being an advanced practice, mobbing required only the ability to take turns (something we all learned in kindergarten).

To borrow another analogy: Many parents assume that the best way to teach a child to ride a bicycle is with training wheels. That’s how I learned and that’s what is being sold, so that must be the best way! But many parents are finding that balance bikes are an easier way to build confidence and competency. Like training wheels, which “are counterproductive because children become reliant on them,” some agile practices create unhelpful habits or inhibit learning. As the agile manifesto encourages us, “We are uncovering better ways of developing software…” I encourage you to question your assumptions (or your consultant’s) about just how “advanced” certain practices are. Don’t mistake adoption patterns for maturity patterns. And don’t let anyone tell you that something potentially good and useful now requires you to be more mature.

NoEstimates, Applied

[A colleague recently posed the following rhetorical scenario about how to deal with a situation that presumably requires upfront estimates, to which I responded internally. I’m republishing it here, as I think it’s a common question, especially for service providers who are perhaps skeptical and/or misinformed about NoEstimates techniques and philosophy.]

A market opportunity demands that I have a gismo built by a certain date.  If delivered after that date the gismo development has absolutely no value because my competitor will have cornered the gismo market. If I miss the date, Gismo development becomes sunk cost. I’m looking to hire your top-notch development team.  Before I do I need to know the risk associated with developing gismos by the compliance due date to determine if I want to invest in gismo development or place my money elsewhere into other opportunities.  I need to understand the timeline risk involved in hitting that date.  Your team has been asked to bid for this work.  It is a very, very lucrative contract if you can deliver the gismos no later than the due date.  Will you bid? How will you convince me that I should choose your team for the contract? How will you reduce my risk of sunken cost of gismos?

Let me take each of these questions in turn, since each is important and related but independent of the others.

Will you bid?

This depends on factors like potential upside and downside. Given the uncertainty, I would — in the spirit of “Customer collaboration over contract negotiation” — probably try to collaborate with you to create a shared-risk contract in which upside and risk are shared roughly equally between us. Jacopo Romei has written and spoken on this topic much more comprehensively, eloquently and entertainingly.

How will you convince me that I should choose your team for the contract?

This is relatively orthogonal (or should be) to the question about estimating, though the subtext — that the appearance of certainty somehow credentializes someone — is commonly conflated with ability to deliver. In a complex, uncertain world, you probably shouldn’t trust someone who makes promises that are impossible to follow through on.

I would “convince” you by creating a trustful relationship, based on how fit for purpose my team were for your needs. Though I wouldn’t call it “convincing” so much as us agreeing that we share similar values and work approach. Deciding to partner with my firm is a bit like agreeing to marry someone: We conclude that we have enough shared goals and values, regardless of the specifics, and trust each other to work for our mutual benefit. For instance, two people may agree that they’d both like to have children someday; however, if in the unfortunate scenario that they can’t, it needn’t invalidate the marriage.

How will you reduce my risk of sunken cost of gismos?

This is probably the question that most relates to the question of understanding when we’ll be done. In addition to the shared-risk model (see above), we have a few techniques we could employ:

  • Front-load value: Is it possible to obviate an “all or nothing” approach by using an incremental-iterative one? Technical practices like continuous-delivery pipelining help mitigate the all-or-nothing risk. Will delivery of some gismos be better than none?
  • Reference-class forecasting: Do we have any similar work that we’ve done with data that we could use to do a probabilistic forecast?
  • Two-stage commitment (Build a bit then forecast): Is it possible to “buy down” some uncertainty by delivering for a short period (which you would pay for) in which we could generate enough data to determine whether you/we wanted to continue?

Reference-class forecasting: Reference-class forecasting is “a method of predicting the future by looking at similar past situations and their outcomes.” So do we have any similar work that we’ve done with data that we could use to do a probabilistic forecast? If we do, I would run multiple different forecasts to get a range of possible outcomes. My colleague’s scenario is about building a software gismo, so if we’ve built other things like this gismo, we would use that past data. Maybe we’ve done three similar kinds of projects. The scenario is a fixed-date problem: Could we deliver by a date (let’s say it’s Christmas)? Here are the forecasts we would run with data from previous work:

Project A (forecast from https://actionableagile.com/)

Project B (forecast)

Project C (forecast)

Now we can pick our level of confidence (that is, how much risk we’re comfy with) and get a sense of what’s achievable. Say we’re very conservative, so let’s use the 95th percentile:

  • Project A: at least 584 work items
  • Project B: at least 472
  • Project C: at least 891

So, based on three actual delivered projects, we have a range telling us that, very conservatively, we could deliver at least 584 things between now and Christmas. Furthermore, we might also know that each of these projects required some number of work items (300? 2000?) to reach an MVP. That info would then help us decide whether the risk were worth it.
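The go/no-go decision reduces to a simple comparison: the conservative forecast of what we can deliver by the date versus what each reference project needed to reach an MVP. The forecast counts come from the projects above; the MVP sizes below are hypothetical, purely for illustration:

```python
# 95th-percentile ("very conservative") item counts we forecast we can
# finish by Christmas, per reference project (from the forecasts above).
forecasts_by_christmas = {"A": 584, "B": 472, "C": 891}

# Hypothetical: items each reference project needed to reach an MVP.
items_to_mvp = {"A": 430, "B": 510, "C": 700}

for project, can_do in forecasts_by_christmas.items():
    needed = items_to_mvp[project]
    verdict = "likely reachable" if can_do >= needed else "at risk"
    print(f"Project {project}: forecast {can_do} vs MVP {needed} -> {verdict}")
```

If most reference projects land on the “at risk” side at a high confidence level, that is a signal to decline, renegotiate scope, or use the two-stage commitment below to buy more certainty.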

Two-stage commitment (Build a bit then forecast): Is it possible to “buy down” some uncertainty by delivering for a short period (which you would pay for) in which we could generate enough data to determine whether you/we wanted to continue? So maybe we’ve never done anything like this in the past (or, as is more likely the case, we have and were simply too daft to track the data!). The two-stage commitment is a risk-reduction strategy to “buy some certainty” before making a second commitment. Note that this approach protects both the supplier and the buyer.

In this case, we would agree to work for a short period of time — perhaps a month — in order to build just enough to learn how long the full thing would take. For the sake of easy math, let’s say it costs $1m each month for a team. Would our customer be willing to pay $1m to find out if he should pay $8m total? Never mind the rule of thumb that says if you’re not confident that you will make 10X your investment, you shouldn’t be building software. Most smart people would want to hedge their risk with such an arrangement. This approach lets us run a forecast as soon as we’ve delivered as few as 10 things; here’s an example of the forecast:

Example forecast after delivering 10 items

Same idea; we are now basing this on the data from the actual system that we’re building, which is even better than the reference-class approach. Note also that by continually reforecasting using data we could decide to pull the plug and limit our losses at any time, whether that is a month in, two months, etc.

Better questions to ask

People propose rhetorical scenarios like this one with a certain frame. Getting outside of that frame is one of the first challenges of this new way of thinking. It’s not that NoEstimates proponents don’t care about when something will be done (rather, assuming that the information is helpful, it’s just the opposite — we use techniques that rightly portray non-deterministic futures) or are flippant with respect to cost. (For those times when I have come across that way, I humbly apologize.) Rather, we want to offer a different thinking model, one that asks different questions in order to reframe the conversation and bring us to a more helpful way of approaching complex problems, like:

  • In what context would estimates bring value, and what are we willing to do about it when they don’t? (Woody Zuill)
  • How much time do we want to invest in this? (Matt Wynne)
  • What can you do to maximize value and reduce risk in planning and delivery? (Vasco Duarte)
  • Can we build a minimal set of functionality and then learn what else we must build? 
  • Would we/you not invest in this work? If not, at what order-of-magnitude estimate would we/you take that action? 
  • What actions would be different based on an estimate?

And when we do need to know some answers to the “When will it be done?” question, we prefer to use practices that give us better understanding of the range of possibilities (e.g., probabilistic forecasting) rather than reflexively take as an article of faith practices that may not yield the benefits we’ve been told about. That’s why I always encourage teams to find out for themselves whether their current estimating practices work: Most of the time — and I publish the data openly — we find that upfront estimates have very low correlation with actual delivery times, meaning that, as a predictive technique, they’re unreliable (Mattias Skarin found a similar lack of correlation in research for his book Real-World Kanban). It’s not that NoEstimates people are opposed to estimating; it’s that we are in favor of practices that empirically work.

My Favorite Pairing Partners

I’ve been reflecting on the many things for which I’m thankful, and, of course, that includes many people. I wouldn’t be where I am today — nor have enjoyed life nearly as much — without the numerous people who have paired with me, whether it’s been via software development, trainings, coaching, role-sharing, conference talks, etc. So here’s a gratitude in photo-gallery form — thank you to all of you (and many more not shown) who have taught, listened to, borne with, encouraged, supported, shared with, challenged, laughed with and otherwise made amazing memories with me. You have made me a better person, so thank you.

  • Torsten Leibrich (ThoughtWorks), Manchester, UK
  • Xavier Paz (ThoughtWorks), Santiago, Chile
  • Yasir Ali (Asynchrony), Los Angeles
  • Ryan Stephenson (Asynchrony), London
  • Luca Minudel (ThoughtWorks), London
  • Paul Ellarby (ThoughtWorks), Kansas City
  • Umar Akhtar (ThoughtWorks), Bonn, Germany
  • Ryan Boucher (ThoughtWorks), Cebu, Philippines
  • Anette Bergo (ThoughtWorks), Cebu, Philippines (with "Mitch," Krystle and Marlon)
  • Michael Fait (ThoughtWorks), Crawley, England
  • Vipul Garg, Bonna Choi, Kris Gonzalez, Saleem Siddiqui (ThoughtWorks), Seoul, South Korea
  • Malcolm Beaton (ThoughtWorks), Venlo, Netherlands
  • Cliff Morehead (ThoughtWorks), Seattle
  • Greg Jesensky (ThoughtWorks), Bangalore, India
  • John Yorke (Asynchrony), St. Louis
  • Patti Mandarino (ThoughtWorks), Chicago
  • Dimitrios Dimitrelos (Accenture), Athens
  • Karl Scotland, Paris
  • Jason Tice (Asynchrony), St. Louis
  • Lisa Smith, Denver
  • Roger Turnau (Accenture) and John Pinkerton, Kansas City
  • José Rosario, Prakriti Singh, Duda Dornelles, Esther Butcher, Michael Fait, Mridula Jayaraman, Mia Zhu, Péter Petrovics, Nishitha Ningegowda, Nelice Heck (ThoughtWorks University), Bangalore, India
  • Chris Turner (ThoughtWorks), Miami (and client friends)
  • Jim Daues, Matt Carlson, Jason Tice, Anthony Bruno, Lean Kanban St. Louis meetup organizers
  • David Kershaw (ThoughtWorks), Los Angeles
  • Hubert Shin (Samsung), Chicago-Seoul
  • Alexander Steinhart (ThoughtWorks), Berlin
  • Daniela Mora Herrera, Paola Ocampo, Susana Opazo (ThoughtWorks), Santiago, Chile
  • Xavier Paz, Mila Orrico, Andrea Escobar, Luciene Mello (ThoughtWorks), Santiago, Chile
  • Jeffrey Davidson (ThoughtWorks)

Leadership-Standup Questions

As more leaders and people in management organize themselves into teams working at coordination and strategy “flight” levels, they — like the operational teams they support — use daily standup/flow-planning meetings. While the goal is the same — to collaboratively plan flow for the day — the questions may differ, since the levels at which they’re working are different.

So what do standup meetings at coordination and strategy levels look like? Here are some questions that you may ask yourself and your fellow leaders:

  • What process, meeting or organizational policy can I make more psychologically safe today? One of the most important jobs for a leader at any level is to promote psychological safety. Hear the voice of W. Edwards Deming whispering in your ear: “Drive out fear, so that everyone may work effectively for the company.” (Sometimes, because of voice-silence asymmetry, that might actually mean not attending a meeting if you’re not sure that your presence will promote safety.)
  • Where are silos occurring?  Where are handoffs between teams and processes happening? When you’re a leader working at a coordination or strategy flight level, you have a wider-lens view on flow. With that vantage point, always be looking for organizational-refactoring opportunities that will lead to better end-to-end flow.
  • What failure do I need to learn from and share to set an example for others? If we want our colleagues at the operational level to be free to admit and learn from mistakes, leaders at other flight levels need to set the example. What public-service announcement might you write? What learning would you like to share with someone other than those at your same flight level?
  • Where has it been a while since I actually went and saw where work is being done and value is created? Leaders need to spend time “going and seeing” in a psychologically safe way so that they can actually remove friction from their colleagues’ daily work. One group that I worked with kept a lightweight gemba-walk chart in their obeya and incorporated it as a regular part of their coordination-level standup.
  • What decisions am I planning to make that others could make? If that question makes you uncomfortable, ask yourself what is driving that feeling, perhaps a fear that you need to be the one making those decisions. To use David Marquet’s two pillars for distributing control, perhaps you need to be clearer about your vision. Or perhaps you fear a lack of competency in your colleagues. What competencies do our colleagues need in order to do what we’re asking of them and what they aspire to do?
  • Whose coaching invitation might I seek today? As modern leaders try to spend increasing amounts of their time coaching and pushing decision-making downward, they need to respect the need to be invited rather than force themselves on their colleagues. However, it’s possible to seek an invitation by being safely present and available (see the next question) and showing up as something other than a boss.
  • What meetings am I planning to attend that I may not truly be needed at, and how can I create more space in my day to be available to others? If you’re justifying your existence by attending meetings, it’s likely that you’re assuming too much decision-making authority or simply not providing much value to the organization. Moreover, you’re not going to be available when someone pulls the metaphorical andon cord and needs management support. Your time is better spent developing others; as Noel Tichy writes in The Leadership Engine, “The ultimate test for a leader is not whether he or she makes smart decisions and takes decisive action, but whether he or she teaches others to be leaders and builds an organization that can sustain its success even when he or she is not around.”

Do you even know what a kanban is?

Jerry: What happened to my stereo? It’s all smashed up.
Kramer: That’s right. Now it looks like it was broken during shipping and I insured it for $400.
Jerry: But you were supposed to get me a refund.
Kramer: You can’t get a refund. Your warranty expired two years ago.
Jerry: So we’re going to make the Post Office pay for my new stereo?
Kramer: It’s just a writeoff for them.
Jerry: How is it a writeoff?
Kramer: They just write it off.
Jerry: Write it off what?
Kramer: Jerry, all these big companies, they write off everything.
Jerry: You don’t even know what a writeoff is.
Kramer: Do you?
Jerry: No. I don’t.
Kramer: But they do, and they are the ones writing it off.
Jerry: I wish I just had the last twenty seconds of my life back.

— Seinfeld, The Package

Kanban is not exactly new to knowledge workers. It has been around since at least 2005. Yet even today when I hear someone say “yeah, we’re doing kanban” because they’ve decided to “get rid of iterations” or simply depict their requirements on digital cards, such as in Kanbanize or — may God help them — Jira, I feel like Jerry: “You don’t even know what a kanban is.”

I understand the reason for the confusion, though, as it has to do with the differing uses of the concept of a “card.” Kanban is roughly translated from Japanese as “signal card.” But a signal for what?

The first manifestation of kanban was in physical manufacturing, in which the card represented not the actual component or parts being built (like a tire or a box of screws) but a signal that the system had capacity to pull in the next batch of material. This “signal of capacity” was the key to just-in-time assembly, reducing inventory, improving flow and preventing overburdening of the system. (The card actually “recycles” itself back into the system.)

This is not a kanban: In intangible-goods delivery systems, the card is not a kanban.

In knowledge work or “intangible goods” (e.g., software) delivery systems, we also want to obtain those lean benefits. The problem arises from misunderstanding the purpose of the cards we use: The venerable agile user story is expressed on a card (either physical or digital) — it’s one of the Three Cs! But the user story is a signal of demand and not capacity. Thus any card that we post on our work board is more analogous to the physical part in a manufacturing line (or, to use a different example, the visitors queueing at a museum or botanical garden). We need a virtual kanban to signal capacity and create a pull system.

We create these capacity-signaling virtual kanbans usually in one of two ways:

  • Visual indicators of space (like an empty box)
  • Explicit work-in-progress limit signs (like WIP=2)

So in knowledge work, it’s not the card but the available open spots for the card that are the kanbans! Signals of demand — work that someone wants to be done — are powerless to realize the benefits of flow. Rather, the only way we achieve a pull system is to signal capacity. Otherwise, it simply can’t and shouldn’t be called kanban in any meaningful way.
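The distinction can be made concrete in a few lines of code. In this sketch (names and numbers are illustrative), the kanbans are the column’s open slots, not its cards, and work moves only when a capacity signal exists:

```python
from dataclasses import dataclass, field

@dataclass
class Column:
    """A board column whose open slots -- not its cards -- are the kanbans."""
    name: str
    wip_limit: int
    cards: list = field(default_factory=list)

    def kanbans(self) -> int:
        # The capacity signal: open slots available to pull work into.
        return self.wip_limit - len(self.cards)

    def pull(self, card, upstream: list) -> bool:
        # Pull only when a kanban (open slot) exists; demand alone
        # (a card waiting upstream) is powerless to move work.
        if self.kanbans() > 0 and card in upstream:
            upstream.remove(card)
            self.cards.append(card)
            return True
        return False

backlog = ["story-1", "story-2", "story-3"]
doing = Column("In progress", wip_limit=2)
assert doing.pull("story-1", backlog)
assert doing.pull("story-2", backlog)
assert not doing.pull("story-3", backlog)  # no kanban available: work waits
```

Notice that `pull` is gated entirely by `kanbans()`: the demand side (the backlog) never pushes anything. That gate is the whole difference between a board that merely displays work and a pull system.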