Black Boxes and Boundaries


The premise of this post is that there is no such thing as a black box. David Parnas’s idea of data hiding, which he first described in 1972, has been taken to the extreme, and we need to understand that hard boundaries are not necessarily what we humans need; or, to put it another way, to use hard boundaries is to deny our humanity. We seem to have forgotten Postel’s law, which states that “an implementation should be conservative in its sending behaviour, and liberal in its receiving behaviour.”[1]


Early versions of Microsoft operating systems were treated as black boxes, with little documentation of the effect of a call. It was not until a book appeared that gave an idea of what the calls did that writing programs for the new operating system took off (there was also Undocumented Windows by Andrew Schulman, which reflected the earlier work). This is fine if the interfaces are largely independent and there are no transitive dependencies between calls. If a call to A affects B, which in turn affects C, we have a linear causal chain. If we have another sequence where D affects E, and in turn E is dependent on C, we have a transitive dependency. To understand how the interface will behave over time we need to understand these dependencies, and if they are not documented you need to probe the object in an attempt to determine the effects – they are causal in nature but there is no direct mechanism visible, and therefore we cannot treat the system as a black box. This is one of the issues with Smalltalk, where everything is an object and therefore has hard boundaries.
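The causal chains above can be sketched with a toy example. These are hypothetical calls (not any real Windows API): the three calls share hidden state, so the order and number of calls silently changes the result, and probing is the only way to discover it.

```python
# Hypothetical API calls sharing hidden state -- not any real OS interface.
_state = {"mode": 0, "buffer": []}

def call_a(mode):
    _state["mode"] = mode          # A changes hidden state...

def call_b(value):
    # ...which silently changes what B records...
    _state["buffer"].append(value * _state["mode"])

def call_c():
    # ...which in turn changes what C returns.
    return sum(_state["buffer"])

call_a(2)
call_b(10)
call_a(3)        # repeating or reordering calls alters C's eventual result
call_b(10)
print(call_c())  # 50, not the 20 a caller reasoning about B alone might expect
```

Nothing in the signatures of `call_b` or `call_c` reveals the dependency on `call_a`; from the outside, the only way to learn it is to experiment.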

The other issue that arose is that the interface is the interface, and you have to abide by the implied contract in order to consume it. If you stray from the contract then you may experience unforeseen effects. There was also the related issue that there was no formal language for describing these interfaces other than the printed documentation, so a typo was not likely to be picked up. The interface constrained what was offered, and if the functionality was basic you might find yourself providing a level of abstraction to make it simpler and more intuitive.

The SOA and ESB Era

We did not help ourselves with SOA (Service Oriented Architecture) and ESB (Enterprise Service Bus)[2] as they inherited these issues. We added the ability to formally define the interface via WSDL, but this just enforced the hard boundaries. It meant that people could validate their code to ensure that syntactically they were calling the interface in an appropriate manner, but they were still expected to honour the contract (the interface). This presents a one-size-fits-all model and assumes that we can envisage all the scenarios that are likely to consume the interface.

This meant that we needed to change the interface whenever there was a minor change in the usage scenario – and what about large changes? We developed elaborate schemes for versioning the interface so it could change over time, but in some instances the versioning could not keep pace with the rate of change required.

We experienced a similar issue with ESB as the data was tied to a standard structure (later we realised that the trick was to use a canonical structure), which meant that we then needed to map the data to each consumer, and each of these mappings had to be written. Later offerings provided the means to undertake mapping at the boundaries, but this relied on the canonical structure offering appropriate containers for the information.

We also explored the ideas of orchestration and choreography, which were intended to provide more flexibility as we could have a service that orchestrated others, or one that combined a number of services at a level of the system architecture to provide a new service. These just built more services that had to be learnt and made the system more fragile – if an underlying service failed, what was the effect, given that all these other services expected their contracts to be honoured?

Complexity Perspective

If we look at this from a complexity perspective we need flexibility, as we can never foresee every situation. That is, we should expect change and need a paradigm that supports this. Alicia Juarrero uses the phrase ‘sloppy fit’ to stress this, which is similar to Postel’s law. This allows an element of adaptability as the interface is not hard and it can support some degree of ambiguity. We need the boundaries to be ‘loose’ so that we don’t have to redefine or extend the interface every time there is a minor change in the usage scenario. The interface needs to be flexible and acknowledge that this is a sociotechnical problem and not just a technical one. This ‘sloppiness’ promotes resilience through adaptability and flexibility. The alternative is to accept that the interface will fail (not meet needs) at times, and provide a means of early detection and fast recovery.

We do not seem to be very good at the temporal aspects of systems, so causality is something that we need laid out. This means that the boundaries need to be flexible or permeable so that we have an understanding of what happens – from a complexity perspective a black box is the last thing we need. The open source community has been useful here, as it has allowed people to look ‘inside’ the box and work out what is going on. In some instances they may have a better way of achieving the same thing, and it also supports tailoring of the service to the current context, not some arbitrary scenario.

The third point is that we have assumed a fail-safe mentality in building these system architectures. From a complexity perspective failures will happen, and the system architecture should be safe-fail and not fail-safe. The latter assumes that we can foresee all the failure scenarios, whereas the former assumes that things will fail in ways we cannot foresee, and therefore the system should be built to be tolerant.


These are things that I believe micro-services have the opportunity to address, as they make the interactions explicit and allow the interface to reflect the consumer’s requirements. This allows for flexibility as it is not necessarily a one-size-fits-all model. It also allows for evolution of the interface over time (and although we need to manage the diversity, this is getting easier as the tooling improves). What we do need to do is take note of Postel’s law and try to ensure that the interface supports some degree of sloppiness.

Because the dependencies are visible, it is easier to consider the implications of failure and start to make systems safe-to-fail and not just fail-safe. This leads to a more resilient architecture that should degrade gracefully and make containment of failure easier.

The issues that were inherent in SOA are not completely addressed by micro-services, but they are a large step in the right direction, acknowledging that we need more flexibility, transparency and design for safe-fail. The interfaces are simple (or potentially simple) and therefore cause and effect can be easily determined and understood – in addition we can make the interface a bit sloppy (this can be applied to most interface design) so that we have some slack.
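One way to build that slack into an interface is the tolerant-reader style. This is a minimal sketch (the payload and its fields are made up): insist only on the fields you truly need, default the ones that are missing, and ignore the ones you don’t recognise, rather than rejecting any message that deviates from the contract.

```python
import json

def read_order(payload: str) -> dict:
    """A 'sloppy' consumer in the spirit of Postel's law (hypothetical schema)."""
    data = json.loads(payload)
    return {
        "id": data["id"],                 # the one field we insist on
        "qty": int(data.get("qty", 1)),   # missing -> sensible default; coerce type
        "note": str(data.get("note", "")),# tolerate absence entirely
        # Unknown fields (e.g. a newer producer's "priority") are simply ignored,
        # so minor interface changes don't force a new version.
    }

order = read_order('{"id": 7, "qty": "3", "priority": "high"}')
print(order)  # {'id': 7, 'qty': 3, 'note': ''}
```

The producer can add fields or loosen types without breaking this consumer, which is exactly the kind of minor usage-scenario drift a hard WSDL-style contract would turn into a versioning event.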


This is not an argument that everything should be developed as a micro-service – for example, one organisation has split their monolithic system architecture into five parts, and this provides sufficient isolation and flexibility for them. The point is, the next time someone tells you that we should just treat it as a black box, don’t take the statement at face value.


[2] I’m glossing over remoting and attempts like DCE to provide structured approaches to remote objects.

One Organisational Backlog? Or Two?

I have been working with a group on whether they should have a single organisational backlog, or whether they should separate their backlog into a product backlog and a technical improvements backlog. Like many places, they want to make sure that infrastructure and technical debt are prioritised properly.

Any practitioner will tell you that technical improvements generally get ignored in favour of product improvements. Technical debt pay down normally gets deprioritized when there is a chance to improve the product from the customer’s perspective. At one organisation I worked with, the CEO said that thirty percent of every team’s capacity should be focused on improving the quality of the product. Every quarter the CEO would review where the teams were focusing and discover that product improvements dominated, normally taking one hundred percent of the effort.

In theory a separate backlog for technical improvements makes sense. We assign a certain percentage of each team’s capacity to technical improvements, and then we work through the technical improvements in the order they appear in the backlog. However, in practice, we know that the teams will work almost exclusively against the product backlog. When you have two backlogs, you have the problem of deciding how the priority of each item in the technical improvement backlog relates to the priority of items in the product backlog.

The only solution is to have one backlog containing both Product and Technical Improvements. The investor group that prioritises the backlog considers the relative importance of each item regardless of whether it’s a product improvement (i.e. customer / business benefit) or a technical improvement. That way it is clear to teams that the technical improvement is more important than that new feature that the client wants.

So where does the confusion in Agile circles come from? It turns out that SAFE advocates the use of two organisational backlogs: a product backlog and an architectural backlog. The authors of the SAFE framework must know that two backlogs create this problem that leads to excessive technical debt, so why do they advocate this approach? My theory is quite simple. SAFE advocates the use of a particularly bad implementation of Weighted Shortest Job First (WSJF) to prioritise the product backlog. WSJF contains a bias that favours backlog items where the outcome is known. In Cynefin terms, WSJF has a bias towards “Obvious” and “Complicated” items, and against “Complex” and “Chaos” items. Technical backlog items often (though not always) fall into the “Complex” and “Chaos” domains. My theory is that the authors of SAFE saw this problem in early implementations of SAFE and so separated out the product and technical backlogs.
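The bias is easy to see if you sketch the usual WSJF arithmetic: cost of delay divided by job size, where cost of delay is the sum of a few relative scores. The numbers below are invented for illustration; the point is that an item whose outcome is unknowable tends to receive token value scores, so it always sinks to the bottom.

```python
def wsjf(business_value, time_criticality, risk_reduction, job_size):
    """Standard WSJF: cost of delay / job size, all inputs relative scores."""
    cost_of_delay = business_value + time_criticality + risk_reduction
    return cost_of_delay / job_size

# A feature with a well-understood outcome scores confidently:
feature = wsjf(business_value=8, time_criticality=5, risk_reduction=2, job_size=5)

# The outcome of a refactor is unknowable, so its scores default to the minimum:
refactor = wsjf(business_value=1, time_criticality=1, risk_reduction=1, job_size=5)

print(feature, refactor)  # 3.0 0.6 -- the refactor never wins the comparison
```

Same job size, but the uncertain item is crowded out every single time the backlog is re-ranked, which is precisely the starvation pattern described above.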

Practice has shown that one backlog is needed if technical items are going to be given the appropriate level of priority. So how do we do this in practice? The real problem is that technical improvements are often expressed in terms of cost and the “What” / “How”; they are rarely expressed in terms of the benefit they will deliver. To express the benefit, two additional organisational-level metrics are required: one for customer-perceived quality (functional bugs, performance bugs, UX bugs, availability, etc.), and one for the lead time to deliver value (e.g. weighted lead time for investments, and lead time from detection to fix for bugs). Technical improvements can be expressed in terms of these metrics (e.g. paying down this technical debt will reduce the uncertainty of the lead time for changes to this component, or this item will reduce the probability of bugs). The outcome is often unknowable, which means these items are in the “Complex” or “Chaos” domain; however, the investor group can understand the intent. Once the intent is known, it is possible to construct a narrative that explains the value of the technical item in the context of other product investments. It is then possible to construct a backlog where WSJF may assist the prioritisation discussion but does not dominate it.

Another couple of related points.

  1. Paying down technical debt is a great way to train new people on a code base and reduce key man dependency (a form of technical debt). If the organisation knows it will be making major changes to a particular component, investing in the pay down of technical debt is a great way to prepare for that future development and build additional capacity so that it can be done quicker.


  2. A more important point is that having to put technical debt into the organisational backlog is a transitional state. It is necessary because the teams allow the build-up of technical debt and see the need to pay down debt in large chunks. In mature Agile teams, the teams will gradually improve the quality of the code base as part of every piece of work that they do. For example, a few years ago I visited Nat Pryce on a project. He showed a graph that his team had created. The graph showed the number of lines of code in their application. When they started it contained one million lines of code. After six months it consisted of one hundred thousand lines of code. They had not taken time out to pay down technical debt; rather, it was a continual process alongside developing new features. On a mature team you are less likely to see backlog items to clean up technical debt, because it is a continual process that is part of normal development. In other words, a mature Agile culture will clean up as they go along, whereas an immature Agile culture will have teams that see a choice between delivering features and creating technical debt. It would appear that SAFE intends to embed this immature practice in its process by having a separate technical backlog.

In conclusion, a separate technical backlog is a failure state. A technical backlog institutionalizes immature practices and creates a separation between product and technical concerns when there should be no split. A second technical backlog hides technical concerns from the product organisation when they should be a primary concern. Instead, create Quality and Lead Time based metrics that allow engineers to communicate the importance of the work they need to do.

One backlog to rule them all, not two or three or more!

Agile Tools don’t fix problems, they reveal them.

Sometimes we are lucky enough to work with someone whose insight into the work we do has a profound effect on the way we see what we are doing. One such person is Tony Grout, who changed the way I look at Agile. “Agile doesn’t tell you how to write software,” Tony said. “Agile provides you with the minimal tools that show you where something is wrong, tools that show you where you have a problem.” As Dan North would say, Agile practices illuminate or visualise problems; they don’t necessarily solve them.
Over the past few weeks I have been working with a client to set up Capacity Planning and Metric Mapping (the name for the practice of linking a hierarchy of metrics using hypotheses). I’ve done both a few times now, so I was able to take a step back and observe what was going on. Both Capacity Planning and Metric Mapping show you problems to be solved. Neither tells you how to solve the problem.

Unlike some Agile Scaling Frameworks, Capacity Planning does not tell you how to make decisions. Capacity Planning does not care whether you use HiPPOs, or business cases, or some messed-up version of Weighted Shortest Job First. All Capacity Planning tells you is that you have to make a decision. This allows the organisation to evolve how to make decisions. It allows the organisation to take “safe to fail” steps with change, cultural, process or otherwise. All Capacity Planning does is show the problems; it does not dictate the solutions.
What are the problems that Capacity Planning reveals?

  1. It is common for the product owner to have one-to-one conversations with different stakeholders/decision makers. Each stakeholder applies pressure to the product owner to get what they want. When the product owner and team fail to deliver, they are blamed even though the problem was caused by unrealistic expectations. I call this the hub-and-spoke model. Each stakeholder (who may be a product owner themselves) individually applies pressure on the product owner, who has to resolve these organisational priorities. The Capacity Planning process puts all of the stakeholders in a discussion together and takes the product owner out of the middle. The product owner now simply states the capacity they have available and the effort necessary to perform each piece of the stakeholders’ initiatives. The stakeholders are now forced into a conversation where they have to work out the priority of the work amongst themselves. Capacity Planning does not tell you how to resolve the priorities, but it means everyone needs to think of the organisation’s goals rather than just their own.
  2. Capacity Planning reveals whether the organisation’s reward mechanisms and appraisal system create behaviours that are directly in conflict with the goals of the organisation. The organisation wants the most valuable work to flow through the constraints. The reward system drives individuals to achieve their own personal goals rather than the goals of the organisation. Once again, Capacity Planning does not tell you how to fix your reward system; it just highlights the problem.
  3. In order to run a Capacity Planning session it is necessary for each initiative owner to get a Sweet Wild-Assed Guess (SWAG) estimate from any team that will need to contribute to the initiative. Capacity Planning does not tell you how to come up with a SWAG, other than to warn against putting too much effort into a throwaway piece of work. The SWAG indicates that the product owner has acknowledged the existence of the work and feels confident enough to give an estimate. In effect, the SWAG is a record that the owner of the initiative has had a conversation with the product owner of the contributing team. Capacity Planning forces the conversation. It does not say what happens in the conversation, and it does not say how the estimate should be calculated. The SWAG simply provides the Capacity Planning facilitators with a mechanism to track the conversations. The SWAG forces organisations to have conversations before initiatives start rather than when they are mid-flight.
  4. One of the outcomes of Capacity Planning is a list of the teams that are constraining the organisation because they do not have enough capacity. It also shows the teams that have excess capacity. Capacity Planning does not tell you how to move “work to the teams”, or “teams to the work” (another Tony’ism). It simply shows you where you need extra capacity and where you already have excess capacity. There are many solutions for “moving the work to the team” or “moving the team to the work”. (The Bank of America Case Study offers one of the more interesting solutions, as it is more dynamic than others that mandate long-lived teams.)

Capacity Planning makes it clear that the ability of an organisation to deliver over the short term (three months) is based on the capacity of the constrained teams and its staff liquidity, and has little to do with the budget for the organisation as a whole.

In summary, Capacity Planning shows you where your stakeholders cannot prioritise, it shows you where initiative owners are not communicating with the teams they need to deliver value, and it shows you where capacity is in the wrong place. It does not tell you how to solve these problems; it just makes them transparent.

Metric Mapping is the process by which a Product Manager picks three or four key metrics to demonstrate the success of their product. They agree these metrics with their manager. The manager also has three or four key metrics. The manager has a portfolio of (product) metrics, and they have a hypothesis that if their product managers improve their metrics, the manager’s metrics will in turn improve. The product managers have hypotheses that if they improve some aspect of their product, their metrics will improve.

Metric Mapping requires engineering managers to have metrics too, covering the work that is done to improve the “non-functionals” of the product. The head of engineering should also have a couple of metrics: ideally one for customer-perceived quality, and the other lead-time based (actually duration, or weighted lead time).

All work should be mapped to these metrics. This allows everyone to have a portfolio view of the investments made by the organisation and to adjust investment strategies accordingly. Metric Mapping does not tell you what the metrics should be. Metric Mapping does not tell you how the metrics should be chosen. Metric Mapping ensures that there are conversations between the different levels of the organisation. These conversations ensure consistency and coherence between investments and what the organisation considers success to be. The portfolio view shows where there is too much or too little investment. It does not tell you how to fix this problem.

Metric Mapping allows everyone to see whether investments are coherent, and whether metrics between different levels of the organisation are coherent. For example:

  • A product manager says they are investing to reduce web site page load times to improve customer satisfaction. This is coherent.
  • A product manager says they are adding adverts to the web page to improve customer satisfaction. This is incoherent.
  • A manager has a metric to improve customer satisfaction and so agrees with their product manager that one of their metrics should be call centre call length times. This is incoherent.

Managers should set goals based on their own metrics. They should not set goals based on their subordinates’ metrics. Furthermore, they should never tell a subordinate how to improve a metric unless the subordinate asks for their advice. They can of course point out where investments to move metrics are not coherent, and they can help them understand how to move metrics as a coach.
Metric Mapping helps organisations to see problems and incoherent investments. It allows everyone in the organisation to identify investments and hypotheses that are not coherent. It does not tell you how to solve the problems.
And so Capacity Planning and Metric Mapping align with Tony Grout’s observation about Agile. Both show problems but don’t tell you how to solve them. The solution depends on your context. In your context, you may form a release train to plan your work for the next quarter. Or you may bring everyone in your department into a room and ask them to self-organise. Or you may use a tick list. Depending on your context, one of these solutions might be good or bad. There are, however, some solutions that are less likely to work than others… We Shall Just Forget (WSJF) those.

Gaming of Weighted Lead Time

This post is in response to Kent McDonald’s excellent question on the Weighted Lead Time post. The question deserves a longer response. Kent asked… “What are some of the behavior changes you have seen from teams or organizations when they started paying attention to this metric?”

I spent over two years at Skype working on metrics at the organisational level, especially operational metrics. I learnt two key lessons:

  1. All metrics will be gamed. In fact Robert Benefield, an expert in game theory, gave the following advice. “All metrics will be gamed, when you design a metric, start with the behaviour you want and then create the metric so that when it is gamed, you get the behaviour that you want”. A variant of lead time is a great example. The easiest way to game lead time variants is to create smaller units of work which is exactly the behaviour we want.
  2. The other thing that was etched into my memory is that any individual metric can be gamed. As a result, it is necessary to create a system of metrics that provide constraints to prevent gaming.

Coming back to Kent’s question. Weighted lead time can be applied at three significant levels:

  • Team: This should not be used as a metric. Each team will attempt to locally optimise which will lead to higher weighted lead times for initiatives.
  • Initiative: This is the level at which the metric should be set for each team. Each team’s metric should be the average of the weighted lead times of all the initiatives they are part of.
  • Organisation: It is harder for teams to impact this directly but it can be gamed.

There are a number of ways that weighted lead time can be gamed; the most obvious are to deliver work with no value, or to deliver a low-quality solution. The product metrics should ensure the delivery of value. It is important that the organisation has an effective quality metric from the customer’s perspective (a huge subject). Given that value and quality cannot be gamed, how else could a team game the weighted lead time metric?

  1. They could avoid being part of initiatives that are cross-team and likely to take longer to release value. This is actually the kind of behaviour we want. We want teams to find the simplest solution with the fewest dependencies: “Everything should be as simple as possible but not simpler.” It needs to be carefully monitored during the Capacity Planning session, i.e. monitoring of the “but not simpler” rule. Once again, product metrics are key to ensuring the initiatives are effective.
  2. Teams could work on initiatives in the wrong order. They might prioritise initiatives that only they are working on to improve their WLT. As Capacity Planning produces an ordered backlog, we were able to create a “wrong order-o-meter” to see if teams were working on things in the wrong order. We weighted the effort they engaged in based on the initiative’s order in their backlog. A high score did not mean the team had the wrong behaviour; it simply indicated that someone should have a look and understand why the team was working on initiatives in the wrong order.
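A “wrong order-o-meter” of this kind could be sketched as follows. The exact formula we used is not given above, so this particular weighting scheme is an assumption: each unit of effort is weighted by how far down the ordered backlog it was spent, then normalised by total effort.

```python
def wrong_order_score(efforts_by_rank):
    """efforts_by_rank: list of (backlog_rank, effort), where rank 1 is top priority.

    Hypothetical scoring: effort on the top item costs nothing, effort further
    down the backlog costs more; the result is an average 'distance from the top'.
    A high score is a prompt for a conversation, not a verdict.
    """
    total = sum(effort for _, effort in efforts_by_rank)
    if total == 0:
        return 0.0
    return sum((rank - 1) * effort for rank, effort in efforts_by_rank) / total

# Team spends most of its effort on its 4th-ranked initiative:
print(wrong_order_score([(1, 2), (4, 8)]))  # 2.4 -- worth a look
# Team works roughly in backlog order:
print(wrong_order_score([(1, 8), (2, 2)]))  # 0.2
```

The score deliberately says nothing about *why* the order is wrong; as with the rest of Capacity Planning, it reveals the problem and leaves the diagnosis to people.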

So Kent, the answer is that the metric can easily be gamed. You need an eco-system of metrics and processes, and informed people, to make this stuff work. Sad to say, it’s not a silver bullet, just another useful tool for the toolkit.


Weighted Lead Time

Duration is a technical term from financial bond mathematics. It’s not a great name. Weighted Lead Time (thank you to Dan North) is a more expressive name. Weighted Lead Time does not replace lead time and cycle time as measures; it’s a complement. However, I will argue why I think it’s a useful metric at all levels of the organisation.

  1. A CEO-level metric that any financially astute investor (Private Equity, Venture Capital) will understand.
  2. Super easy to calculate. The second half of this blog will show you how to do it in Excel.
  3. A metric that makes sense when you look at it from different perspectives. It can be easily broken down and analysed from any perspective to provide insight.

A CEO Level Metric
Weighted Lead Time (WLT) takes all the cash amounts invested over a period of time and replaces them with a single cash flow at a single point of time. An investor understands that if nothing else is known, an investment that takes a long period to generate a return is more risky than an investment that takes a short period to generate a return. An investor understands that an organisation where the WLT is increasing is becoming more risky (and possibly more risk averse) whereas an organisation where WLT is falling is becoming less risky.

Obviously it is easy to game this metric by failing to improve product metrics which is why it is only one of the metrics that an investor would look at.

The Super Easy Metric
WLT is super easy to calculate. Consider the following story-level extract from a tracking system for three teams: “red”, “blue” and “green”. The three teams have been involved in three initiatives: “XFactor”, “NewMute” and “Xcaliber”.

Time Tracking Extract

Three of the fields are “calculated”: Investment, Time and Time*Investment.

Investment (I) is the cost of delivering the story. There are many ways to calculate this, such as the percentage of the sprint (calculated using story points or the count of stories). The worst way is to ask all team members to enter their time into a time tracking system, as this will lead to all sorts of unnecessary pain and gaming.
The units do not matter: man-days, team-weeks, dollars or euros all work.

Time (T) is the number of days from starting the story to the date the investment delivers a return. Even though software may be released into “production”, that does not count; it is when the value is released that counts.
Instead of the start of the story, it is acceptable to use the start of the sprint when the commitment of investment occurs.

Time*Investment (T*I) is Time multiplied by the Investment.

Simply highlight the data and create a pivot table in Excel. Three pivot tables are shown below:

WLT Pivot Tables

In each case the SUM of the Time*Investment and Investment are shown in the body of the pivot table. Weighted Lead Time (WLT) is calculated as the SUM(Time*Investment) / SUM(Investment) in units of days (The investment units cancel out which is why we do not care what they are).
The first pivot table shows the WLT of each initiative, and the Grand Total (119 days) shows the WLT for the whole organisation (which also appears on the other two).

The pivot below shows the WLT for each team.

The pivot table on the right shows the breakdown of each initiative by team.

The key metric is the WLT per initiative. The WLT per team is a useful analysis tool, but setting it as a target can lead to behaviour that improves the WLT of the team at the expense of the initiatives. The team-level metric should be the average WLT of all the initiatives the team is involved in.
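The same pivot arithmetic is easy to reproduce outside Excel. Here is a sketch in plain Python; the rows are made-up illustrative data (the figures from the screenshot above are not reproduced here), with each row holding team, initiative, Investment and Time.

```python
from collections import defaultdict

# (team, initiative, investment, time_in_days) -- illustrative data only
rows = [
    ("red",   "XFactor",  5, 100),
    ("blue",  "XFactor",  3, 120),
    ("green", "NewMute",  4,  90),
    ("red",   "Xcaliber", 2, 150),
]

def wlt(subset):
    """Weighted Lead Time = SUM(Time*Investment) / SUM(Investment), in days."""
    return sum(i * t for _, _, i, t in subset) / sum(i for _, _, i, _ in subset)

# Group rows by initiative, mirroring the first pivot table:
by_initiative = defaultdict(list)
for row in rows:
    by_initiative[row[1]].append(row)

for name, subset in by_initiative.items():
    print(name, round(wlt(subset), 1))
print("organisation", round(wlt(rows), 1))
```

Grouping by `row[0]` instead gives the per-team breakdown, and the investment units cancel in the division, which is why they do not matter.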

WLT from Different Perspectives
As can be seen from the example above it is easy to see WLT from different perspectives. It can be seen and easily understood from the initiative, team or whole organisation perspective.
Compare that to Lead Time or Cycle Time. The lead time and cycle time for initiatives change as the configuration of the investments change. Lead time and Cycle Time cannot be aggregated to an organisational level value or broken down by team in a simple way.
Lead time and Cycle Time are great metrics but they really assume a manufacturing paradigm where the nature of the work is static.

WLT is not a replacement for Lead Time and Cycle Time; it is a complement. WLT is super easy to calculate, and a nifty metric that can be used in many contexts and communicated to investors.

Strip Maps and Risk

These days if we want to get from one place to another we have a number of choices available to us. When driving in the car, I tend to use SatNav. I use SatNav even for familiar routes because it is fairly reliable at rerouting me when poor traffic conditions block the way ahead. Just this weekend I went on my weekly pilgrimage to see my children, and SatNav took me off the motorway to partially avoid a traffic hold-up. Sometimes my SatNav throws a wobbly. There are a few points in my regular journey where I have come to learn (at my previous cost) that I should ignore the SatNav. These are normally where the road has changed since my SatNav was installed (and the map has not been updated). Without the traffic updates, I probably wouldn’t bother with SatNav at all most of the time. I prefer to check out my route in advance using Google Maps, and use the map on my phone when I get close to the destination.

Before electronic maps we had the familiar A-Z Atlas in the car. Everyone I knew had easy access to an A-Z of London. You had to update it every couple of years but in reality, we would get the tube to the closest point and follow directions from there.

Even in the world of SatNav and perfect maps, we have to exercise caution. Even in the world of perfect maps, the maps don’t tell us everything. Maps don’t tell us which routes are safe or dangerous, whatever that danger might be. Trusting too much to a map is dangerous. In cities like London, with its twisty roads formed by generations of cow and sheep herders, it is easy to get lost and end up pointing in completely the wrong direction. Even with a map, people in towns may find a compass useful sometimes.

Before comprehensive maps, people used strip maps. Particularly in Medieval times, these maps were used to help people travel along popular routes such as pilgrimage trails. This is an example of a strip map.

Medieval Strip Map

If you were going from London to Jerusalem, the strip map was pretty helpful. But what if you were starting out in Paris, or Stockholm, or Beijing? Then the strip map was of limited use. You might have another strip map that converged with a point on the strip map to Jerusalem, but you had no guarantee that it was the safest or easiest journey from where you were starting out.

Last week I found the strip map metaphor particularly useful for explaining Scaling Agile. Rather than tell people we have a SatNav that will get them to Jerusalem, I explained that we were on a journey that few if any had ever been on before. There were strip maps (experienced individuals and experience reports) that could help us with parts of our journey, but we had to be aware that we might come across obstacles that others have not yet described. In Cynefin terms, it moved people to a heightened state of awareness so that they would challenge the route rather than simply follow it without thinking. For team-level Agile we have SatNav supported in five-dimensional cyberspace. For organisational-level Agile, we have a few strip maps written on vellum.

For Scaled Agile, we are still mapping out the territory. In fact, we are still determining the dimensions of the territory. Culture is one dimension, as is the Cynefin domain. Scale changes everything, so that’s probably going to end up as a dimension as well. As will the business domain. As the saying goes, if you want to get to Jerusalem, I wouldn’t start here.

There are some successful strip maps. If you want to get from Croydon, New Jersey to Jerusalem you can follow a Safe route. In Cynefin terms, if your company is in the over-constrained “Obvious” domain, then Safe will help you migrate from DSDM or RUP to the true “there is only one way” “Obvious” SAFE route to the Jerusalem Hotel in Nevada, just a few miles south of the place that features at the end of “Thelma and Louise”. The danger is that if you happen to live in Meinehatten, Germany, following a SAFE route might lead to more danger than finding a local explorer with experience of a similar journey to Masada. Make sure the explorer has an extensive network of fellow explorers to reduce the risk.

We are also aware that following a strip map blindly can lead to real pain and heartache. There are whispered rumours of a Lemming Organisation following a religious visionary over a cliff (I prefer to think of that visionary as the blind man in “The Life of Brian” who was blind but now can seeeeee.)

Strip maps are a great metaphor for the state of Scaling Agile. Strip maps encourage a “Safe Fail” OVER “Fail Safe” mentality. Before we move forward, let’s make sure we can get back to safety. As a metaphor it helps organisations understand that they will need lots of strip maps. They will also need compasses, and sextants, and models and beliefs (theory), and they will also need climbing ropes for those experiments that are not “Safe to Fail” enough.

My gratitude to Martin Burns for introducing me to the strip map metaphor after I wrote about “Scaling Agile being off the map”.

Understanding Uncertainty’s Impact on Learning

In Commitment, Olav and I wrote that the “rational” order of preference is:

  1. Right
  2. Uncertain
  3. Wrong

We also wrote that the observed preference for most people is:

  1. Right
  2. Wrong
  3. Uncertain

People would prefer to be wrong than uncertain. We know this because they make commitments earlier than they should, and destroy options unnecessarily. A preference for uncertainty would result in more learning. A preference for learning would be an advantage within a “Community of Needs”. An aversion to uncertainty is an advantage in a “Community of Solutions”, as you can be more compelling and forceful in your argument about a solution. You can be more certain.

We learn something new when we perceive it has value for us.


It is stressful to be aware of valuable things that we do not know. If we crave certainty, we ignore the value of these things. We can express value using this formula:

V = F ( Io … In )

We can consider this value in Kolb’s “Circle of Learning” (a.k.a. Feature Injection’s “Break the Model”).


“Spot the value” means we spot an observation, Ix, such that the value Vx, calculated using our value model

V = F ( Io … In, Ix )

is different to the value observed. In effect, we have an arbitrage situation: the observed value and the calculated value differ. We should act accordingly, learning and applying the new idea.

This seems fairly straightforward. It becomes tricky because our value function F ( Io … In ) acts as a filter on our perception of reality. The more certain we are, the less likely we are to notice differences that reveal a problem with our value function. The more comfortable we are with uncertainty, the more likely we are to identify those differences.
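The “spot the value” loop above can be sketched in a few lines of code. This is a toy illustration, not anything from the book: the naive summing `value_model`, the function names, and the `threshold` parameter are all my assumptions, chosen only to show how a value function both predicts and filters.

```python
def value_model(observations):
    """Stand-in for V = F(Io ... In): a deliberately naive model
    that just sums the observed signals. Any real value model
    would be richer, but the arbitrage logic is the same."""
    return sum(observations)

def spot_the_value(known, ix, observed_value, threshold=0.5):
    """Return True when the value the model calculates after adding
    the new observation Ix differs from the value actually observed:
    an arbitrage, and a signal that there is something to learn."""
    calculated = value_model(known + [ix])
    return abs(calculated - observed_value) > threshold

known = [1.0, 2.0]
# The world shows value 10.0 when ix = 3.0 arrives, but the model
# predicts 6.0: a big discrepancy, so the model is broken and we learn.
print(spot_the_value(known, 3.0, 10.0))  # True
# Here the observed value 6.2 is close to the prediction, so the
# difference is filtered out and we never notice it.
print(spot_the_value(known, 3.0, 6.2))   # False
```

The `threshold` plays the role of certainty in the paragraph above: the more certain we are (the larger the threshold), the more discrepancies the value function filters out before we ever perceive them.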

Understanding the impact of uncertainty on learning is very important. If we are certain, we will not notice the problems. It’s not that we ignore them; we simply will not perceive them as important. In fact, it’s when we notice these observations that do not fit our model that we are in danger of shifting from “Obvious”, where we are certain, to “Chaos”, where we see lots of observations that do not fit our model and the model effectively collapses, as it is no longer of use.

Knowledge Options

Learning new things takes a lot of effort. Twenty hours of concentrated practice has been suggested as a minimum to become barely sufficient in a subject. More effort is required to become proficient, and significant effort is required to master a subject. A “Community of Solutions” values masters of a subject. They will often be proficient to a level where they can make a tool work even though it is not a natural fit for the context. People in a “Community of Needs” need to understand the value of an idea and the context in which it is best applied. They also need knowledge options. A knowledge option involves knowing the value of an idea and having an option to acquire proficiency in the skill within a certain timescale. You can learn more about knowledge options in “The Lazy Learner” talk. The most powerful option is to be a member of many such communities, with access to practitioners who can guide and help you learn quickly.

Organisations hiring Agile Coaches should always hire people from the “Community of Needs”. Trainers can come from either community. If you hire a coach from the “Community of Solutions”, they will attempt to solve your problem with their favourite solution, regardless of whether it is the best solution. They will attempt to demonstrate it will work in all contexts, even if there is a better tool for some of those contexts.

And herein lies the tragedy of the situation. Organisations attempting an Agile Transformation are seeking certainty. Their value model does not allow them to value a coach using option thinking, i.e. how many knowledge options they have to acquire skills to solve a problem. Instead they value coaches based on their expertise and their popularity. Until organisations learn to value coaches properly, they will continue to suffer failures in their transformations.

