Monolithic undertakings tend to fail. This may seem like a harsh claim, but there are many ways for a system to fail:
- a system could be built that simply doesn’t meet the requirements of the sponsor.
- a system could logically solve a problem, but exhibit behaviour that is outside of operational tolerance once deployed in a realistic context.
- a system could simply remain in development indefinitely and never finish.
When the “surface area” of the system is large, it becomes more difficult to avoid all of these failure modes simultaneously. The odds of success diminish multiplicatively as more dependencies are added to the chain.
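A rough sketch of the multiplicative effect, using hypothetical numbers: if each of n dependencies independently succeeds with probability p, the chance that the whole chain succeeds is p to the power n, which collapses quickly as n grows.

```python
def chain_success(p: float, n: int) -> float:
    """Probability that all n independent dependencies succeed,
    assuming each succeeds with probability p."""
    return p ** n

# Twenty dependencies, each 95% likely to work out, leaves roughly
# a one-in-three chance that the whole undertaking succeeds:
print(round(chain_success(0.95, 20), 2))  # 0.36
```

The independence assumption is generous; in practice failures correlate, but the qualitative point stands.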
Some would argue that this is why we should never tackle any large undertakings at all. We should only ever engage in iterative refinement of systems, with the expectation that, through some evolutionary process, exactly the complexity we want will emerge.
In the software engineering sector this has essentially been the underlying sentiment behind a number of the agile development approaches.
The issue is that we do need to tackle large, complex problems, and we cannot necessarily expect these problems to resolve themselves organically. Rather, we need to apply direct effort to understanding the issues and designing solutions. This might be for:
- building distributed scalable software systems that are fault tolerant
- developing a viable mission to colonise Mars
- re-engineering a business’s internal processes and structures
- developing a nationwide system for dynamic cargo train scheduling
Thus, the meta-problem is: how do we tackle large-scale systems without them becoming monolithic projects that are predisposed to failure?
We can, in fact, take a page out of general systems design and architecture principles — modularisation. But, beyond flippant comments, it is a bit more subtle than that. And this is the subject of the remaining discussion.
Model the Whole
First, to immediately dispel one possible reading of the direction taken here: the intention is not simply to take a reductionist approach whereby the system is treated as a flat space that can be broken up into a number of loosely coupled parts.
Rather, we work to build abstractions of the structure of the system. We do this by first identifying the boundaries of the system that we intend to understand and build — this could alternatively be referred to as the scope of the problem.
The first module that is identified and constructed is then a model of the complete system. This model will abstract away many internal details, but it is intended to provide full coverage of the problem space. This means that we should be able to demonstrate, at that level of abstraction of the model, that a system exhibiting the characteristics of the model does in fact solve the problem at hand.
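In software, one way this first module might look, as a sketch only, with all names hypothetical: the whole-system model is expressed as a set of abstract components whose composition covers the full problem space, even though every internal detail is abstracted away. Using the cargo scheduling example from earlier:

```python
from abc import ABC, abstractmethod

# Hypothetical whole-system model for a cargo scheduling problem.
# Each component hides its internals; together, wired as below,
# they constitute a claim that the problem is covered.

class DemandForecaster(ABC):
    @abstractmethod
    def forecast(self, horizon_days: int) -> dict: ...

class Scheduler(ABC):
    @abstractmethod
    def schedule(self, demand: dict) -> list: ...

class Dispatcher(ABC):
    @abstractmethod
    def dispatch(self, plan: list) -> None: ...

class CargoSystemModel:
    """The top-level model: its structure *is* the argument that
    these three subsystems, composed this way, solve the problem."""

    def __init__(self, forecaster: DemandForecaster,
                 scheduler: Scheduler, dispatcher: Dispatcher):
        self.forecaster = forecaster
        self.scheduler = scheduler
        self.dispatcher = dispatcher

    def run_cycle(self, horizon_days: int) -> None:
        demand = self.forecaster.forecast(horizon_days)
        plan = self.scheduler.schedule(demand)
        self.dispatcher.dispatch(plan)
```

The abstract methods mark exactly where sub-models can later be attached, while `run_cycle` lets us reason about the system's behaviour at this level of abstraction.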
Now, having a model to work with, we see that, in addition to helping us understand the problem space and predict the behaviour of existing systems, the model also helps to act as a scaffold to drive the development of a new system.
That is, forgetting about the full structure of the final system, we can focus simply on building a reified version of the model.
This is the key to avoiding the monolithic project process. It makes it possible to:
- reduce the complexity of the problem to that of the high-level abstract model.
- bound the work: the model can be finished even though the complete system has yet to be built.
- use the model as a scaffold to which more detail can be attached.
It is this last point that really provides the driver for successfully building complex systems, since this is the point of recursion in the process. By having a model that covers the full problem space, but intentionally defines a structure to the problem in which sub-models can be attached, we have made the approach “scale free” (allowing for slight abuse of the term).
We can now, for the moment, treat the model as a completed effort and move onto more focused problems that involve implementing subsystems identified in the structure of the model. Furthermore, we can use the full context and imposed constraints to guide the design decisions when implementing these subsystems in the next level of detail. That is, we do not necessarily have to address all the complexity of a final solution, but rather we can prioritise elements that are crucial to satisfying the dynamics and mechanisms exposed in the upper model.
It is important to be aware of a few things that the modelling approach provides us:
- the upper-level model can act as a measure of completeness of the full undertaking, as subsystems are implemented that provide the functionality required by their structural counterparts in the upper model.
- there is not necessarily only one solution and design for the lower-level subsystems.
- the upper-level model may, in fact, prove incapable of solving the original problem given lower-level constraints.
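The first point above can be made concrete with a small sketch (slot names hypothetical): the structural slots of the upper model double as a checklist, and progress is simply the fraction of slots backed by a working subsystem.

```python
# Hypothetical structural slots defined by the upper-level model.
MODEL_SLOTS = ["forecasting", "scheduling", "dispatch", "monitoring"]

def completeness(implemented: set) -> float:
    """Fraction of the model's structural slots that have a
    working subsystem implementation behind them."""
    done = [slot for slot in MODEL_SLOTS if slot in implemented]
    return len(done) / len(MODEL_SLOTS)

print(completeness({"forecasting", "scheduling"}))  # 0.5
```

A real project would test behavioural conformance rather than mere presence, but the model supplies the denominator either way.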
It is here that we see the introduction of the need for iterative refinement in the process. That is, while the model essentially drives “top down causation” that is often missing in standard agile processes, there is still a need for “bottom up causation.”
So, we might see:
- subsystems being enhanced over time as the progress of the complete system imposes new constraints that negate the viability of the previous implementations of those subsystems.
- subsystems being completely replaced as the complete system progresses and highlights the need for completely new solutions.
- amendments and adaptations to the upper-level model as constraints from the lower-level system implementations percolate up through the system modelling processes.
- amendments to, or replacement of, subsystems to maintain alignment with changes made to the upper-level model.
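The replacement dynamics in the list above are cheap precisely because the upper model depends only on each slot's interface. A minimal sketch, with hypothetical class names: a subsystem is swapped for one that satisfies newly discovered constraints, and the calling code is untouched.

```python
class NaiveScheduler:
    """First, simplest implementation that satisfied the model."""
    def schedule(self, demand: list) -> list:
        return sorted(demand)

class CapacityAwareScheduler:
    """Replacement, once a (hypothetical) capacity constraint
    negated the viability of the naive version."""
    def schedule(self, demand: list) -> list:
        return sorted(demand)[:10]  # respect a capacity limit of 10

def run(scheduler, demand: list) -> list:
    # The upper model only relies on the schedule() interface,
    # so either implementation can occupy the slot.
    return scheduler.schedule(demand)

print(run(NaiveScheduler(), ["c", "a", "b"]))          # ['a', 'b', 'c']
print(run(CapacityAwareScheduler(), ["c", "a", "b"]))  # ['a', 'b', 'c']
```

The two implementations diverge only when the constraint binds; below capacity, the system behaves identically, which is exactly what makes incremental replacement safe.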
In the end, as we follow this model-driven approach to developing complex systems, we see that while we can’t necessarily avoid the full complexity of the problem, we can certainly plot more viable and resilient paths as we search the solution space, and explicitly drive the emergence of viable implementations.