Development of large-scale distributed systems is complex. Many of our software systems essentially follow this distributed system pattern, and are therefore complex undertakings.
Complexity takes time. It takes time to evolve in nature, and it takes time to introduce in software. Complexity also introduces risk. This risk can generally be mitigated by adding yet more complexity to reduce the likelihood of system failures – but then even more time is required.
Now, from a business perspective, time is money. But equally importantly, added time reduces the rate of feedback from experiments. Increases in time therefore give one fewer opportunities to learn from mistakes and choose new courses of action.
In short, we should be working hard to reduce the complexity of the systems we implement.
Alas, it would seem that:
The road to complexity is paved with good intentions.
Often in software development we see that implementations pull in much more complexity than is immediately required:
- We may have software projects that depend on surprisingly large numbers of third-party libraries.
- We may have software where the communication protocols used have overheads to “future proof” the interactions.
- We may have software that inherits historical decoupling rather than being refactored to keep its structures simple.
So, the question is: why do we see this behaviour, wherein there is more complexity than is immediately required?
My sense is that in many cases this occurs because the developer is genuinely, and appropriately, attempting to avoid effort, either immediate or future, and thereby improve the utility of the software being developed. For example:
- pulling in dependencies is a valid way of achieving code reuse and building on other people's labours.
- However, it can also unduly increase the number of different tools that future developers will need to acquaint themselves with before they can work on the code. It can also unduly increase the size of deployment binaries, bringing in new technical problems that need to be addressed. Or, it can lead to odd incompatibilities between transitively shared dependencies.
- “future proofing” protocols can be a huge gain in productivity when the next iteration of the system can sidestep difficult problems that are now catered for by the protocol.
- However, such protocols are often more difficult to work with and therefore are more onerous for other developers to use. Additionally, the problems solved by the protocol’s additional machinery may not actually be problems that need to be solved.
- avoiding refactoring of existing structures can lead to large time savings since the code base does not need to be altered.
- However, leaving these structures in place may actually increase the overhead of interacting with said systems. Conversely, while merging or restructuring subsystems would have taken effort, it might have paved the way for simpler implementations in surrounding code and reduced the effort of future deployments of the systems.
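As a concrete illustration of the dependency trade-off above, consider retry logic: rather than pulling in a full third-party resilience library, a small hand-rolled helper may be all that is immediately required. This is only a sketch – the function and parameter names are hypothetical, not from any particular library:

```python
import time

def retry(fn, attempts=3, delay=0.1):
    """Call fn(), retrying on failure with a fixed delay.

    A deliberately small alternative to a full resilience
    dependency when only simple retries are needed right now.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the error
            time.sleep(delay)

# Usage: wrap a flaky call without introducing a new dependency.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(retry(flaky))  # succeeds on the third attempt, prints "ok"
```

The trade-off cuts both ways, of course: if the system later needs backoff, jitter, or circuit breaking, the dependency may earn its keep – which is exactly the case-by-case judgement discussed below.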
Essentially, we can see that the additional complexity has possibly crept in because the developer was attempting to reduce the complexity along one implementation path, while not fully considering the other paths that would be affected by this decision, either now or in the future.
The difficulty is that there is no silver bullet. The impact of these choices needs to be considered on a case-by-case basis.
However, there is a guiding principle that can help to reduce the occurrences of these types of choices – that of working from the ‘general’ to the ‘specific.’
General to the specific
If we engage in software development as an iterative process rather than a big-bang process, then we can afford to delay certain details until they are needed. In some sense, “just in time” development.
The issue is to know when you are leaving off a detail that actually should be handled, and when it is safe to ignore it.
We can take a leaf out of the book of figure drawing and other artistic undertakings. The guidance provided there is to work from the “general” to the “specific.” That is, always aim to render a complete picture in the sense that it covers the complete subject, but work on it in layers. First sketch a very rough version of the form of the subject. Then work to increase contrasts across the picture, and finally focus on the details of particular features.
In software development the same process can be applied. Always aim to develop a complete version of the system in the sense that it can be seen to cover the full range of the problem domain. However, the early implementations may avoid dealing with particular edge cases or use cases that would be required in the future.
For example, a proof of concept of a business process may reasonably ignore:
- horizontal scaling - since the proof of concept is not expected to be used by a large number of users
- resilience to failures - since the proof of concept can reasonably run with a much lower SLA than real customers might expect
- edge-case handling - since the proof of concept reasonably demonstrates only the common use cases
Now, this is not to say that we don’t need all of these details. Rather, our design should explicitly leave gaps for these aspects. Part of the design decision then becomes working out what can be left out in a way that allows it to be introduced “just in time” – while still being careful not to build something where the omitted detail, though identified early, is actually very difficult to introduce later.
As such, we begin to see that the software system is a dynamic and evolving system. It is not a single entity that is finished in one step, but rather a growing system that needs to adapt smoothly to its context as more complexity is added.