This article aims at answering a set of core questions about software architecture, providing answers that come from modern software architecture thinking. Its inspiration came from:
Conversations with Rebecca Wirfs-Brock and Simon Brown
Architecting the eventrix.co product, running "Architectural Katas"
Countless conversations with architects and developers at international conferences
The article is structured in a Q&A format, allowing for easier reading. Feel free to skip to the questions you find more interesting and to read them in any order you like. That said, it's best to start with the first answer, since it shapes the understanding of the whole article. Happy reading!
There is no clear definition of software architecture. Take for example Wikipedia:
"Software architecture refers to the fundamental structures of a software system, the discipline of creating such structures, and the documentation of these structures"
This definition has a few issues. What are the "fundamental structures" of a software system? What are the "structures" of a software system?
We propose a better definition:
"Software architecture refers to making a set of strategic technical decisions related to a software product, documenting them and ensuring their implementation"
Strategic decisions are those decisions which:
affect more people
for a longer period of time
Let's see a few examples that clarify what strategic technical decisions mean:
Example 1: The choice of programming language for an application is a strategic decision. It will affect not only the whole development team but also the recruitment team. Once a programming language is chosen, it's very likely it will be used for years. If the need to switch to another programming language arises, it will require a lot of money and effort to make the switch.
Example 2: The choice of a framework is a strategic decision. Same reasoning applies as above.
Example 3: The choice of a microservice architecture style (or REST, or monolith etc.) is a strategic decision. Once a few thousand lines of code have been written using the chosen architecture style, the cost of changing to another style will be high because it includes: training all team members, modifying or rewriting existing code to adapt it to the new architecture, maintaining different architecture styles for a while etc.
The decisions that are not strategic are called "tactical". The obvious example is writing a private method in a class. While it might affect some people, it is very easy to change. Similarly, writing classes that are not exposed to other modules is a tactical decision. This article will use the term "software design" to designate these tactical decisions.
Why does someone need to make these decisions? Can't they just happen? What is their goal?
You can certainly make strategic decisions on the go. The problem is that doing so might be very inefficient.
Let's imagine you start an application with your favorite programming language. Soon, more people join the team, and each of them wants to write code in another programming language. It's definitely possible, but it might lead to bugs, risks (if the Haskell programmer gets sick, who will change the Haskell code?) and additional costs. It's an extreme example, but it shows that certain decisions affect more things than others. If you want to optimize the development, these decisions need to be made and implemented.
Therefore, the goals of software architecture are:
Build a product that works within the required quality constraints
Reduce development costs
The most difficult thing is making strategic decisions (that is, decisions with a long-lasting impact) with incomplete information, knowing that the context could change at any time.
Some people will say that microservices solve this problem. In practice, following a multi-language microservices approach means paying a continuous cost. It's a tradeoff; it does not avoid the core problem.
We often joke saying that "architecture is where code meets the real life". Code is neat because it gives the same result under the same conditions. Once the application is deployed, unexpected things start happening: servers fail, users find strange ways of using the application, spammers and hackers attack it, solutions that worked on paper and in controlled environments are inadequate in the production environment etc.
This is why architecting requires a specific type of mentality. It has to be analytical enough to find the holes in the system before they cause a problem, yet not be stopped by incomplete information. It has to balance short-term needs with long-term possibilities, keeping the development time for each feature short while investing just enough in flexibility for the future. And it has to present technical choices eloquently to both technical and non-technical people.
It's not really the technical part that's difficult, although there's a lot to learn there as well. The most difficult part of software architecture is the mindset.
The quality constraints of software architecture differ from product to product. This is why they need to be clearly expressed and, whenever possible, quantified. Let's see some examples:
Correctness (important in core banking, embedded medical software etc.): identify the areas of the system that need to have as close to 0 bugs as possible
Performance: the response time for requests to functionality X has to be under 1s
Availability: the system will be available 99.9% of the time (so called "three nines")
Scalability: the system will meet the required performance even under a load of 10,000 requests / second
Robustness: a failure in one of the modules will not expand to the whole system (subset of availability)
Resistance to errors: a miscalculation in a part of the system will be caught in another part of the system
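Quantified constraints become easier to discuss once they are turned into numbers a team can check. The following is a minimal sketch of that idea, not a prescribed method: the function names `downtime_budget_hours` and `meets_response_time` are illustrative, and the 365-day period is an assumption.

```python
def downtime_budget_hours(availability_percent: float,
                          period_hours: float = 365 * 24) -> float:
    """Return the maximum downtime, in hours, allowed over the period.

    The default period of one 365-day year is an assumption.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


def meets_response_time(samples_seconds: list, limit_seconds: float = 1.0) -> bool:
    """True only if there is at least one sample and every sample is under the limit."""
    return bool(samples_seconds) and all(s < limit_seconds for s in samples_seconds)
```

With these assumptions, an availability target of 99.9% leaves a budget of roughly 8.76 hours of downtime per year, which is a more tangible number to plan operations around than the percentage itself.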
Architectural decisions can have a huge impact on development costs. Here are some examples:
Use 3rd party services or libraries instead of developing certain features
Follow consistent conventions throughout the code. For example: all APIs use REST; never throw exceptions from module APIs etc.
Reduce the cost of fixing developer mistakes through a clear automated testing policy
Reduce the cost of investigating bugs through a clear logging and error handling policy
Use specific database servers, HTTP servers etc.
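One of the conventions listed above, "never throw exceptions from module APIs", can be illustrated with a small sketch. It is only one possible way to apply such a rule; the `Result` type and the `parse_amount` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Result:
    """Hypothetical return type used by every public module function."""
    ok: bool
    value: Optional[float] = None
    error: Optional[str] = None


def parse_amount(text: str) -> Result:
    """Hypothetical module API: reports failures through Result instead of raising."""
    try:
        amount = float(text)
    except ValueError:
        # The exception stays inside the module; callers only see a Result.
        return Result(ok=False, error=f"not a number: {text!r}")
    if amount < 0:
        return Result(ok=False, error="amount must not be negative")
    return Result(ok=True, value=amount)
```

The value of such a convention comes from applying it consistently: callers of any module know they have to check `ok` and never need to guess which exceptions might escape.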
The development of a product is subject to multiple types of risks. Here are a few examples:
Availability of developers: developing in Haskell might be nice, but there are very few developers. Should we choose Java instead?
Team risks: what if one third of the team gets sick (or moves to another job)? Can the others continue at the same speed?
Human errors: what if someone is very tired and introduces a serious bug? Can we catch it early?
Security: what if an attacker tries to steal data from the system? Can we prevent it? Can we find out while they're trying? What's our response to a successful attack? (aka contingency plan)
While some of these risks seem to be part of other people's job (project manager, HR etc.), it's important to realize that strategic technical decisions can affect other aspects of the business. Such effects have to be discussed with the appropriate roles from the business and decided together with them.
Security is a set of risks that impacts all developers. Preventing these risks is a matter of training, guidelines, even specific design patterns to be used in the code. It also requires ensuring that they are actually used in development, through periodic reviews.
Lack of knowledge can be dealt with in two ways: hiring a specialized consultant or investing in learning. In both cases, the result should be a working prototype that proves a certain feature can be implemented in a certain way. If the problem requires a short learning time, using a spike (a time-boxed period for creating a prototype, for example 2 days) has proven to be a good approach.
When architecting software, what are the building blocks we use? The basic blocks are:
Modules
Modules are parts of the system that can be replaced with other implementations. To allow replacement, they need to have clear protocols and contracts. Protocols define how modules communicate with each other, including the correct responses for different types of requests. You can imagine the software system like a living organism whose organs are the modules, and protocols are the circulatory system and the immune system.
In this context, a protocol is not the transport mechanism, although it can build on top of one (the most common transport protocol nowadays being HTTP, but countless others exist). Here are some examples of things that should be part of a protocol:
Always call "initialize" before calling any other method from the module
All calls to the module are synchronous / asynchronous
In case of error, the module returns error codes / throws exceptions / returns error messages
This description points to the fact that a clear protocol goes hand in hand with a clear API for the module.
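The protocol rules listed above can be made explicit in a module's API. The sketch below is a hypothetical example, assuming a module that is synchronous, must be initialized first, and reports errors through error codes; `InventoryModule`, `Status` and `reserve` are invented names.

```python
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    """Error codes that are part of the (hypothetical) module protocol."""
    OK = 0
    NOT_INITIALIZED = 1
    INVALID_REQUEST = 2


class InventoryModule:
    """Protocol: call initialize() first; calls are synchronous; errors are codes."""

    def __init__(self) -> None:
        self._stock: Optional[Dict[str, int]] = None

    def initialize(self, stock: Dict[str, int]) -> Status:
        self._stock = dict(stock)  # copy, so callers cannot mutate module state
        return Status.OK

    def reserve(self, item: str, quantity: int) -> Status:
        if self._stock is None:
            # The protocol violation is reported, not silently ignored.
            return Status.NOT_INITIALIZED
        if quantity <= 0 or self._stock.get(item, 0) < quantity:
            return Status.INVALID_REQUEST
        self._stock[item] -= quantity
        return Status.OK
```

Because the responses to both correct and incorrect usage are defined, another implementation of the same module can be checked against the same rules before it replaces this one.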
There are a few ways to deploy a module:
as a namespace inside a larger package (e.g. a Java war/jar, a .NET assembly, a C++ lib / dll etc.)
as a runtime library
as a plugin
Each of the deployment options has advantages and disadvantages. Briefly, the more packages you have to deploy, the more complex the deployment, monitoring, debugging and operations become. The simplest way to deploy is a monolith, and the most complex is microservices.
Larger applications require additional building blocks: subsystems and systems.
Fortunately, there's a way to keep your options open and introduce complexity late: start with a modular monolith and extract libraries, plugins or services only when needed.
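A minimal sketch of this approach follows, assuming a hypothetical notification module. The calling code depends only on the `Notifier` contract, so the in-process implementation can later be replaced by one that talks to an extracted service. All names here are illustrative, and the transport is an injected placeholder rather than a real HTTP client.

```python
from typing import Callable, Dict, List, Protocol, Tuple


class Notifier(Protocol):
    """Contract of the (hypothetical) notification module."""
    def send(self, recipient: str, message: str) -> bool: ...


class InProcessNotifier:
    """Implementation that lives inside the modular monolith."""

    def __init__(self) -> None:
        self.outbox: List[Tuple[str, str]] = []

    def send(self, recipient: str, message: str) -> bool:
        self.outbox.append((recipient, message))
        return True


class RemoteNotifier:
    """Same contract, delegating to a transport once the module is extracted."""

    def __init__(self, transport: Callable[[Dict[str, str]], bool]) -> None:
        self._transport = transport  # placeholder for a real network call

    def send(self, recipient: str, message: str) -> bool:
        return self._transport({"to": recipient, "body": message})


def register_user(email: str, notifier: Notifier) -> bool:
    """Calling code depends on the contract, not on how the module is deployed."""
    return notifier.send(email, "Welcome!")
```

Switching from `InProcessNotifier` to `RemoteNotifier` changes the deployment, not the callers, which is what allows the extraction to be postponed until it is actually needed.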
Making the right decisions is not enough. The development team and the leadership team need to be aware of them, understand them and implement them.
The best ways to communicate these decisions to the development team are:
code samples
diagrams
documents
giving a presentation
Some of the documents that should be part of the architecture but are very rarely created are:
Design guidelines (specific patterns to use in specific contexts, for example to prevent security issues or to speed up development)
A logging and error management policy that applies consistently to all modules
Testing policy (what is automated, what types of tests are used, code samples)
The leadership team ideally follows the "three amigos" pattern:
One business person, who cares about the financial results of the product
One product person, who cares about the user's happiness
Presenting the decisions inside this team requires a different language and a different level of abstraction. While developers are focused on why and how to implement the decisions, the leaders care about the risks, the choices and the effects upon the product as business.
The marketing team often has this list, but marketing rarely talks to architects.
The same applies when you have separate testing and operations roles, teams or departments.
This article has tried to present the main ideas in modern software architecture thinking. Each of them deserves a longer discussion, and there are things we haven't touched. We can only hope it helps to clarify these ideas and to separate the fashionable ones from the hard ones that will still be useful 10 years from now. We will let you be the judge of that. Please contact the authors with any question, clarification or remark you might have. Thank you!