Software Architecture Today
Jan 2024 - Alex Alejandre

Humor me a digression to mold the cognitive landscape/conceptual space (seniors may skip this; I just describe engineering and abstraction).

Engineering is the “application of mathematics and [science] to the needs of humanity.” To this end, a design process presents iterative steps to aid decision making, where engineering solutions optimize multivariate functions for considerations like cost, ethics, environmental impact etc.

Software engineering is the art of taming complexity for the sake of maintainability (well, systems engineering offers the real guidelines for tackling a large interconnected system’s complexity). This emerges from software’s very nature: software is automation, performing discrete, concrete actions humans would once have performed themselves. When an AI can replace us, our profession’ll have reached apotheosis.

Developing software is R&D. Well, it was. We research the problem space and develop a solution. To engineer a solution, we efficiently and effectively wield the resources at our disposal. Now, this isn’t a jab at “developers” but a reminder that there’s more: the French ‘suivi’ combines ‘following up on’ and ‘monitoring’ development, operation and maintenance. Accounting for the whole lifecycle makes it engineering.

Maintainability’s an eternal struggle, for the problem space constantly evolves, users demand ever more features etc. and the solution must keep step with its dance partner. But the solution space changes too. Remember, our software automates tasks, but many tasks have not been automated. Our coworkers, different departments, vendors etc. do them. The entire business context which led to the solution may change.

We know Turing complete languages are equivalent, yet it’s more impactful to use one with the right primitives and features. Architecture’s the same way: proper organization tames the problem space, letting you abstract away accidental complexity, arriving at a simpler, clearer problem, thus a simpler, clearer and more maintainable solution.

All solutions are alike; every problem is problematic in its own way. We can normalize problems into happy solutions with polynomial time reductions and see they’re equivalent, merely different representations like ‘2’, ‘2+0’, ‘2.0’ etc. Our problems, organizational (office politics) and organizational (code layout), often behave the same (…microservices!).

Culture influences the variables engineers optimize for: implementation speed (up front cost), profitability, maintainability and extendability (long term cost), stability, SLAs, coolness for resume driven development, scalability, usability (UI/UX, documentation), compatibility, reliability, (legal/industrial) compliance etc.

With software we build mosaics out of small stones: statements like x = y - 1, if x < y etc. Combining them, while reducing individual components' interconnections and complexity, ever bigger pictures form as trees turn into forests. Once we have gotten far enough, we can abstract away previous complexity (sometimes lossily), offering a set of larger bricks/structures. To return full circle to Wiktionary, software architecture is “The set of structures needed to reason about a given software system.” These pieces and primitives should communicate intent, because maintainable code is communication (ordering information for the audience of future developers).

Idioms, patterns and traditions should help new contributors get up to speed without requiring inside, tribal knowledge (the mythical ‘self documenting code’). Microservices and event driven architectures fail here, requiring you to find the producer/consumer across the whole codebase.

Building architects have concepts like kitchen which imply a sink, counters, space for a fridge and stove etc. What do we have?

Event Driven Architecture

Components communicate by sending and receiving events (state changes). There are three parts, whose names are self explanatory (a minimal sketch follows the list):

  • producers
  • consumers
  • event bus/broker
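
A minimal sketch of the three parts in Go, using a buffered channel as a toy in-process broker; the event name and payload are invented, and a real system would reach for Kafka, NATS, RabbitMQ etc. instead:

    package main

    import "fmt"

    // Event is a state change carried between components.
    type Event struct {
        Name    string
        Payload string
    }

    func main() {
        // event bus/broker: a toy in-process stand-in for Kafka, NATS etc.
        bus := make(chan Event, 8)

        // producer: emits events describing what happened.
        bus <- Event{Name: "user.registered", Payload: "alice"}
        bus <- Event{Name: "user.registered", Payload: "bob"}
        close(bus)

        // consumer: reacts to events without knowing who produced them.
        for e := range bus {
            fmt.Printf("handling %s for %s\n", e.Name, e.Payload)
        }
    }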

Some interesting things:

The Saga pattern’s very similar (similar things were common in finance and aerospace in the 90s). An eventual consistency protocol: with a saga you don’t need to coordinate transactions across DBs, but create an ordered set of transactions which can be backed out with compensating transactions if business logic rules are violated, passing events between entities and converging multiple state machines into a consistent state.
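
A rough sketch of the compensating transaction idea; the step names are invented, and each step pairs an action with the compensation that undoes it, so a failure unwinds the already committed steps in reverse:

    package main

    import (
        "errors"
        "fmt"
    )

    // step pairs a local transaction with the compensating transaction
    // that backs it out if a later step violates a business rule.
    type step struct {
        name       string
        action     func() error
        compensate func()
    }

    // runSaga executes steps in order; on failure it rolls back the
    // completed steps in reverse, converging on a consistent state.
    func runSaga(steps []step) error {
        var done []step
        for _, s := range steps {
            if err := s.action(); err != nil {
                for i := len(done) - 1; i >= 0; i-- {
                    done[i].compensate()
                }
                return fmt.Errorf("saga aborted at %s: %w", s.name, err)
            }
            done = append(done, s)
        }
        return nil
    }

    func main() {
        err := runSaga([]step{
            {"reserve stock", func() error { fmt.Println("stock reserved"); return nil },
                func() { fmt.Println("stock released") }},
            {"charge card", func() error { return errors.New("card declined") },
                func() { fmt.Println("charge refunded") }},
        })
        fmt.Println(err)
    }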

This doesn’t exactly address infrastructure, UI or much of anything, though. A given set of discrete components a la “Kitchen”, “Bathroom” etc. can be analyzed and arranged with this lens, but they aren’t named.

MVC - Model View Controller

Fine in its original domain of software with real time read/write views on the same data (e.g. CAD, Blender). But our domains have changed a lot and it doesn’t fit the web, for example, so I will focus on the extended (or distorted, no longer really MVC) versions actually deployed today. Fundamentally:

  • M: data, business logic
  • V: presentation layer
  • C: handle user input, update model

Because this only considers UI, extensions appear to address infra:

Phoenix' MVC is extended like so (I’ve only dabbled with Elixir so far):

  • M: schema & context, create a data structure and validations (changeset) then define data transformation and persistence
  • V: view & template, parse data and show it in template
  • C: controller & router, get request then serve with funcs/pipelines from context then make view render it

Another variation is Controller Service Repository (CSR), which adds layers. The repository and service patterns do make sense, and dependency injection aids testing and early development. For example (sketched in Go after the list):

  • C: user input, flow between parts
  • S: business logic (not user input, not data storage). intermediary between C and R
  • R: interact with data storage (DB, APIs, files)
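
A compressed sketch of the three layers; all the names are made up, and the in-memory map stands in for whatever storage a real repository would wrap:

    package main

    import "fmt"

    // Repository: the only layer that knows where data lives.
    type UserRepository interface {
        FindName(id int) (string, error)
    }

    type inMemoryUsers map[int]string

    func (m inMemoryUsers) FindName(id int) (string, error) {
        name, ok := m[id]
        if !ok {
            return "", fmt.Errorf("user %d not found", id)
        }
        return name, nil
    }

    // Service: business logic, the intermediary between C and R.
    type GreetingService struct{ users UserRepository }

    func (s GreetingService) Greet(id int) (string, error) {
        name, err := s.users.FindName(id)
        if err != nil {
            return "", err
        }
        return "Hello, " + name, nil
    }

    // Controller role (minus the HTTP plumbing): wire dependencies,
    // translate input into service calls.
    func main() {
        svc := GreetingService{users: inMemoryUsers{1: "alice"}}
        msg, _ := svc.Greet(1)
        fmt.Println(msg)
    }

Swapping inMemoryUsers for a DB-backed implementation (or a test fake) touches nothing above the repository, which is the whole point of the injection.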

Uncle Bob opposes Javaesque controller/model/service directories and prefers ‘screaming architecture’, his term for putting important information in front and burying implementation details, to flatten onboarding and increase discoverability a la literate programming (though this naming convention applies everywhere). Here’s a lazy sketch with MVC inside 👹:

./screaming_bank
    loans
        mortgage
            create
                controller.go
                serializer.go
            models.go
        car_loan
            create
                controller.go
                serializer.go
            models.go        

tl;dr: model the domain, use explicit domain names in folder architecture e.g. customer

MVC and its kin face these problems:

  • hard to isolate components for unit tests
  • tight coupling due to bidirectional communication (cf. Go’s ‘accept interfaces, return structs’, i.e. one way communication)
  • hard to scale
  • controllers prone to bloat

Onion architecture

Use concentric layers with different purposes, inspired by DDD.

  • core domain: domain entities, business logic = main functionality
  • domain services: services supporting logic, often stateless
  • application services: orchestrate interactions between domain entities and services (like a controller)
  • infrastructure services: DB access, external integrations etc., implemented as interfaces or abstract classes
  • external dependencies: injects dependencies with concrete implementations of infrastructure interfaces/classes, DB, frameworks etc.

The core principles are:

  • Dependency Inversion Principle (DIP): the core domain (high level) doesn’t depend on low level, both depend on abstractions
  • Isolate concerns: each layer is isolated (hopefully behind non leaky abstractions), with dependencies flowing inward

Optimally, you can unit test domain/business logic ignoring other layers!
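
For instance, once the infrastructure sits behind an interface, the domain logic can be tested against a fake instead of a real rate service or DB; a sketch with invented names (TaxRates, Total, fakeRates):

    package pricing

    import "testing"

    // TaxRates is the abstraction the core depends on; an infrastructure
    // layer would implement it against a DB or external API.
    type TaxRates interface {
        // RateFor returns the tax rate in basis points (19% = 1900).
        RateFor(country string) int64
    }

    // Total is pure domain logic: no DB, no HTTP. Amounts are in cents.
    func Total(netCents int64, country string, rates TaxRates) int64 {
        return netCents + netCents*rates.RateFor(country)/10000
    }

    // fakeRates stands in for the real implementation in tests.
    type fakeRates map[string]int64

    func (f fakeRates) RateFor(country string) int64 { return f[country] }

    func TestTotal(t *testing.T) {
        got := Total(10000, "DE", fakeRates{"DE": 1900}) // 100.00 + 19%
        if got != 11900 {
            t.Fatalf("got %d, want 11900", got)
        }
    }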

Hexagonal architecture

This focuses on where we define interfaces. If a service layer depends on a repository layer’s interfaces, you’ll have a bad time. When we define repository interfaces in the service layer, but implement them in the repository, we achieve dependency inversion. We call interfaces ports and implementations adapters. Some just throw the adapters in an infrastructure package and say they’re doing hexagonal architecture!

tl;dr: Put logic in the center, surround it with ports and adapters.

  • ports and adapters
  • separate business logic from external dependencies and infra
    • by organizing code in layers
    • logic in center, adapters etc. surround it

“Ok, but why a hexagon?” you ask. The diagrams just use a hexagon because it gives more space, that’s all.

You can have notification concerns, which you address with a port (port = interface) and implement outside with a Telegram bot adapter. You can have 10 more ports with 20 adapters each, so don’t think too hard about the shape.
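
In code the port is just an interface owned by the core, and the Telegram bot is one adapter among many. A sketch with invented names; the actual bot API call is stubbed out with a print:

    package main

    import "fmt"

    // Notifier is the port: defined by the core, which neither knows
    // nor cares what sits on the other side.
    type Notifier interface {
        Notify(user, message string) error
    }

    // TelegramNotifier is one adapter; the real HTTP call to the bot
    // API is omitted here and replaced by a print.
    type TelegramNotifier struct{ BotToken string }

    func (t TelegramNotifier) Notify(user, message string) error {
        fmt.Printf("[telegram] to %s: %s\n", user, message)
        return nil
    }

    // RemindOverdueLoan is core logic, written purely against the port.
    func RemindOverdueLoan(n Notifier, user string) error {
        return n.Notify(user, "your loan payment is overdue")
    }

    func main() {
        _ = RemindOverdueLoan(TelegramNotifier{BotToken: "dummy"}, "alice")
    }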

Some issues:

  • overengineering: ports/adapters introduce boilerplate for small projects
  • though test focused, the port/adapter abstractions can make certain interactions difficult to test (but mocking them is easy!)

Adapter = implementation. In short, interfaces should be defined where they are going to be used, not where they are going to be implemented (chorus: accept interfaces, return structs).

A layered (onion) architecture + interfaces leads us to:

Clean Architecture

tl;dr: clean reveals intentions through well named and partitioned code

This post or this lecture sums up the whole concept succinctly.

The circles are schematic. You may find that you need more than just these four. There’s no rule that says you must always have just these four. However, the Dependency Rule always applies. Source code dependencies always point inwards. As you move inwards the level of abstraction increases. The outermost circle is low level concrete detail. As you move inwards the software grows more abstract, and encapsulates higher level policies. The innermost circle is the most general.

Clean architecture is SOLID + dependency inversion.

In a layered architecture, every layer depends on the one inside it. In a clean architecture, there are no layers, but a central domain/core which everything depends on. The concrete implementation depends on the interfaces defined in the domain, i.e. services define their input (port) interfaces, which the adapters implement.

Are Onion, Hex etc. the Same?

They answer two kinds of problems:

  • How to decouple domain logic from tech stack/infra?
  • How to control changes in infra?

These models all scratch at the same concepts, and problems appear when people try to implement every detail where it isn’t relevant. Even when e.g. “clean code” results in more code, it can still be useful, e.g. allowing you to move logic from a server to a CLI quickly. (This also aids testing, because domain logic is easier to test (no mocking) if it’s already abstracted.) If this trade off is valuable to your usecase, introducing this architecture early makes sense.

They do this with one principle: Build layers from high (logic) to low (implementation), without higher levels knowing about the lower.

The top/inside contains the domain logic, which neither knows nor cares whether output goes to a txt file or a web page, or whether data is stored in a txt file or a SQL DB. The lower layers connect the pure domain logic to the real world (side effects). That’s DIP.

In Go

So how do you split a Go project into suitable layers? Ben Johnson’s post was a good start, but now we finally have an official layout. Further:

Start abstractions from the most generalizable entity, typically the service layer.

Martin Fowler’s Analysis Patterns introduced yet another 70 patterns in 1996. Many say Go doesn’t need architecture at all. They mean SOLID, Gang of Four design patterns etc. were bandaids for missing features in Java (preemptive classes, shudder):

  • Visitor pattern: Implements type switching
  • Singleton pattern: implements global variables (Go provides exportable package scoped vars)
  • Command and strategy patterns: Emulate first class functions with closures
  • State pattern: implements a state machine; in Go you just make one function per state (sketched below)
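
A sketch of that last point: the whole pattern collapses into functions that return the next state function. The traffic light states are invented for illustration:

    package main

    import "fmt"

    // stateFn returns the next state, so the state machine is just a
    // loop over first class functions - no State classes needed.
    type stateFn func() stateFn

    func red() stateFn    { fmt.Println("red"); return green }
    func green() stateFn  { fmt.Println("green"); return yellow }
    func yellow() stateFn { fmt.Println("yellow"); return red }

    func main() {
        state := stateFn(red)
        for i := 0; i < 6; i++ {
            state = state()
        }
    }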

Go’s primitives elide such complexity and make best practices/principles easy. The underlying principles still apply and will give you a clean, decoupled, refactorable codebase. Define routes as top level functions, using structs to serialize data or communicate with the frontend.

  • Declare interfaces in the package that wants to accept the interface as a parameter
  • Don’t return an interface from a function
  • Go packages provide, not contain things - Bill Kennedy talk
  • Higher level doesn’t import from lower level

The two ends of the idiomatic Go spectrum are go-kit and upspin. go-kit abstracts separate endpoints, transport, service etc., while upspin’s handler funcs are attached directly to endpoints. Both separate domain logic from other concerns. go-kit’s example is vertically split into cargo, voyage, location and routing packages, which the inmem package implements, all divided into three services (tracking, handling, booking). upspin’s logic is in the upspin package, whose abstractions feature packages (cloud, store, access) implement, all divided into handlers. go-kit’s approach is SOLID, useful for changing requirement spaces, while upspin’s is simpler. This simplicity is fine for the stable requirements demanded of public APIs etc.

Coming from traditional OOP, the simpler approach seems especially valid, as DIP seems to demand a bit of extra code. Mockable structs depend on interfaces, which means interfaces for data sources, clients, domain logic etc. But if layers require noticeably more code, there’s an organizational issue, and extra design time can shrink the codebase (less code is better). Interfaces make dependency injection natural. Define clear interfaces, then implement them elsewhere to tautologically decouple the implementation from your logic. Remember:

Accept interfaces, return structs. Amen.

Go’s favorite refrain is the Dependency Inversion Principle in fewer syllables. Indeed, “ports and adapters” is an even shorter incantation. Interface climbing: assert to a more capable type in case it’s implemented, e.g. io.Reader to io.ReadSeeker, or error to something with StackTrace(). Appreciate how io.Reader and io.Writer are defined in io, which just assumes something satisfies them and builds on that (e.g. io.Copy).
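
Interface climbing in practice is just a type assertion to the richer interface, with a fallback when the concrete type doesn’t support it; sizeOf here is an invented helper:

    package main

    import (
        "fmt"
        "io"
        "strings"
    )

    // sizeOf climbs from io.Reader to io.Seeker when the concrete type
    // supports seeking, and falls back to draining the reader otherwise.
    func sizeOf(r io.Reader) (int64, error) {
        if s, ok := r.(io.Seeker); ok {
            end, err := s.Seek(0, io.SeekEnd)
            if err != nil {
                return 0, err
            }
            _, err = s.Seek(0, io.SeekStart) // rewind so the caller can still read
            return end, err
        }
        return io.Copy(io.Discard, r) // slow path: count by draining
    }

    func main() {
        n, _ := sizeOf(strings.NewReader("hello")) // *strings.Reader is seekable
        fmt.Println(n)                             // 5
    }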

A domain, a repository (DB) and an API layer or interface (a contract: if, with an x-doer, I can…; remember Hexagonal ports are interfaces, it’s all the same for us!) are a good starting point for organic growth. No package straddles them (they connect by interfaces). Cf. screaming architecture: it increases discoverability and decreases overengineering. Protip: avoid the following words in your code: service, controller, repository, util, helper, domain, adapter, port, clean, or design patterns by name.

Your domain isn’t just a package, but the entire application. Domain abstractions should be found (implemented, not defined) everywhere in all your packages, because almost all code is an adapter or a port. Feature driven development naturally organizes concerns by use case. New features expand the domain with new interfaces. (Some like to add a network layer.) SOLID, clean, DDD etc. emerge as a side effect.

Small, composable interfaces are the UNIX way.

Keep your package structure shallow.

Big Name Concepts

Here are some more ideas and principles:

Imperative Shell Functional Core

Pure, functional code does the logic, wrapped in mutable, stateful code that handles the side effects. Onion/layered etc. from a functional perspective.
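
A tiny sketch: the core is a pure function over plain data, and the shell does the I/O around it (report.txt and countWords are invented for illustration):

    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    // functional core: pure logic, trivially testable - no files,
    // no network, no clock.
    func countWords(text string) int {
        return len(strings.Fields(text))
    }

    // imperative shell: all the side effects live out here.
    func main() {
        data, err := os.ReadFile("report.txt") // hypothetical input file
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        fmt.Println(countWords(string(data)))
    }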

Domain Driven Design (DDD)

Unfortunately it wallows in the OOP mire (it’s basically a pattern book); interesting ideas are approached but buried under too much abstraction and overcomplication. Data/concept modeling as a whole fell victim to UML, consultants etc. I’ll preach on this in the next article.

  • Core: business logic changes state, makes a record, applies rules. No IO. Pass data/structs to funcs
  • Build shell to fetch data, apply core logic etc.
  • Package by business context (specifically named packages)
  • Collaborate with subject matter experts (SMEs)/domain experts, which forces you to name things concretely (valuable; it removes a huge source of bugs), no “domain” folder etc.

A lot of abstraction:

  • repository
  • object
  • entity
  • aggregate

DDD breaks a large model into bounded contexts, explicitly describing how they interact/interface.

https://github.com/warrant-dev/warrant follows this. Each domain has its own package, built from the same building blocks (handler, spec, service, model, repository) from request to DB.

Microservices

grug wonder why big brain take hardest problem, factoring system correctly, and introduce network call too

Rare in practice, because few actually have a separate DB for each microservice. Most are just distributed monoliths.

  • Split domains into many groups
  • Manage system complexity: not quite architecture, as you can still use Hexagonal inside the microservices
  • Solve organizational problems: from a technical perspective, they just add complexity (at almost every project’s scale), and complexity is cost

Requirements change at different speeds, so services solving them have different release velocities. Optimally, this means a team’s domain gets its own microservice and release speed. Dependencies become self contained (besides specific relationships like scraper->DB). Orchestration/K8s is difficult.

12 factor app

I charitably generalize outdated specifics in accordance with how they’re used today:

  1. single codebase, many deployments (dev, staging…)
  2. declare and isolate dependencies (go build, go test, go run should have the same result)
  3. keep configs outside of code (originally ‘config in env’, but practice prefers json, yaml, command line flags etc. outside of containers)
  4. consume dependencies
  5. automate setup declaratively: create prod with recreatable stages (build, release, run), so you can rollback. Different builds, releases etc. need unique identifiers. Different releases have same build with different configs.
  6. stateless services, sharing nothing
  7. isolate services' data, share with API
  8. scale horizontally (without tooling, arch changes)
  9. disposable services (for elastic scaling)
  10. minimize differences between dev, staging and prod (same stack, code, people)
  11. logs are event streams, storage, routing logic etc. belong elsewhere
  12. update with scripts, no manual changes, for reproducibility (unique snowflakes are pets, not cattle)

This aims at a higher scale, composing different instances and services together. Most tenets are actually followed today.

SOLID

tl;dr: meh

  • single responsibility principle
  • open-closed: extendible, not modifiable
  • Liskov substitution: Funcs should use base or derived classes without difference
  • interface segregation: Only depend on interfaces you use
  • Dependency Inversion Principle / Depend on Abstractions

This lays bare how “design patterns” and principles are band aids for missing language features. Go’s interface and method design inherently satisfies O, L and I. Keep interfaces small for S. We have covered D. (Cf. Dave Cheney.)

Uncle Bob on S and when to ignore DRY:

Gather together the things that change for the same reasons. Separate those things that change for different reasons.

Screaming Architecture

Name things after their domain (ticketing, not “domain”). Important information goes in front, implementation details later.

Literate Programming

Holy SICP starts:

Programs are meant to be read by humans and only incidentally for computers to execute.

Code should express the problem eloquently and succinctly. This rarely works, because developers must now be both good programmers and good writers. The end product is a program listing and a narrative at the same time.

This addresses code additions/extensions, which are otherwise ignored. This historical view helps onboarding (explaining the “why?”). Comments, specifications and good documentation serve well instead.

General Principles

At this point, it’s painfully obvious everyone’s stretching and struggling to describe the same thing (some better, some worse). It all blurs together. Everything and everyone tries to:

  • separation of concerns: separating business logic from external dependencies and infrastructure
  • make dependencies explicit
  • decouple (which also means testable): why should it matter whether data’s stored in Postgres, Redis or CockroachDB?
    • single responsibility
  • increase the purity of the system
  • stay comprehensible, reduce surprise: so a human mind can comprehend it
    • we do this by constraining possibilities, separating concerns not combining them
  • principle of least surprise
  • Things that change together should live together
    • DRY - rule of 3: only deduplicate/refactor if something is repeated at least 3 times. Even then, beware of coupling

Even here, smaller systems (the size of your average SaaS codebase) can couple with no problem. It’s only an issue when they couple sequentially or between different services’ storage layers. The data model is essential complexity, coupling higher and lower layers together. Cohesion is worth it.

Different applications demand different architectures. Til now, it’s all been web.

  • compilers need stages in a pipeline
  • libraries are public APIs of interfaces, funcs and data structures, privately implemented

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. - Kernighan

In software engineering, “clever” is an accusation. The same applies to refactoring and architecture.

Key Organizational Principles

From: https://lobste.rs/s/vlcw5q/lessons_learned#c_ke5i1v

Don’t try to break dependent things into subparts, like say put the front and backends for the same system into two different repos. People might decompose by language, for example, but really if there is dependency, like an API, that matters more. It’s hard to explain, but if things can’t stand on their own, then you shouldn’t try to force them to.

Disorganization will always be the biggest problem. Organization is a place for everything, everything in its place, and not too many similar things in the same place. That is, if you have something new and you don’t know where to put it, then it is disorganized. It must go somewhere; if that is obvious, then you are okay.

Vocabulary - what we have so far:

  • interface: contract, the boundary between parts, a port; modular parts connect through these like plug types
  • implementation: satisfies the interface, an adapter
  • repository: touches DB etc.
  • domain: core business logic, best if functional, should be abstract interfaces
  • high and low: general and implementation-specific

“One’s good, two’s bad” is the essence of good design. Immutability: one way flow of state change. Redux/React: one way flow of data. CQS: single purpose, single pass communication. SRP: do one thing. OCP: one direction of code extension.

Ports & Adapters: one direction flow of design/abstractions. Design flows in, abstractions flow outward (defined in the core).

There are three types of code, specific to:

  • tech stack: HTTP framework, Postgres, cloud SDK or implementation
    • driver: HTTP framework triggering workflows, starts things
    • driven: what gets triggered (by the business logic)
  • application: MVC’s controller
  • business: aggregates, entities, value types

App and business logic sit in the core, going from low/specific to high/generic.


To Add

I hope to address these later (some address different scopes; unsure how to square that circle):

  • cell based architecture CBA
    • a cell’s a shard of a big service, making a max size
    • split workload so failures only ruin sections
    • helps migrate
  • Command Query Responsibility Segregation
    • separate read and writing concerns (separate DB or just DB contexts)
    • big in .net
    • too many folders, one per operation
  • Functional Reactive Programming
    • handle events declaratively (what to do, not how)
    • compose components/functions (unix style)
    • hard to debug
  • Event Sourcing
    • Massive overhead and complexity (all universal fanaticism bites, whether everything’s a file, object, event, func…)
    • https://lobste.rs/s/uahnku/1_year_event_sourcing_cqrs#c_gywlob
  • Service Oriented Architecture
  • Component-Based Architecture
  • Pipeline Architecture
    • linear processing (like a compiler’s steps)
    • focus on data flows
  • Space based architecture
    • share memory (cf. ‘share memory by communicating’)
    • components read/write message tuples in tuple space (repository)
    • how different from message broker/bus/event driven?
    • sharded/partitioned data
    • in memory data grids (IMDG) to reduce DB access
    • https://github.com/pSpaces/Programming-with-Spaces
  • Message-Oriented Middleware
  • functional
  • 99 Bottles of OOP