I don't know if the book goes into this, but one thing I've noticed about brilliant software people like Jeff Dean or Martin Thompson is that they appear to love doing simple "back-of-the-envelope calculations" (and lots of them).
They can iterate designs rapidly in their head or on paper, evaluate the cost or value of a design, know when a design has promise or when it won't work, and where the bottlenecks or low-hanging fruit might be. They can do this because they have a good feel for the performance/scaling cost of all the resources involved, along axes like branch mispredicts, main memory references, per-core memory bandwidth, disk seeks, disk bandwidth, network latency, network bandwidth, and expensive-memory vs. inexpensive-storage tradeoffs. Numbers that programmers should know [1] but often don't.
In other words, before they even write specifications or start coding, they've probably done tens to hundreds of back-of-the-envelope calculations to put their design in the "roughly right" ballpark. This drastically increases the chances that the design is going to work, and work really, really well. And I think this habit is also what tends to set them apart.
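For concreteness, here's what one such napkin calculation might look like. The constants are the commonly quoted ballpark figures of the "numbers every programmer should know" variety; they're order-of-magnitude approximations that vary by hardware, so treat the whole thing as an illustrative sketch, not a benchmark:

```python
# Napkin-math sketch: which medium can we afford to scan a table from?
# Constants are rough, commonly cited ballpark values, not measurements.

NS = 1e-9
READ_1MB_MEMORY_NS = 250_000     # sequential read of 1 MB from RAM
READ_1MB_SSD_NS    = 1_000_000   # sequential read of 1 MB from SSD
DISK_SEEK_NS       = 10_000_000  # one rotational-disk seek
ROUND_TRIP_DC_NS   = 500_000     # round trip within a datacenter

def seconds_to_scan(gigabytes: float, ns_per_mb: float) -> float:
    """Rough time to stream `gigabytes` sequentially at the given rate."""
    return gigabytes * 1024 * ns_per_mb * NS

# Should a 100 GB table be scanned from RAM or from SSD?
print(f"RAM scan: {seconds_to_scan(100, READ_1MB_MEMORY_NS):.0f} s")  # ~26 s
print(f"SSD scan: {seconds_to_scan(100, READ_1MB_SSD_NS):.0f} s")     # ~102 s
```

Ten lines of arithmetic like this are often enough to rule a design in or out before any code exists.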
There's a monthly newsletter dedicated to napkin math. He hasn't published one in a few months for some reason, but judging from his tweets he's working on a new one. You can also practice on the old ones, of course. https://sirupsen.com/napkin/
One other very similar path is to start with domain modeling. Figure out what all of the facts, types and relations are in abstract terms before you write a single line of code. Excel is exceedingly capable at this type of design work.
I see abstract domain modeling combined with practical engineering constraint solving as the foundations of the discipline. Being able to simultaneously operate in abstract academic terms (e.g. database normalization) while also considering things like cache lines and latency between networked participants is how the magic is done.
"(e.g. database normalization) while also considering things like cache lines and latency between networked participants is how the magic is done."
I wouldn't put abstract domain modeling or database normalization alongside a first principles back-of-the-envelope approach.
Abstract academic terms are certainly useful insofar as they provide a shared language to discuss systems and discover research, but... it really does actually seem to be the first principles things, the physical facts, like cache lines and latency ("practical engineering constraint solving" as you put it well) that drive the best designs, and far ahead of things like database normalization.
This is also the approach taken by Data-oriented Design [1], which I think is more similar than the abstract domain modeling approach. In fact, abstractions and domain modeling principles can often lead one away from the ground reality of the machine, or "mechanical sympathy".
> Abstract academic terms are certainly useful insofar as they provide a shared language to discuss systems and discover research, but... it really does actually seem to be the "first principles" things like cache lines and latency ("practical engineering constraint solving") that drive the best designs, and far ahead of things like "database normalization".
Database normalization is a tool to deal with correctness (avoiding data anomalies) and flexibility; cache lines and latency deal with performance. Fast but wrong and/or brittle designs are no more “best” in general than correct and flexible but poorly-performing ones, though the exact balance between those two things differs between applications.
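As a toy illustration of the correctness point (my own hypothetical example, not from the article): duplicating a fact across rows invites update anomalies, which normalization removes by giving each fact exactly one home:

```python
# Denormalized: the customer's email is copied into every order row,
# so a partial update leaves the data internally inconsistent.
denormalized = [
    {"order_id": 1, "customer": "ada", "email": "ada@old.example"},
    {"order_id": 2, "customer": "ada", "email": "ada@old.example"},
]
denormalized[0]["email"] = "ada@new.example"   # forgot to update order 2
emails = {row["email"] for row in denormalized}
print(len(emails))  # 2 -> two "truths" for what should be one fact

# Normalized: the fact lives in one place; orders merely reference it.
customers = {"ada": {"email": "ada@old.example"}}
orders = [{"order_id": 1, "customer": "ada"},
          {"order_id": 2, "customer": "ada"}]
customers["ada"]["email"] = "ada@new.example"  # one update, no anomaly
```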
Sure, and I'm not saying that something like database normalization is not useful.
I just don't think that the "Habits of Expert Software Designers" like Jeff Dean or Martin Thompson are to be found in how they would design normalized SQL schemas when asked [1].
Rather, it's in how they would build things like LSM-Trees or Replicated State Machines. And I think if one looks at that design process, one would find that they reach for back-of-the-envelope calculations far more often than most other tools. They appear to take a first principles approach.
Again, that doesn't mean that database normalization is not a useful tool. I just wouldn't put any kind of abstract modeling approach alongside a first principles approach when it comes to systems design.
[1] Regarding normalized SQL schemas, I wouldn't be surprised if people like Jeff Dean or Martin Thompson went denormalized where it made more sense, or eschewed database indexes in favor of learned indexes, to optimize not only for performance but also for cost or time to recovery etc.
SQL is just one tool/representation. I would focus more on the idea of "managing complexity" more than any specific technological implementation of this idea. This gets me back to Excel. Really hard to get opinionated about a computerized spreadsheet representation.
> Abstract academic terms are certainly useful insofar as they provide a shared language to discuss systems and discover research, but... it really does actually seem to be the first principles things, the physical facts, like cache lines and latency ("practical engineering constraint solving" as you put it well) that drive the best designs, and far ahead of things like database normalization.
If you have a problem domain that deals with thousands (or more) of types/facts/relations, you will quickly find that starting with code and how tightly you can pack your data in memory is a massive mistake. The computer is arguably the most important stakeholder in all of this design work, but it is not the only critical one for most businesses. Being able to express a complex state machine to a domain expert who doesn't know what Visual Studio looks like is the biggest challenge I run into in the real world.
"Correct" is more important than "fast" every single day of the week where I work.
"If you have a problem domain that deals with thousands (or more) of types/facts/relations, you will quickly find that starting with code and how tightly you can pack your data in memory to be a massive mistake."
To be fair, this depends entirely on the business requirements of the software being designed. For example, a safety critical airbag deploy mechanism may need to make hard real time decisions so that memory layout (let alone dynamic allocation) is anything but "a massive mistake", or a file system may need to account for the spatial locality of corruption by writing superblock headers to different locations on disk.
I was also thinking of software design, not business applications that can be solved with SQL, and specifically in the context of infrastructure systems designed by experts such as Jeff Dean or Martin Thompson.
One worksheet per table you think you would want to create in an ideal SQL database representing the business domain. For instance:
Worksheet #1: Customers
ColA: Id
ColB: Name
ColC: Email
...
Worksheet #2: Orders
ColA: Id
ColB: CustomerId
ColC: ItemId
ColD: OrderDateUtc
...
You probably want to fill in a few example rows per table to illustrate intent and how things relate together.
The advantages of this approach are impossible to overstate. Being able to send a dead-simple spreadsheet via email to your domain experts can save a ton of frustration and meetings.
You can also implement rudimentary logic strictly with Excel functions, no need for VB, to create a kind of MVP/prototype of the system. Then it can be migrated to a real database or other representation format and program once you have a solid idea of what you're representing and why. And since it's data-centric, there's less of a tendency to screw up by mixing the data layer and logic layer too much within this prototype; you're almost forced to create a clean separation. Then you have a data model that you can use in the core of a variety of applications, with various business and view logics distinct or common between each.
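To sketch that migration step: the worksheets above map almost mechanically onto SQL tables. A minimal Python/SQLite version, where the table and column names mirror the worksheets and the example rows are made up, might look like:

```python
# Hypothetical sketch of migrating the two worksheets into SQLite.
# Table/column names mirror the worksheets; everything else is assumed.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        Id    INTEGER PRIMARY KEY,
        Name  TEXT NOT NULL,
        Email TEXT
    );
    CREATE TABLE Orders (
        Id           INTEGER PRIMARY KEY,
        CustomerId   INTEGER NOT NULL REFERENCES Customers(Id),
        ItemId       INTEGER NOT NULL,
        OrderDateUtc TEXT NOT NULL
    );
""")

# A few example rows, just as you would put in the worksheets.
conn.execute("INSERT INTO Customers VALUES (1, 'Ada', 'ada@example.com')")
conn.execute("INSERT INTO Orders VALUES (1, 1, 42, '2021-06-01T12:00:00Z')")

row = conn.execute("""
    SELECT c.Name, o.OrderDateUtc FROM Orders o
    JOIN Customers c ON c.Id = o.CustomerId
""").fetchone()
print(row)  # ('Ada', '2021-06-01T12:00:00Z')
```

The point is that by the time you write this, the spreadsheet has already settled what the tables and relations should be.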
Also, experts do not let the user design the software. The user will frequently want to tell you what the software "must" be doing. An expert will disregard this and will want to first understand the domain and then work with the user to find a good mapping to a useful interface.
2) Experts design elegant abstractions
I don't agree with this. Oftentimes, I have seen "experts" use solutions because they seemed elegant, intelligent, "nifty". I did that myself.
An expert will design simple abstractions that get the job done with little fuss. The software needs to be as simple as possible, but no simpler than is needed.
Replace "elegant" with "simple", and it sounds about right.
Expert software looks dumb and simple, not necessarily super elegant and intelligent. Then again, bad software can also look dumb and simple.
3) Experts focus on the essence
Experts start by focusing on the essence. Then they verify, by testing, how useful the distilled essence is at solving the entirety of the problem.
The difference between elegance and simplicity is when you start to talk to developers who learned just enough to be able to put patterns in their projects but not enough to know when to do so.
> 3) Beginners and weak insecure developers focus on composition. Experts focus on the end state.
That is one more way of saying that experts use programming as a tool and that programming is not a problem for them, so the biggest issue they see is making sure the end result is the right one.
It is easier to focus on the end goal when you trust you have some kind of solution for any problem that can happen on the way.
On the other hand, novice and intermediate developers focus on the technical side, because that is the challenge they are facing, and they don't yet feel they can solve every problem they will meet. You can't tell them not to focus on the technical side; it is useless advice -- they need to learn the technical side first before they can become experts and focus fully on the end goal.
The best you can do is remind them that the end goal is important and keep it in the back of their heads even when they are immersed in technical challenges.
**
As to "insecure" developers, I think there is something to it. Moving from purely technical problems to other kinds of problems (looking at the big picture of the product, the client, and the development team) requires a little bit of courage (don't laugh). It is easy to keep working on the same types of problems that you are comfortable with, and to create an illusion of progress by changing technologies, working on larger applications, and so on.
I had a prospective client some time ago. They wanted me to help with their application. They had trouble delivering and additionally their application was unreliable.
So at the meeting with the director, architect, and tech lead, they asked me to start by upgrading Java from 6 to 11.
Mind you, this was a discussion with a director who had some 40 devs and had reached out to me personally to get help.
So I asked, "Guys, do you really want to say that people were not able to deliver a reliable application with Java 6? Or maybe the problems are somewhere else?"
I think that experts design what the user needs, not what the user says they want. Real experts have enough experience, wisdom, and listening skills to get this right; non-experts-but-wannabes have enough arrogance to think they get it right, but they don't.
I was once asked to create a payment gateway for a company. We found an elegant solution that consisted of general components we could just configure and wire together, so that we would never need to write another payment gateway.
This was a mistake, and we discovered it on the next project, which involved adding a new payment method.
We found that yes, it could be rewired and configured without any development at all. But the process took so much effort that we could have written a new payment gateway in less time.
The simpler (but less elegant) solution would have been to just write the minimum code necessary (still neatly and nicely, of course), then implement the other payment gateway separately, and only then extract the common model and infrastructure until nothing was left to extract and we were again left with a single, coherent application.
We had prematurely optimized, putting a lot of upfront effort toward an unknown benefit. The benefit turned out to be negligible, and the resulting code was complex and costly to work with.
But sure, it looked elegant.
Now that I am wiser for this lesson, I try to use patterns to produce simple, dumb code that models the problem well enough, without details that are not necessary for the particular instance of the problem.
If I understand correctly, you exposed an interface that was too general (AKA low-level) for the users' semantics. It seems to me that this interface was a good fit for you, the developers, to write new features on top of, though?
In other words, if we think in terms of simplicity, then maybe the right approach was to have the general interface as your abstraction layer, and on the higher layer something that is semantically close to what your users understand. The general layer would be simple, as I understand it, and provide the building blocks for the more specifically designed user interface.
I don't consider myself an expert compared to some of the big names, but my habits have served me well.
1. Don't assume anything. Go in with an open mind and blank canvas. I can't tell you how many times I've been in design meetings where engineers assume they know how things are gonna work, before clients even open their mouths. I can't tell you how many times clients assume they know how technology works before we explain to them.
2. Iterate and validate as fast as you can within reason. Don't do waterfall, at least not at the beginning. Design, build and validate, or even just do napkin design and validate. It'll save you a lot of headaches.
3. Perfect is the enemy of good. When you reach a certain comfort level with your software, ship it. You'll never be able to clean out your tech backlog or bug list.
#3 is very true, but where is the line? If your website crashes on every request, then it's clearly not ready. If it crashes only on Thursdays when Carl uploads that one spreadsheet, it depends how important Carl's spreadsheet is.
The answer is always, it depends. When SLA requirements are nonexistent, or self-defined, what is good enough?
If Carl is uploading financial reports or payrolls for the CXO TWICE every month, then I'll just make sure I have a script that can handle that file, unless the fix is easy.
If you have SLA, that's easy, ship as soon as you hit that SLA. If you don't, then onboard clients very carefully.
Not specifically about software design, but related books insofar as they take the perspective of programmers with unique insights:
- Coders at Work (Seibel)
- Working in Public (Eghbal)
The first one is very entertaining. Read it a couple years ago and found it gives some valuable perspective. The second one is on my reading list, it was recommended around these boards.
Related to software design, there are many. The two that are on my recent list are:
- Software Design for Flexibility (Sussman, Hanson)
- A Philosophy of Software Design (Ousterhout)
I can't comment personally on their content yet, as I still have to work through those two, but I have zero doubt that I'll learn something valuable. Certainly consider them.
Two books about to be published in this genre which I am excited about:
(1) Software Development Pearls: Lessons from Fifty Years of Software Experience (Karl Wiegers). I enjoyed his books on Requirements and on Software Engineering Culture.
This zinger really applies to my workplace: "Lesson #7. The cost of recording knowledge is small compared to the cost of acquiring knowledge."
(2) Code That Fits in Your Head: Heuristics for Software Engineering (Mark Seemann). His previous book on Dependency Injection was good.
I was skeptical due to the title at first, expecting generic (yet meaningless) "habits" like "reply to emails as soon as you read them", but was positively surprised. Great article.
I'm really digging deep these days to solidify my foundation in software development, so I thought this link might contain some useful information. Perhaps it does, but it seems a little abstract.
That being said, I'd rather get deep into the weeds with some technical summer reading. I'd love to get some recommendations from everyone here if possible.
Thanks for the response. Right now I'm focused on Javascript professionally (NodeJS for the backend and React on the frontend). I'd say my long term goals are to give back to the community through open source in some capacity.
In particular, I'd like to contribute to a library or a framework, so I guess that would mean understanding the reasoning and the fundamentals of why/how the library was created. Whether that means understanding design patterns of Javascript specifically, or software fundamentals in general, I'm not sure.
Sorry if this still sounds too broad, but I'm growing quite bored of working as a typical developer and need a bit of guidance.
From experience with staff of different levels of expertise, the points seem to hold true, e.g.:
Reaching out to other people - generally true that staff with greater expertise are more likely to do so, and more comfortable doing so;
Forming elegant abstractions - also true: experts are more likely than newbies to find a simple way to describe something that is built on enough complexity to cause at least some bafflement.
Also, for staff of a similar degree of experience and role, I have a greater degree of confidence in the outputs produced by those who can answer more searching questions, showing they have drilled down to the fundamentals. I suspect that this type of person is more likely to develop into a real expert than those who adopt the how without caring about the why. Experts may be better at ensuring that the substance of an objective is delivered rather than just the form.
Seconded. Specifically, roasting marshmallows. Seriously, if you haven't looked at the article, just scroll down to see that picture.
(You wouldn't want to do this in real life, because a burning PC will give off toxic gasses, and you don't want to eat marshmallows after that. But I thought that the picture itself was hilarious.)
I'm curious if any of the expert software designers use mathematical principles to verify their designs or model checkers/analysis to explore their designs.
I find there's a lot of informal folk-lore when it comes to "solid" software design. Asking experts is likely to surface answers as numerous as the stars in the visible universe. There may be common threads in popular discourse but it seems only to be buoyed by personality: a popular project lead or evangelist discusses their methodology and a constellation of developers grab on to it in order to emulate their success. Others find harbour elsewhere.
And yet very few of them formally state how their approach works to solve real engineering problems. The Gang of Four book is a book of patterns. The book itself doesn't contain any formal analysis of these patterns. Even among these patterns there is a world of contention about which ones work, which ones are less useful, etc., etc. A newcomer to this world, typically an intermediate to advanced programmer with a few years' experience under their belt, hits this wall and is left to flounder and find their tribe.
I liken it to beliefs about the efficacy of software engineering practices. There's far too little evidence about them to make any strong claims to their effectiveness. And yet people have built careers around evangelizing certain ones.
(And often when you ask people to be more forthcoming and give their formal definitions they wave their hands and say that software development is art not engineering and that it's about craftsmanship and beauty not cold, hard, unfeeling things like calculi, categories, or algebras)
I have little faith these days. Unless someone can formally describe their abstraction it's neither simple nor elegant to me. Such terms are qualitative in the absence of formal definitions. What is simple to you might be a hodge-podge to me. Code sans sensible abstractions is procedural spaghetti in my opinion: simple to understand in small, local parts, but impossible to comprehend as a whole. But if you can explain to me what the objects are, the operations on them and what they mean, and the algebraic properties of those operations then I'm much more certain about what we're discussing and can agree that given such definitions whether something is simple or elegant. What programming language the system is ultimately implemented in rarely matters as much as getting the design right.
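A lightweight version of that discipline needs no special tooling: write the algebraic laws down, then hammer them with random inputs. A hand-rolled sketch of the idea (the operation here, dict merge with the right side winning, is just an illustrative stand-in for whatever your system's operations are):

```python
# State the laws an operation should satisfy, then check them on many
# random inputs -- a poor man's property-based test, no libraries needed.
import random

def merge(a: dict, b: dict) -> dict:
    """Combine two dicts; on key collision the right operand wins."""
    return {**a, **b}

def random_dict(rng: random.Random) -> dict:
    return {rng.randint(0, 5): rng.randint(0, 9)
            for _ in range(rng.randint(0, 4))}

rng = random.Random(0)
for _ in range(1000):
    a, b, c = (random_dict(rng) for _ in range(3))
    # Associativity: (a . b) . c == a . (b . c)
    assert merge(merge(a, b), c) == merge(a, merge(b, c))
    # Identity: the empty dict is a left and right unit
    assert merge({}, a) == a and merge(a, {}) == a
print("laws hold on 1000 random cases")
```

Once the laws are written down like this, "is this abstraction simple?" stops being a matter of taste and becomes a claim you can check.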
Update: I'm curious because while I find these tools useful I often feel like they're under-appreciated and perhaps hardly used outside of certain small circles in practice.
> (And often when you ask people to be more forthcoming and give their formal definitions they wave their hands and say that software development is art not engineering and that it's about craftsmanship and beauty not cold, hard, unfeeling things like calculi, categories, or algebras)
I find it interesting that anyone would see these as mutually exclusive.
[1] https://colin-scott.github.io/personal_website/research/inte...