
Error budgets and the legacy of Herbert Heinrich

This is an older post from Lorin Hochstein but it’s new to me, and really insightful. It’s about how to best use our knowledge about the past behavior of a software system to figure out where we should invest our time to improve the system—and how the common method of error budgets is generally not a good way to do this:

I’m skeptical about relying on predefined metrics, such as reliability, for getting insight into the risks of the system that could lead to big incidents. Instead, I prefer to focus on signals, which are not predefined metrics but rather some kind of information that has caught your attention that suggests that there’s some aspect of your system that you should dig into a little more.

So basically, vibe-based incident analysis is where it’s at.

How GitHub Engineering communicates

This is a great document outlining the communication principles followed by GitHub Engineering. I’d say this is broadly applicable to teams and organizations—not just Engineering. I love this point about making work visible:

Capturing and exposing processes through URLs also helps make your work more visible. So work in the open and proactively share your work to the widest extent practical. As we continue to grow as an organization, points of collaboration will become even more important as we try to reduce redundant work. Avoid hoarding information: Like in any production system, observability is key. And if you make something useful, find a way to make it available so others can benefit from it too.

Ask Teresa: My Leaders Still Want Roadmaps with Timelines—What Should I Do?

Good points here from Teresa Torres about deadline-driven development, especially the need to take change management slowly:

If your stakeholders are insisting you use date-based roadmaps, I wouldn’t engage in the ideological war about deadlines and predictable work. Instead, start with a feature-based roadmap. Give your stakeholders what they are asking for, and over time, you can introduce opportunities and outcomes.

What to do when everyone’s eyebrows are glowing

Some great advice here on what to do when teams stop talking to each other. Starting with why it’s a big problem when that happens:

Teams that don’t talk to each other outside of transactional topics are barely teams at all. High-trust, high-engagement teams outperform, and those teams live and die on their ability to talk to each other. If that’s broken, your team is broken.

“Healthy tension” between Product and Engineering? No thanks, I’d prefer alignment.

I’ve always been adamant that Product and Engineering are in a partnership, not a “healthy tension” relationship. So I very much agree with this post:

The problem is in the assumption that Product and Engineering teams inherently have different goals. They don’t. Both teams are responsible for the growth and stability of the company, for revenue and scalability. Neither can succeed without the other. When we assume otherwise, we sell each side short.

How to Scale Yourself Down

How to Scale Yourself Down has some really interesting advice on how to go from leading a team at a bigger company to rolling up your sleeves at a startup. A couple of my favorite quotes:

Avoid process out of practice. Leaders who are successful in a startup are the ones who naturally reinvent their own toolboxes, and question what the process is trying to accomplish before establishing something that might be too heavyweight.

However, process is a double-sided coin. “There’s often an overcorrection when leaders move from big companies to small startups. Folks want to shake off that big company feeling and run hard in the other direction. And while the idea of no process sounds fantastic, issues emerge if you don’t start adding at least a little bit of it early on,” he says.

And:

To me, a well-made decision is one that you can explain how and why it was made. Ingraining this in the culture early on will support transparency as the company grows, promote consistency, and reduce politics. In essence, ‘don’t blame me, blame the framework’.

Improving work relationships using the lens of “The 9x Effect”

There’s a concept in UX design that I’ve been thinking about a lot in the context of interpersonal work relationships. It’s called “The 9x Effect” and I wrote about it… checks watch… 10 years ago. In short (and heavily simplified), customers value their existing solution/product 3x more than any new “innovation”, and companies overvalue their innovative new product by 3x compared to what’s currently on the market. So you end up with a 9x mismatch between what companies build and what people believe they need.
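If the arithmetic helps, here is a minimal sketch of how the two biases compound. The factor of 3 on each side comes from the simplified description above; the function names and the idea of a single "true gain" number are my own assumptions for illustration, not part of the original concept.

```python
# Hypothetical illustration of the 9x mismatch. Only the two 3x factors come
# from the (heavily simplified) description; everything else is assumed.
CUSTOMER_BIAS = 3  # customers value what they already have 3x more
COMPANY_BIAS = 3   # companies value their own new product 3x more


def perceived_mismatch(customer_bias=CUSTOMER_BIAS, company_bias=COMPANY_BIAS):
    """Return the gap between the two valuations when the biases compound."""
    return customer_bias * company_bias


def company_view(true_gain, company_bias=COMPANY_BIAS):
    """How big an improvement looks to the company that built it."""
    return true_gain * company_bias


def customer_view(true_gain, customer_bias=CUSTOMER_BIAS):
    """How big the same improvement looks to a customer attached to the old product."""
    return true_gain / customer_bias


if __name__ == "__main__":
    gain = 1.0  # an arbitrary "true" improvement
    # Ratio between what the company sees and what the customer sees.
    print(company_view(gain) / customer_view(gain))
```

The point of the sketch is only that the biases multiply rather than add, which is where the 9 comes from.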

There’s another adage that when someone cuts you off in traffic they’re a jerk, but if you cut someone off you had a good reason. We tend to rationalize our own actions while not giving others the benefit of the doubt.

So I’ve been thinking about this in the context of competence at work. I wonder if we sometimes overvalue our own competencies by 3x, and undervalue others’ skills by 3x[1]. And I wonder how that affects the efficiency and health of organizations. We all have a tendency, especially in large organizations, to disagree with strategy or, at the more extreme end of the spectrum, to view leadership as “inept” or “clueless”. And I wonder if that’s because of the 9x effect, and whether things would get a lot better if we could all just divide our own opinions by 3.

What might happen if, as employees, we go “well, maybe the way I think it should be done is only ⅓ of the answer”? And what if, in turn, leaders go “maybe the way I think things should be done is only ⅓ of the answer”? Would we be able to come together in the middle and make better decisions, and in doing so massively improve a company’s culture, autonomy, and efficiency? Sorry, I don’t mean to be a vague question-talker with this post, but I am genuinely curious about this.

A little more critique of ourselves, a little more grace for others… I think I’d like to try that.


  1. Yes, I’m very familiar with the Dunning-Kruger effect. What I’m talking about here is a bit broader and through a different lens.

Advice for new hires

I came across a couple of really helpful articles recently about how to start a new job well. 30 Tips for New Startup Employees is a long and super useful read—and not just relevant for startups:

Align yourself with the risks of the company. If you’re an engineer but the company is not acquiring customers fast enough, spend your time in marketing. Have range, and don’t try to be too narrow in your focus in the early days. Gain knowledge in a few different areas of the business so you can reduce the overall risks of the company.

Learn How The System Breaks is more relevant to technical roles, but I think “failure streams” can be expanded to other areas of the business as well:

Failure streams are a short circuit to understanding the system, because failures are where the system is interesting and nuanced. Failures are where the heart of complexity, entropy, and flux in the system are. Everything that doesn’t fail behaves like the architecture diagram. Failures show where the architecture isn’t working as intended. By focusing on failures, engineers can onboard quickly into the most important part of the system – the part with problems.

These are all great tips. The one I would add, and the most important one for me personally, is related to the concept of Chesterton’s fence:

In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, ‘I don’t see the use of this; let us clear it away.’ To which the more intelligent type of reformer will do well to answer: ‘If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.’

Or to put it in terms of systems thinking:

Before you disturb the system in any way, watch how it behaves. If it’s a piece of music or a whitewater rapid or a fluctuation in a commodity price, study its beat. If it’s a social system, watch it work. Learn its history. Ask people who’ve been around a long time to tell you what has happened. If possible, find or make a time graph of actual data from the system. Peoples’ memories are not always reliable when it comes to timing.

When you join a new organization you’re probably going to see a lot of random “fences across roads.” Instead of saying “let’s tear this thing down,” first ask “why is this fence here?” There is always a reason, and it’s very likely that there is value in the reasoning. First understand, then make change.
